EVENTS / January 30, 2025

Spin Digital Labs at ITU Workshop On Future Video Coding



Spin Digital Labs participated in the ITU and ISO Joint Workshop on “Future video coding – advanced signal processing, AI and standards” in Geneva, Switzerland, on 17 January 2025.

As the ITU and ISO (MPEG) standardization bodies prepare to work on their next-generation video codec standard, with capabilities beyond VVC/H.266, this workshop brought together speakers from different parts of the video industry to discuss experiences, needs, and requirements.

Session 3: Practicalities: Hardware capabilities and software implementations

At Spin Digital Labs, we were honored to be invited to participate in this event. Our COO, Dr. Mauricio Alvarez-Mesa, contributed a presentation titled “Implementing a VVC software live encoder: lessons learned and looking ahead”.

In this presentation, Mauricio discussed the main lessons learned from implementing a VVC live encoder for UHD applications and how they can be taken into account when designing a next-generation video coding standard.

As described in the presentation, Spin Digital’s VVC live encoder achieves around 20% compression gains compared to an optimized HEVC encoder. This is below the 40% bitrate gains reported for the VVC reference software, but it is still beneficial for users in live video applications.

The analysis shows that VVC performance is limited by complexity and cost. With previous video coding standards, for example when comparing HEVC to AVC, it was possible to achieve significant bitrate reductions at the same complexity. But with VVC, bitrate reduction is only possible with complexity increases. The above-mentioned 20% reduction comes at a computational cost of 1.7x to 2.0x compared to HEVC. Further increases in compression require even more computing power, which can result in high cost, or simply in an impractical design for live applications.
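To put these figures in perspective, here is a back-of-the-envelope calculation. The 20% gain and the 1.7x to 2.0x cost factors are from the presentation; the 20 Mbit/s HEVC baseline is a hypothetical example chosen only for illustration:

```python
# Illustrative tradeoff calculation. The 20% bitrate reduction and the
# 1.7x-2.0x compute factors are from the presentation; the 20 Mbit/s
# HEVC baseline is a hypothetical example.
hevc_bitrate_mbps = 20.0
hevc_compute = 1.0  # normalized compute units for the HEVC encode

# VVC: ~20% bitrate reduction at 1.7x to 2.0x the computational cost.
vvc_bitrate_mbps = hevc_bitrate_mbps * (1 - 0.20)  # 16.0 Mbit/s

for cost_factor in (1.7, 2.0):
    saved = hevc_bitrate_mbps - vvc_bitrate_mbps   # 4.0 Mbit/s saved
    extra_compute = cost_factor - hevc_compute     # 0.7 or 1.0 extra units
    print(f"cost {cost_factor}x: {saved:.1f} Mbit/s saved for "
          f"{extra_compute:.1f} extra compute units "
          f"({saved / extra_compute:.2f} Mbit/s per extra unit)")
```

The point of the exercise: each additional unit of compute buys less bitrate savings, which is why further compression gains can become impractical for live encoding.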

This limitation of complexity and cost is related to the challenges of implementing software video codecs on modern CPU architectures. Recent CPUs deliver high performance mostly in the form of parallel processing across multiple cores, as single-thread performance improvements are very limited. Using many cores efficiently for video encoding results in a complex tradeoff between parallelism, compression, latency, and quality. If the video coding tools can be implemented using multithreading and mapped effectively to multi-core architectures, then it is possible to increase both performance and compression efficiency.
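One common way to map encoding onto many cores is to split each frame into independently encoded regions. The following is a minimal sketch of that idea, not Spin Digital’s implementation: the frame is divided into tiles, each tile is encoded by its own worker, and the cost is that no prediction crosses tile boundaries, which reduces compression efficiency. The run-length “encoder” is a stand-in for real encoding work:

```python
# Minimal sketch of tile-parallel encoding (illustrative only, not
# Spin Digital's encoder). Each tile is encoded independently, so cores
# can work concurrently, at the cost of no prediction across tiles.
# Note: Python threads illustrate the structure only; a real encoder
# would use native threads doing actual parallel work.
from concurrent.futures import ThreadPoolExecutor

def encode_tile(tile_rows):
    # Stand-in for real encoding: run-length encode each row of pixels.
    out = []
    for row in tile_rows:
        run, prev = 0, None
        for px in row:
            if px == prev:
                run += 1
            else:
                if prev is not None:
                    out.append((prev, run))
                prev, run = px, 1
        out.append((prev, run))
    return out

def encode_frame(frame, n_tiles):
    # Split the frame row-wise into n_tiles and encode tiles in parallel.
    rows_per_tile = (len(frame) + n_tiles - 1) // n_tiles
    tiles = [frame[i:i + rows_per_tile]
             for i in range(0, len(frame), rows_per_tile)]
    with ThreadPoolExecutor(max_workers=n_tiles) as pool:
        return list(pool.map(encode_tile, tiles))

# A tiny 8x8 "frame": top half zeros, bottom half ones.
frame = [[0] * 8 for _ in range(4)] + [[1] * 8 for _ in range(4)]
bitstreams = encode_frame(frame, n_tiles=4)
print(len(bitstreams))  # 4 independent tile bitstreams
```

Choosing the number of tiles is exactly the tradeoff described above: more tiles mean more parallelism and lower latency, but smaller regions for prediction and therefore lower compression.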

In VVC, parallel processing is possible up to a certain extent. Spin Digital’s implementation has shown that it is possible to create real-time software encoders for 4K and 8K using a CPU-based design. But there are some coding tools, called RDO-intensive tools for the purposes of this presentation, that rely on single-threaded performance, therefore limit parallelism, and cannot be used in live encoding scenarios.

These tools apply rate-distortion optimization (RDO) at the sub-block level with a complex sequential evaluation, leaving very limited opportunities for parallel processing.
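The sequential nature of this evaluation can be sketched as follows. Everything here is a simplified illustration, not VVC’s actual RDO: the function name `rdo_encode` and the toy cost models are hypothetical. The key structural point is the dependency chain: each block’s decision feeds into the context for the next block, so the loop cannot be parallelized:

```python
# Illustrative sketch of why sub-block RDO is inherently sequential:
# each block's best mode depends on the reconstruction of the previous
# block, forming a dependency chain across the loop iterations.
def rdo_encode(blocks, modes, rate, distortion, lam=1.0):
    context = None        # reconstruction state carried block to block
    decisions = []
    for block in blocks:
        # Evaluate every candidate mode against the current context and
        # pick the one minimizing the RD cost D + lambda * R.
        best = min(modes, key=lambda m: distortion(block, m, context)
                                        + lam * rate(block, m, context))
        decisions.append(best)
        context = (block, best)  # the next block depends on this result
    return decisions

# Hypothetical toy cost models: "intra" costs the block value plus 1 bit
# of rate; "inter" costs zero distortion but 2 bits of rate.
decisions = rdo_encode(
    blocks=[3, 1, 2],
    modes=["intra", "inter"],
    rate=lambda b, m, ctx: 1 if m == "intra" else 2,
    distortion=lambda b, m, ctx: b if m == "intra" else 0,
)
print(decisions)
```

Because `context` is updated on every iteration and read by the next, the iterations cannot run concurrently; this is the structural reason such tools limit multithreaded encoders.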

In the case of VVC there is a balance between tools that can be implemented using parallel processing for live encoding and these sequential RDO-intensive tools. But if the next-generation video codec continues the trend of relying more heavily on complex RDO-intensive coding tools, then we anticipate that practical compression gains for real-time software encoders will not be significantly better than with VVC. This would result in slower adoption than VVC, as the use cases will be more niche (e.g. the next codec could only be used for highly viewed VoD content).

Requiring more computing to achieve higher compression can still be effective if we look at complexity in a different way. Having many small RDO-intensive coding tools, as discussed, is not the way forward, as this limits parallelism, increases cost, and reduces the application space of the new video codec. But with video coding tools that correlate well with video properties, for example tools based on image segmentation and object recognition, the increase in encoder complexity can be manageable, as these algorithms map well to parallel CPU (and GPU or NPU) architectures.

As some of the other speakers mentioned at the workshop, a new-generation video codec cannot be focused only on achieving compression at any (compute) cost. There should be a good match between the potential offered by new coding tools and their practical implementation on parallel architectures.

The future of video coding will require even more collaboration between experts in video coding, parallel computing, human vision science, and related fields. These are exciting times for engineering and innovation.
