Since the inception of Deep Render, I've had the privilege of sharing our company's vision and the transformative potential of AI-based video compression on numerous occasions. However, I feel there is always more to explore and discuss about this fascinating technology. Accordingly, in this blog post, I will be delving into some of the aspects of AI compression and the world of video that deserve greater attention.
In Lewis Carroll's Alice in Wonderland, Alice stumbles upon a potion marked 'Drink Me' and, upon taking a sip, experiences a transformation. She shrinks down to a fraction of her original size, embarking on an adventure in a surreal realm with bizarre-looking rules. Eventually, Alice returns to her usual size, forcing her to leave behind this fantastical dreamland. While Alice's adventure was a dream, in the world of online videos, a similar transformation is an everyday reality.
Every online video we enjoy embarks on a similar journey. Each video begins as a raw file, is shrunk in size (compression), and is transmitted through the intricate network of fiber optics that makes up the digital landscape. Finally, the shrunken file arrives at its destination, leaves the fiber optic realm, and returns to its original size (decompression) for viewing.
The true magic here is the art of video compression, which seemingly achieves the impossible: shrinking video file sizes to the smallest possible footprint while preserving high-quality video content.
Consider an uncompressed 4K video running at 30 frames per second. In its raw form, it consumes roughly 2,700 gigabytes of data per hour. To stream this video, one would need an internet connection of about 6,000 Mbps, a digital super-highway far beyond what households have available today.
So how is it possible to watch this video? Video compression, of course!
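These figures can be checked with a few lines of arithmetic. The sketch below assumes a 3840 x 2160 frame and 24 bits per pixel (8 bits for each of three colour channels); the text does not state these parameters, so treat them as assumptions, and the function names are purely illustrative.

```python
def raw_video_bitrate_mbps(width, height, bits_per_pixel, fps):
    """Uncompressed bitrate in megabits per second (1 Mbit = 10**6 bits)."""
    return width * height * bits_per_pixel * fps / 1e6


def gigabytes_per_hour(bitrate_mbps):
    """Data volume for one hour of video in gigabytes (1 GB = 10**9 bytes)."""
    bits_per_hour = bitrate_mbps * 1e6 * 3600
    return bits_per_hour / 8 / 1e9


# Assumed parameters: 4K UHD, 8 bits x 3 colour channels, 30 fps.
mbps = raw_video_bitrate_mbps(3840, 2160, 24, 30)
print(f"{mbps:,.0f} Mbps")                             # 5,972 Mbps
print(f"{gigabytes_per_hour(mbps):,.0f} GB per hour")  # 2,687 GB per hour
```

Under these assumptions, both results round to the 6,000 Mbps and 2,700 gigabytes per hour quoted above.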
Video compression has existed for decades, with improvements in performance over time as the field matured. The latest step change is Deep Render, the world’s first AI-only video compression codec. With Deep Render, that same 4K video running at 30 frames per second consumes only 2.7 gigabytes of data per hour. That requires only a 6 Mbps internet connection, something most households have readily available, all while maintaining the same visual quality.
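Let’s summarise those figures:
Uncompressed 4K at 30 fps: 2,700 gigabytes per hour, requiring a 6,000 Mbps connection
Deep Render 4K at 30 fps: 2.7 gigabytes per hour, requiring a 6 Mbps connection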
It’s a reduction so significant that it feels nothing short of magical: a digital elixir akin to Alice’s ‘Drink Me’ potion.
In the ever-evolving world of technology, disruptive innovations like AI-based compression have the potential to redefine entire industries. However, there is a fundamental question that follows any innovation: how long will this innovation continue to develop? Will it plateau, or will its rapid development and growth continue?
For Deep Render, understanding the trajectory of AI-based compression is essential for setting our strategy as a business. Two factors come to mind when considering potential trajectories: the current rate of progress, and the potential limit of that progress.
Starting with the current rate of progress, at Deep Render we are currently achieving an improvement of 1.3x-1.5x in our compression performance each year. This rate of progress is not showing any signs of slowing down. If we compare this to traditional compression, the level of progress we are achieving each year is comparable to the progress achieved in approximately 12 years using traditional compression algorithms.
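To make the compounding concrete, here is a minimal sketch of how an annual improvement factor accumulates, and of the per-year rate for traditional compression that the 12-year comparison would imply. It assumes gains compound multiplicatively at a constant rate, which is a simplification rather than something the figures above guarantee; the function names are illustrative.

```python
def cumulative_gain(annual_factor, years):
    """Total improvement after `years` of constant multiplicative gains."""
    return annual_factor ** years


def implied_annual_factor(total_factor, years):
    """Constant yearly factor that compounds to `total_factor` over `years`."""
    return total_factor ** (1 / years)


for factor in (1.3, 1.5):
    five_year = cumulative_gain(factor, 5)
    # If one year at `factor` matches 12 years of traditional progress,
    # this is the yearly rate traditional codecs would be improving at.
    traditional = implied_annual_factor(factor, 12)
    print(f"{factor}x per year -> {five_year:.2f}x after 5 years; "
          f"implied traditional rate {traditional:.3f}x per year")
```

Under these assumptions, five years at 1.3x to 1.5x per year would compound to roughly 3.7x to 7.6x, while the implied traditional rate is only about 2% to 3.5% per year.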
Moving on to the potential limit of progress, it’s undeniable that every technological innovation has its boundaries, and AI-based video compression is no exception. In theory, video compression should have a fundamental limit in line with the idea of Shannon entropy within information theory. This theoretical limit is yet to be reached or identified, in line with the saying “In theory, it’s obvious; in practice, it’s unknown”!
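As a small illustration of the idea, the sketch below computes the empirical Shannon entropy of a byte sequence, H = -sum(p * log2(p)), written equivalently as sum(p * log2(1 / p)). This bounds the average bits per symbol only for a lossless code that treats each symbol independently. Real video codecs are lossy and exploit structure across pixels and frames, so this shows the concept rather than the actual limit for video. The function name and sample data are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy_bits(data: bytes) -> float:
    """Empirical entropy in bits per symbol: H = sum(p * log2(1 / p))."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat = bytes([128] * 1000)     # one repeated value: fully predictable
mixed = bytes(range(256)) * 4  # every byte value equally likely

print(shannon_entropy_bits(flat))   # 0.0
print(shannon_entropy_bits(mixed))  # 8.0
```

The fully predictable sequence needs no bits per symbol in this model, while the uniform one needs the full 8 bits. Compressible content sits somewhere in between.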
If we step outside the realm of technology and into the realm of nature, we can find a potential comparison to video compression in the human eye. In some ways, the transmission of data from the human eye through the optic nerves to our brains bears a striking resemblance to the bandwidth challenges faced in online video transmission.
Research conducted at the University of Pennsylvania, led by Vijay Balasubramanian, demonstrated that guinea pigs' optic nerves can transmit data at a rate of 0.875 Mbps. Extrapolating these findings to humans suggests an "eye-brain" bandwidth of 8.75 Mbps.
If we next consider the amount of data captured by the human eye, this is estimated to be up to 576 megapixels, at a frame rate of at least 240 fps, and a colour depth of 12 bits per colour channel. With three colour channels, these factors combine to generate roughly 5 million Mbit of data per second.
Dividing the amount of data generated by the “eye-brain” bandwidth of 8.75 Mbps implies a compression ratio of up to 570,000x. The gap between this and the 1000x compression ratio achieved by Deep Render today underscores a crucial point: video compression is far from reaching its fundamental limit.
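The arithmetic behind this estimate can be reproduced as follows. The sketch assumes the 12-bit colour depth applies to each of three colour channels (36 bits per pixel), since that is the reading under which the quoted totals work out; the constant names are illustrative.

```python
# Figures quoted in the text.
PIXELS = 576e6            # up to 576 megapixels
FRAMES_PER_SECOND = 240   # at least 240 fps
BITS_PER_CHANNEL = 12     # colour depth
CHANNELS = 3              # assumption: three colour channels per pixel
EYE_BRAIN_MBPS = 8.75     # extrapolated optic nerve bandwidth

# Data captured by the eye, in megabits per second.
captured_mbps = PIXELS * FRAMES_PER_SECOND * BITS_PER_CHANNEL * CHANNELS / 1e6
# How much that stream must shrink to fit the optic nerve.
ratio = captured_mbps / EYE_BRAIN_MBPS

print(f"captured: {captured_mbps:,.0f} Mbit/s")
print(f"implied compression ratio: {ratio:,.0f}x")
```

Under these assumptions the captured data comes to about 4.98 million Mbit per second and the implied ratio to just under 570,000x, matching the rounded figures above.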
I’d like to finish this blog post by sharing a few more astounding statistics from the world of video (and video compression!):
All of these statistics show the different ways video shapes our time, our experiences, and the world around us. More importantly for Deep Render, every one of these areas would benefit from our improved video compression technology.