Wednesday, 16 June 2010

Content Aware Video Encoding

I've had this idea for a few years, and the Football World Cup now under way in South Africa inspired me to finally write about it.
The goal is to improve video encoding so that unicast/multicast video streams cause less network congestion when a widely watched show, such as a football game, is broadcast. The idea is to encode the video stream using what we already know about its content. We know a lot in advance about this kind of content (for example, the football field will be green!), so we can feed that knowledge into the coding process, lower the "entropy" of the stream, increase the compression of the video, and save bandwidth.
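To make the "entropy" argument a bit more concrete, here is a minimal Python sketch. It is only a toy: the two "frames" are made-up lists of coarse colour labels, not real video, and `entropy_bits_per_symbol` is a helper name I invented for the illustration. It just shows that when one value dominates, as green does on a football field, an ideal coder that knows the distribution needs fewer bits per symbol.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(symbols):
    """Shannon entropy (bits per symbol) of the empirical distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical toy "frames": each pixel is reduced to a coarse colour label.
# A frame we know nothing about: all four labels are equally likely.
generic_frame = ["green", "white", "red", "blue"] * 25
# A football-like frame: mostly grass, a little of everything else.
football_frame = ["green"] * 85 + ["white"] * 5 + ["red"] * 5 + ["blue"] * 5

generic_bits = entropy_bits_per_symbol(generic_frame)
football_bits = entropy_bits_per_symbol(football_frame)

print(f"generic frame:  {generic_bits:.3f} bits per pixel label")
print(f"football frame: {football_bits:.3f} bits per pixel label")
```

The generic frame comes out at 2 bits per label, while the football-like frame needs less than 1 bit. A real encoder is far more complicated than this, but the direction of the effect is the point.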

Here are a few examples for a football game:

1 - Use fewer bits for the green field, while preserving the quality of the white lines on it.
2 - Use fewer bits for the fans in the stadium when the shot is distant!
3 - Encode the positions of the players.
4 - Focus on THE BALL!
5 - Each team's players wear the same uniform, so use this information!!!

In short, information that is essential to the viewer should be encoded at high quality, and everything else at lower quality.
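One way this could look in practice, sketched in Python under heavy assumptions: many block-based codecs let the encoder choose a quantization parameter (QP) per block, where a larger QP means coarser quantization and fewer bits. The sketch below builds such a per-block QP map from a very naive colour test. The thresholds, the QP values and the function names (`is_green`, `is_white`, `block_qp`, `qp_map`) are all hypothetical, chosen only for illustration; a real system would need a much more robust classifier.

```python
# Hypothetical QP values: larger QP = coarser quantization = fewer bits.
QP_FIELD = 40    # plain grass
QP_DEFAULT = 30  # everything else (players, crowd, ...)
QP_DETAIL = 22   # white lines or the ball on the grass

def is_green(pixel):
    r, g, b = pixel
    return g > 100 and g > r + 30 and g > b + 30

def is_white(pixel):
    return min(pixel) > 200

def block_qp(block):
    """Pick a QP for one block, given as a flat list of (r, g, b) pixels."""
    green_ratio = sum(is_green(p) for p in block) / len(block)
    has_white = any(is_white(p) for p in block)
    if has_white and green_ratio > 0.5:
        return QP_DETAIL   # something bright on the grass: protect it
    if green_ratio > 0.9:
        return QP_FIELD    # plain grass: spend fewer bits
    return QP_DEFAULT

def qp_map(frame, block_size=2):
    """frame is a list of rows of (r, g, b) pixels; returns one QP per block."""
    height, width = len(frame), len(frame[0])
    result = []
    for by in range(0, height, block_size):
        row = []
        for bx in range(0, width, block_size):
            block = [frame[y][x]
                     for y in range(by, min(by + block_size, height))
                     for x in range(bx, min(bx + block_size, width))]
            row.append(block_qp(block))
        result.append(row)
    return result

# Tiny made-up frame: grass, a piece of white line, and a red shirt.
GRASS, LINE, SHIRT = (30, 160, 40), (250, 250, 250), (200, 30, 30)
frame = [
    [GRASS, GRASS, GRASS, LINE, SHIRT, SHIRT],
    [GRASS, GRASS, GRASS, GRASS, SHIRT, SHIRT],
]
qps = qp_map(frame)
print(qps)
```

With these toy values the plain grass block gets the coarsest setting, the block containing the white line gets the finest, and the shirt block gets the default. The ball and the players' positions would need proper detection and tracking, which this sketch does not attempt.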

This kind of encoding would also work for other kinds of broadcast shows, for example a politician's speech, where the background doesn't move and isn't important! This way each kind of broadcast show can have its own encoder. I called this Content Aware Video Encoding (CAVE, sexy name, no? :) ).
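For the speech example, a rough Python sketch of the idea is to compare each block with the same block in the previous frame and only spend bits on the blocks that changed. Standard codecs already do something similar through inter-frame prediction; the content-aware twist would be knowing in advance that the background is static and unimportant. The frames below are made-up greyscale grids, and `changed_blocks` and its `threshold` are hypothetical names and values.

```python
def changed_blocks(prev_frame, curr_frame, block_size=2, threshold=10):
    """Return the set of (block_row, block_col) whose mean absolute
    luminance difference between the two frames exceeds `threshold`."""
    height, width = len(curr_frame), len(curr_frame[0])
    changed = set()
    for by in range(0, height, block_size):
        for bx in range(0, width, block_size):
            diffs = [abs(curr_frame[y][x] - prev_frame[y][x])
                     for y in range(by, min(by + block_size, height))
                     for x in range(bx, min(bx + block_size, width))]
            if sum(diffs) / len(diffs) > threshold:
                changed.add((by // block_size, bx // block_size))
    return changed

# Toy 4x4 greyscale frames: a static background with a little noise,
# and a "speaker" moving in the bottom-right block.
prev_frame = [
    [50, 50, 50, 50],
    [50, 50, 50, 50],
    [50, 50, 120, 120],
    [50, 50, 120, 120],
]
curr_frame = [
    [50, 51, 50, 50],
    [50, 50, 49, 50],
    [50, 50, 180, 175],
    [50, 50, 170, 185],
]

to_encode = changed_blocks(prev_frame, curr_frame)
print(to_encode)
```

Here only the bottom-right block is flagged, so the three background blocks could simply be repeated from the previous frame.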

I don't know how this could be implemented technically, but it's worth studying to see how much bandwidth these kinds of algorithms could save. The major criteria for adopting a possible CAVE encoder are the quality of the video and the complexity of the algorithm. A drawback of CAVE is the need for a lot of encoders for different shows. This might be critical on low-resource devices such as mobile phones.

1 comment:

  1. Nice subject! But as you wrote, the drawbacks are the computational complexity and the need for many encoders. But nothing is impossible...