A week ago, Netflix engineers revealed a project they had been working on for some time to optimize the QoE of the videos they stream to their subscribers. Their approach relies on video encoding settings in the back-end, rather than focusing only on the front-end (peering, codec, streaming format, CDN...). The expected gain is 20% on overall traffic, which is huge for a player that accounts for more than a third of global internet traffic. Additionally, Netflix will deliver better quality at the same connection bandwidth, which is critical for addressing emerging markets.
Content providers encode several representations of the same video asset, where each representation is a (bitrate, resolution) pair. Apple, for example, publishes a recommended representation set for 4:3 main-profile video in its HLS authoring guidelines.
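To make the notion concrete, here is a minimal sketch of such a representation set; the numbers are illustrative placeholders, not Apple's actual figures:

```python
# A hypothetical representation set (a "bitrate ladder") for one title.
# Illustrative numbers only, not Apple's actual recommendations.
LADDER = [
    # (width, height, bitrate in kbps)
    (320,  240,  400),
    (640,  480,  800),
    (640,  480, 1200),
    (960,  720, 2500),
    (1280, 960, 4500),
]
```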
The goal of having several representations is to serve the best one to each user according to their screen (TV vs. mobile) and connection (4G, ADSL, FTTH...). With adaptive streaming, the same user can switch representations during playback to adapt to bandwidth variations.
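As a minimal sketch of the client-side selection logic, reusing the LADDER above (the 0.8 safety margin and the highest-bitrate-that-fits rule are assumptions, not any particular player's algorithm):

```python
def pick_representation(ladder, measured_kbps, safety=0.8):
    """Return the highest-bitrate representation that fits within a safety
    fraction of the measured bandwidth; fall back to the lowest one."""
    affordable = [r for r in ladder if r[2] <= measured_kbps * safety]
    if affordable:
        return max(affordable, key=lambda r: r[2])
    return min(ladder, key=lambda r: r[2])

print(pick_representation(LADDER, measured_kbps=3000))
# -> (640, 480, 1200): the best rung that fits in 3000 * 0.8 = 2400 kbps
```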
Until now, content providers have encoded all their video assets with the same set of representations. Netflix noticed that this doesn't make sense: for the same quality/resolution, a cartoon requires less bitrate than an action film. Each video asset has its own "entropy" that should be taken into account when generating the representation set. This is what Netflix is doing with its per-title encoding approach.
To determine the best representation set for a title, Netflix will encode it at several resolutions (480p, 720p, 1080p...), then for each resolution draw the curve of quality (PSNR) versus encoding bitrate (the black, green and blue curves in their figure). Notice that a 720p representation at 400 kbps will have worse quality than a 480p representation encoded at the same bitrate and upscaled to 720p. The optimal representation set is the set of points lying closest to the red curve, i.e. the upper envelope of the per-resolution curves.
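A minimal sketch of that selection step, with made-up PSNR numbers, assuming quality is always measured after upscaling to a common display resolution so that different resolutions are comparable:

```python
# Hypothetical (bitrate in kbps, PSNR in dB) samples per encoded resolution.
SAMPLES = {
    480:  [(400, 38.1), (800, 40.2), (1600, 41.0)],
    720:  [(400, 36.5), (800, 40.8), (1600, 43.2)],
    1080: [(400, 33.0), (800, 39.5), (1600, 44.1)],
}

def upper_envelope(samples):
    """For each candidate bitrate, keep the resolution that wins on quality;
    the winners trace the upper envelope (the "red curve")."""
    best = {}
    for res, points in samples.items():
        for kbps, psnr in points:
            if kbps not in best or psnr > best[kbps][1]:
                best[kbps] = (res, psnr)
    return sorted((kbps, res, psnr) for kbps, (res, psnr) in best.items())

print(upper_envelope(SAMPLES))
# -> [(400, 480, 38.1), (800, 720, 40.8), (1600, 1080, 44.1)]
# i.e. at 400 kbps the 480p encode beats 720p/1080p, matching the remark above.
```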
Of course, this approach costs more computing resources in the video preparation workflow, but the gains are worth it.
On the same subject, I read this interesting scientific article, where the authors propose looking not only at the type of the title (cartoon vs. action film...) but also at its popularity, the limits of contracted CDN capacity, the distribution of users' screen resolutions and connection speeds, video storage costs... They reach interesting findings on the optimal representation set:
- Titles with high "entropy" like action films require more representations than low-"entropy" titles like cartoons.
- The number of representations per resolution depends on the distribution of devices: HDTVs vs. mobile phones.
- For a given resolution, lower bitrates are spaced more closely together than higher bitrates (see the sketch after this list).
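A toy illustration of that last spacing finding: if rungs are placed at a roughly constant quality step, the diminishing returns of the rate-quality curve make the bitrate steps come out roughly geometric, i.e. small absolute gaps at low bitrates and large ones at high bitrates. The 1.5x ratio and the 235-6000 kbps range below are assumptions:

```python
def geometric_ladder(min_kbps, max_kbps, ratio=1.5):
    """Place rungs at a constant ratio, so absolute gaps grow with bitrate."""
    rungs, kbps = [], float(min_kbps)
    while kbps <= max_kbps:
        rungs.append(round(kbps))
        kbps *= ratio
    return rungs

print(geometric_ladder(235, 6000))
# -> [235, 352, 529, 793, 1190, 1785, 2677, 4015]
# The gap between the two lowest rungs is ~117 kbps; between the two highest, ~1338 kbps.
```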
Of course, these conclusions hold under the article's test conditions; they could differ, for example, for a content provider targeting only mobile devices in emerging markets.
As the content provider's environment keeps changing (proliferation of mobile devices, expansion strategies toward emerging markets with low-bandwidth connections, shifting title popularity, versatile peering agreements...), I guess it would be interesting to continuously re-encode the representations of the video library, so as to guarantee the best global user QoE under the constraints imposed by this moving environment.
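A speculative sketch of that idea, with stubbed inputs: fetch_context and derive_ladder are hypothetical helpers standing in for real popularity/device-mix/CDN telemetry and for a re-run of the per-title analysis, and the mobile-share rule is just a toy:

```python
def fetch_context():
    # Stubbed telemetry; a real system would query popularity, device-mix
    # and CDN statistics here.
    return {"mobile_share": 0.6}

def derive_ladder(title, context):
    base = [400, 800, 1600, 3200, 6400]
    # Toy rule: a mobile-heavy audience drops the top rung of the ladder.
    return base[:-1] if context["mobile_share"] > 0.5 else base

def reoptimize_library(library):
    """One pass of continuous re-optimization; run it periodically (e.g. daily)."""
    context = fetch_context()
    for title, ladder in library.items():
        new_ladder = derive_ladder(title, context)
        if new_ladder != ladder:  # re-encode only when the ladder actually changes
            print(f"re-encoding {title}: {ladder} -> {new_ladder}")
            library[title] = new_ladder

reoptimize_library({"some-cartoon": [400, 800, 1600, 3200, 6400]})
# -> re-encoding some-cartoon: [400, 800, 1600, 3200, 6400] -> [400, 800, 1600, 3200]
```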