360 degree video is today often delivered using 4K resolution [UVR]. 4K may sound like a lot, but out of the full 360 degree video, the user sees only a limited subset at any given point in time. As an example, the HTC Vive headset provides a horizontal FOV of 110 degrees and a vertical FOV of around 110 degrees [RE]. Thus, it can show only 9.3% of the full spherical view. This results in a lot of wasted bandwidth if the entire 360 degree view is streamed to the user. Another problem is that cropping 9.3% out of a 4K frame and using it to fill the displays of the VR headset results in a very low effective resolution.
If the human eye were a digital camera, it would have 60 pixels/degree at the fovea [EC]. The human eye has a horizontal FOV of up to 208 degrees and a vertical FOV of 120 degrees [SY]. To reach 60 pixels/degree, one would need a 12480x7200 display, or a 6240x7200 display for each eye. These kinds of resolutions may become possible in smartphone displays in the coming years. One example is Samsung, which is working on an 11K smartphone display. 11K could mean something like 10560x5940 pixels. One such display positioned horizontally would provide around 50 pixels/degree. Two such displays (one for each eye) positioned vertically would provide 57 pixels/degree for the horizontal FOV (and much more for the vertical FOV). Thus, 60 pixels/degree might be reachable in the not-too-distant future: Samsung expects to have a prototype of its 11K display ready by 2018 [PL].
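The pixel-density arithmetic above is easy to check. The following is a minimal sketch, assuming that pixels are spread evenly across the field of view (real headset optics do not do this) and that the 11K panel is 10560x5940 as guessed above. The function and variable names are made up for illustration.

```python
def pixels_per_degree(pixels: int, fov_degrees: float) -> float:
    """Average pixel density, assuming pixels are spread evenly over the FOV."""
    return pixels / fov_degrees

H_FOV, V_FOV = 208, 120               # human FOV in degrees, from [SY]
TARGET_PPD = 60                       # foveal density, from [EC]
PANEL_LONG, PANEL_SHORT = 10560, 5940 # assumed "11K" panel dimensions

# Display size needed to cover the whole FOV at the target density.
required = (H_FOV * TARGET_PPD, V_FOV * TARGET_PPD)

# One panel in landscape covering both eyes.
single_landscape = pixels_per_degree(PANEL_LONG, H_FOV)

# Two panels in portrait, each covering half of the horizontal FOV.
dual_portrait_h = pixels_per_degree(PANEL_SHORT, H_FOV / 2)
dual_portrait_v = pixels_per_degree(PANEL_LONG, V_FOV)

print(required)                    # (12480, 7200)
print(round(single_landscape, 1))  # 50.8
print(round(dual_portrait_h, 1))   # 57.1
print(round(dual_portrait_v, 1))   # 88.0
```

The 88 pixels/degree in the last line is the "much more" available for the vertical FOV in the two-panel arrangement.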
But back to the title of this blog post: how much bandwidth does high-quality VR require? Let’s start with 8K video. As an example, the Japanese broadcaster NHK has developed a real-time HEVC encoder that can encode a 60fps 7680x4320 (8K) video at 85 Mbps with a bit depth of 10 bits per color and 4:2:0 chroma subsampling [AT, NTT], giving a compression ratio of 350:1 [WP1] and what NHK calls an acceptable level of quality [IEEE]. The encoder’s maximum bitrate is 340 Mbps. These figures are for monoscopic video. Stereoscopic video could add an overhead on the order of 25-30% [MERL], 40% [ALU], or 50% [IN] if using the Multi-View Video Coding (MVC) standard used by Blu-ray for 3D video. Assuming a 40% overhead, stereoscopic 360 degree 8K@60fps video would require 119.4 Mbps at NHK’s aggressive compression ratio.
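For readers who want to retrace these numbers, here is a small sketch of the calculation. It assumes that 4:2:0 subsampling averages 1.5 samples per pixel (one luma sample plus two chroma planes at quarter resolution) and that the stereoscopic overhead is a flat multiplier on the monoscopic bitrate. The helper name `raw_bitrate_bps` is made up for this post.

```python
def raw_bitrate_bps(width: int, height: int, fps: int,
                    bit_depth: int = 10,
                    samples_per_pixel: float = 1.5) -> float:
    """Uncompressed bitrate in bits per second.

    samples_per_pixel=1.5 corresponds to 4:2:0 chroma subsampling.
    """
    return width * height * fps * bit_depth * samples_per_pixel

raw_8k60 = raw_bitrate_bps(7680, 4320, 60)

# Compression ratio implied by NHK's 85 Mbps encode.
nhk_ratio = raw_8k60 / 85e6

# Stereoscopic estimate: 350:1 compression plus an assumed 40% MVC overhead.
stereo_mbps = raw_8k60 / 350 * 1.40 / 1e6

print(f"raw 8K60:      {raw_8k60 / 1e9:.2f} Gbps")  # 29.86 Gbps
print(f"implied ratio: {nhk_ratio:.0f}:1")          # 351:1
print(f"stereo 8K60:   {stereo_mbps:.1f} Mbps")     # 119.4 Mbps
```

Swapping 1.40 for 1.25 or 1.50 gives the range implied by the other overhead figures cited above.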
But in order to reduce motion sickness and achieve low motion-to-photon latency, VR needs frame rates higher than 60fps. The raw bitrate of 4320p240fps video with a 10-bit depth per color and 4:2:0 chroma subsampling would be 119.4 Gbps (a 240fps frame rate is possible with HEVC, since HEVC supports frame rates of up to 300fps [WP2]). It should be noted that doubling (or quadrupling) the frame rate does not double (or quadruple) the bitrate, because the temporal redundancy between consecutive frames increases and can be efficiently exploited by video codecs [CO]. In fact, it appears that at very high frame rates the bitrate increase eventually starts to level off [II]. As an example, according to [CV], doubling the frame rate of 720p video from 30fps to 60fps increases the bitrate by a factor of 1.48 to 1.63 when using H.264. And based on [TBB], for HEVC, going from 30fps to 60fps results in a bitrate increase of 19-42%, whereas a lower increase of 17-28% is observed when going from 60fps to 120fps. Assuming that the bitrate increase keeps shrinking at a similar rate as the frame rate rises further, due to increasing temporal redundancy, we could produce a rough estimate that stereoscopic 4320p240fps video would require around 366 Mbps if assuming a 200:1 compression ratio.
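To get a feel for how sub-linear this scaling is, the sketch below applies the 60fps to 120fps range from [TBB] to NHK's 85 Mbps 8K@60fps figure. Combining those two sources is an assumption made here for illustration, since the [TBB] percentages were not measured on NHK's encoder. The sketch stops at 120fps because the cited measurements do, and it does not try to reproduce the 366 Mbps estimate. The helper name `scale_bitrate` is made up.

```python
def scale_bitrate(base_mbps: float, increases: list[float]) -> float:
    """Apply one fractional bitrate increase per frame-rate doubling."""
    result = base_mbps
    for increase in increases:
        result *= 1 + increase
    return result

# Raw 4320p240 bitrate: 10 bits per sample, 1.5 samples per pixel (4:2:0).
raw_240_gbps = 7680 * 4320 * 240 * 10 * 1.5 / 1e9

BASE_60FPS_MBPS = 85.0  # NHK's 8K@60fps figure, used as an assumed baseline

low_120 = scale_bitrate(BASE_60FPS_MBPS, [0.17])   # best case from [TBB]
high_120 = scale_bitrate(BASE_60FPS_MBPS, [0.28])  # worst case from [TBB]
naive_120 = BASE_60FPS_MBPS * 2                    # if bitrate scaled linearly

print(f"raw 4320p240: {raw_240_gbps:.1f} Gbps")             # 119.4 Gbps
print(f"120fps:       {low_120:.1f}-{high_120:.1f} Mbps")   # 99.5-108.8 Mbps
print(f"naive 120fps: {naive_120:.0f} Mbps")                # 170 Mbps
```

Passing a second entry in the list, for example `[0.28, 0.20]`, models a further doubling to 240fps, but the second number would be a guess rather than a measured value.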
Using a theoretical hybrid of the 11K smartphone display that Samsung is developing and the 1700Hz (“zero latency”) display prototype that Nvidia has demonstrated [RV], which would help eliminate motion sickness, one could build a VR headset with two 10560x5940 displays running at a refresh rate of 1700Hz. A 10560x5940@1700fps video would produce a raw (uncompressed) bitrate of 1599.5 Gbps. Assuming a future video codec and hardware that can handle such resolutions and frame rates, and that the codec can exploit the temporal redundancy among the frames, the resulting video would probably have an encoded bitrate of a few hundred Mbit/s (or more specifically, around 2x780 Mbps if using the assumptions described above).
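The raw figure, and the compression ratio that the 780 Mbps estimate would imply, can be worked out as follows. This is only a back-of-the-envelope sketch: it assumes 10 bits per sample with 4:2:0 subsampling (1.5 samples per pixel), as in the 8K examples, and the variable names are made up.

```python
WIDTH, HEIGHT, FPS = 10560, 5940, 1700
BIT_DEPTH = 10
SAMPLES_PER_PIXEL = 1.5  # 4:2:0 chroma subsampling

# Uncompressed bitrate of a single 11K display at 1700fps.
raw_bps = WIDTH * HEIGHT * FPS * BIT_DEPTH * SAMPLES_PER_PIXEL
raw_gbps = raw_bps / 1e9

# Compression ratio needed to fit one display into 780 Mbps.
implied_ratio = raw_bps / 780e6

print(f"raw:           {raw_gbps:.1f} Gbps")   # 1599.5 Gbps
print(f"implied ratio: {implied_ratio:.0f}:1") # 2051:1
```

In other words, the estimate presumes a codec that compresses roughly six times harder than NHK's 350:1 encoder, and nearly all of that gain would have to come from the extra temporal redundancy at 1700fps.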
The above estimates are valid for so-called field-of-view streaming, where the full resolution (e.g., 8K) is used for streaming the viewport to the user’s VR headset. The figures would be in a completely different ballpark if the full 360 degree video were streamed to the user at a resolution that allows the user to extract a full-resolution (e.g., 8K) viewport from it. Streaming a full 360 degree video from which the user can extract an 11K viewport would require a resolution of something like 25344x17820 pixels. A rough estimate of the bitrate required to stream such a resolution at 1700fps would be close to 5600 Mbps. However, it is possible to go down from this figure. Facebook has developed a video coding technique that encodes 360 video with a pyramid geometry [FB]. The technique can reduce file size by 80%. Thus, applying pyramid encoding could reduce the bitrate to around 1120 Mbps. Additional techniques such as foveated encoding [WP3] combined with eye tracking have the potential to bring bitrates down even further.
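The full-sphere figures follow from simple scaling. The sketch below assumes that the bitrate grows linearly with the pixel count, starting from the 780 Mbps per-display estimate, and that pyramid encoding removes a flat 80%. Both are simplifications for illustration, not measured behavior.

```python
VIEWPORT = (10560, 5940)       # 11K viewport
FULL_SPHERE = (25344, 17820)   # full 360 degree frame
VIEWPORT_MBPS = 780.0          # per-display estimate from the text above
PYRAMID_SAVING = 0.80          # file-size reduction reported by Facebook

# How many times more pixels the full frame has than the viewport.
pixel_ratio = (FULL_SPHERE[0] * FULL_SPHERE[1]) / (VIEWPORT[0] * VIEWPORT[1])

# Assumption: bitrate scales linearly with pixel count.
full_sphere_mbps = VIEWPORT_MBPS * pixel_ratio
pyramid_mbps = full_sphere_mbps * (1 - PYRAMID_SAVING)

print(f"pixel ratio: {pixel_ratio:.1f}x")            # 7.2x
print(f"full sphere: {full_sphere_mbps:.0f} Mbps")   # 5616 Mbps
print(f"pyramid:     {pyramid_mbps:.0f} Mbps")       # 1123 Mbps
```

These land close to the rounded 5600 Mbps and 1120 Mbps used above.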
So what does the situation look like today when it comes to Internet connection speeds? According to an FCC report [BGR], the average connection speed in the US in September 2014 was 31 Mbps. Thus, even aggressively compressed 8K video is still out of reach for the average Internet user.
[ALU] Bandwidth demand forecasting, http://www.ieee802.org/3/ad_hoc/ngepon/public/sep14/harstead_ngepon_01a_0914.pdf 
[AO] Virtual Reality Check: Are Our Networks Ready for VR?, http://blog.advaoptical.com/virtual-reality-check-are-our-networks-ready-for-vr 
[AT] Live 8Kp60 Demo Based on World’s First Single-Card 8K HEVC Encoder Board Wins Award at CEATEC!, http://blog.advantech.com/tech-blogs/ntg/2016/10/live-8kp60-demo-based-worlds-first-single-card-8k-hevc-encoder-board-wins-award-ceatec/ 
[Arris] ARRIS Gives Us a Hint of the Bandwidth Requirements for VR, http://www.onlinereporter.com/2016/06/17/arris-gives-us-hint-bandwidth-requirements-vr/ 
[BGR] Average U.S. Internet speed has more than tripled since 2011, http://bgr.com/2016/01/02/us-internet-speeds-average/
[CO] High frame rate television, http://cognitus-h2020.eu/index.php/hfr/ 
[CV] What you really need to know about Video Conferencing Systems, http://www.c21video.com/whitepapers/what_to_know_about_video_conferencing.html 
[EC] What Kind of a Resolution Is Needed to Deliver Perfect VR?, http://edge-of-cloud.blogspot.fi/2016/11/what-kind-of-resolution-is-needed-to.html 
[FB] Next-generation video encoding techniques for 360 video and VR, https://code.facebook.com/posts/1126354007399553/next-generation-video-encoding-techniques-for-360-video-and-vr/ 
[IEEE] HEVC Encoder for Super Hi-Vision, http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6775906 
[II] Compression Standards, http://www.interintelteam.com/CompressionStandards.htm 
[IN] An Overview Digital Images and Video: Display, Representations, and Standards, http://www.informit.com/articles/article.aspx?p=2418908&seqNum=4 
[JB] NHK Research Labs Serves As Breeding Ground For 8K Innovation, http://www.japanbullet.com/technology/nhk-research-labs-serves-as-breeding-ground-for-8k-innovation
[MERL] Representation and Coding Formats for Stereo and Multiview Video, http://www.merl.com/publications/docs/TR2010-011.pdf 
[NTT] NTT develops the world’s best performance 8K HEVC real-time encoder through dedicated LSI use, http://www.ntt.co.jp/news2016/1602e/160215a.html
[PL] Forget about 4K and even 8K, Samsung is making 11K displays, http://www.pocket-lint.com/news/134580-forget-about-4k-and-even-8k-samsung-is-making-11k-displays 
[RE] FOV Comparison, https://www.reddit.com/r/Vive/comments/4ceskb/fov_comparison/ 
[RV] NVIDIA Prototype 1,700Hz Zero Latency Display, http://www.roadtovr.com/nvidia-demonstrates-experimental-zero-latency-display-running-at-17000hz/ 
[SY] Parameters of Human Vision and Viewshed Definition, http://www.stockyardhillwindfarm.com.au/pdf/PPAR_Annexes/ATS/Annexes/Annex_J/AnnexJ-LVA_PART_12.pdf 
[TBB] HEVC, The Key to Delivering an Enhanced Television Viewing Experience, “Beyond HD”, https://www.thebroadcastbridge.com/content/entry/992/hevc-the-key-to-delivering-an-enhanced-television-viewing-experience-beyond 
[UVR] Valve Partners with Pixvana to Improve 360-Degree Video Experience, http://uploadvr.com/valve-pixvana-360-video/ 
[WP1] High Efficiency Video Coding implementations and products, https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding_implementations_and_products 
[WP2] High Efficiency Video Coding tiers and levels, https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding_tiers_and_levels 
[WP3] Foveated Imaging, https://en.wikipedia.org/wiki/Foveated_imaging
