Saturday, April 8, 2017

What is 5G Network Slicing?

If you have been reading as many 5G white papers as I have, you are already familiar with the vision that 5G will be more than a new Radio Access Technology (RAT), that is, 5G New Radio (5G NR), and additional spectrum. This will become true when the industry moves from NR Non-Standalone, which uses existing LTE radio and core network as an anchor for mobility management and coverage while adding a new 5G carrier, to NR Standalone (i.e., “full 5G” including a new 5G core). Besides NR, one of the other key 5G features is network slicing [3GPP-1]. The aim of network slicing is to transform the network from using a rather static one-network-fits-all approach to a dynamic model where multiple logical networks can be created on top of the same physical infrastructure to fulfill the diverse needs of different use cases such as enhanced Mobile BroadBand (eMBB), critical Internet of Things (IoT), and massive IoT.

A network slice is a logical network that includes a set of network functions, which may be Virtual Network Functions (VNFs) or Physical Network Functions (PNFs), and corresponding resources, such as compute, storage, and networking resources. It is sliced out from the physical network in order to provide specific capabilities and characteristics (e.g., customized connectivity with ultra-low latency, extreme reliability, and/or value-added services) that the application running within the slice requires [3GPP-2]. A slice could also be seen as a unique profile for an application, defined as a set of services within the network [WW], or as a network function chain built to support a given use case, traffic type or a customer [ETSI].

The instantiation of a network slice is called a Network Slice Instance (NSI). An NSI is a deployed set of network functions and it delivers network slice services according to a network slice blueprint, which is a description of the structure, configuration and the plans or work flows for how to instantiate and control the Network Slice Instance during its lifecycle [IETF-2, NGMN-2]. NSI is the end-to-end realization of network slicing. The NSI contains access, core, and transport network functions and information relevant to the interconnections between these network functions such as the topology of connections and individual link requirements like Quality of Service (QoS) attributes [3GPP-1].

A slice may serve a particular purpose or a specific category of services (e.g., a specific use case or a specific traffic type) or even individual customers, in which case it may be created on-demand following a Network-as-a-Service (NaaS) approach [3GPP-1].

Each network slice exists within a Software Defined Networking (SDN) overlay [BRO]. Within the slice, a collection of network functions is chained together according to the requirements of the service using the slice [WW, 5GA]. Each slice sharing the same underlying physical infrastructure is a logically separated and isolated system that can be designed with different network architecture, engineering mechanism and network provisioning. When appropriate, slices can share functional components.

Examples of Network Slices


Some examples of network slices and use cases using them include [NGMN, NM, KCF]:

  • A 5G slice for an Ultra High Definition (UHD) or Virtual Reality (VR) video streaming service could include for example a virtualized BBU (BaseBand Unit) VNF, a 5G core network User Plane (UP) and Control Plane (CP) VNFs, a cache server running as a VNF in the edge cloud, and a Mobile Video Optimization (MVO) server VNF in a central cloud.
  • A 5G slice for typical eMBB smartphone use (e.g., voice or video calls) could include a virtualized BBU in the edge cloud. The core cloud could include 5G core UP and CP VNFs (e.g., access management, session management, full mobility management, and charging), while IP Multimedia Subsystem (IMS) and TCP optimization VNFs run in a central cloud.
  • A 5G slice supporting a critical IoT (e.g., self-driving car) use case is likely to require a high level of security, high level of mobility, extreme reliability, and extremely low latency. For such a slice, all the needed network functions (e.g., 5G core UP and a V2X (Vehicle-to-Everything) server VNFs) and the control software for the self-driving car can be instantiated in a node located at the edge of the network (i.e., in the edge cloud) to meet the very strict latency requirements.
  • A network slice supporting a massive IoT use case (e.g., stationary weather sensors infrequently sending small amounts of data) could include a virtualized BBU in the edge cloud. The core cloud could include some basic lightweight 5G core control plane functions (e.g., access management and session management) configured without mobility management features, together with contention-based resources for access, and an IoT server VNF.
  • In addition to the four slices above, there could be a generic slice that provides best-effort connectivity for unknown use cases and traffic types.

Some of the above-mentioned slices are illustrated in Figure 1 below. More examples of network slicing use cases are available for example in [IETF-6].


Figure 1 – Examples of network slices

As we can observe, the use cases above require different types of features and network characteristics when it comes, for example, to mobility, charging, security, policy control, latency, reliability, and bandwidth. Not all use cases will require all of the features that the network is capable of providing. Furthermore, in some cases the applications will be brought closer to the users, into the edge cloud, and in some cases the RAN and/or the transport network will require a specific configuration.

Network Slicing Related Technologies 


Network slicing leverages multiple different technologies and concepts, including [NOK-1, HW]:

  • End-to-end management and orchestration – an E2E management and orchestration entity configures the infrastructure layer (which contains all physical resources of the network, including compute, storage and networking), the business enablement layer (which includes the network functions), and the business application layer (which consists of applications or services of the network operator or tenant) according to the needs of the application using the slice, and supervises them during runtime [5GN]. This includes defining the network slices, chaining together the required VNFs and PNFs, and mapping them onto the infrastructure equipment. It also includes resource management, scaling the capacity of network functions, and managing their geographic distribution. The E2E management and orchestration entity will build on technologies such as NFV, SDN, and SON (Self-Organizing Network). The management and orchestration entity communicates with the OSS/BSS of the network service provider to, among other things, report status and receive requirements [5GPPP].
  • Distributed cloud infrastructure, which refers to virtual infrastructure distributed across a mobile network. In the most ambitious visions, nearly any node in the network will be capable of hosting distributed applications and VNFs [ER-5]. This brings applications closer to (or perhaps even into) the RAN and enables optimizing network characteristics like latency and bandwidth.
  • Service chaining – to support diverse services, the E2E management and orchestration system can orchestrate multiple network and service functions by chaining them together on demand in a specific order to create control plane and data plane graphs [5GPPP]. 
  • Cloud-native design, including, for example, microservices, containers, and a Service Oriented Architecture (SOA) in which libraries of network functions can be requested from a VNF catalog and composed into end-to-end service chains on demand [HW].
  • Network Function Virtualization (NFV) – NFV installs network function software (VNFs), such as the MME (Mobility Management Entity), PDN (Packet Data Network) Gateway, SGW (Serving Gateway), and PCRF (Policy and Charging Rules Function) in the core network and the BBU in the RAN, into Virtual Machines (VMs; or perhaps into containers in a more cloud-native system) deployed on virtualized COTS (Commercial Off-The-Shelf) servers instead of onto dedicated network equipment [KCR].
  • Software-Defined Networking (SDN) – SDN provisions the connectivity among VMs or containers located in edge and core clouds [KCR].  An SDN controller configures the service chain of network functions (VNFs) as a logical overlay among them [5GPPP].
  • Analytics and machine learning are required to make the network slicing process highly automated [NOK]. High levels of automation will be needed since running multiple slices on top of the same physical infrastructure adds complexity to network operations and optimization. Analytics and machine learning will be used to enhance end-to-end management and orchestration (see the first bullet).
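As a toy illustration of the service chaining concept from the list above, VNFs can be modeled as functions composed in a configured order. The VNF names and packet structure below are made up for illustration; they are not part of any 3GPP or ETSI specification.

```python
def make_chain(vnfs):
    """Return a function that passes a packet through the VNFs in order."""
    def chain(packet):
        for vnf in vnfs:
            packet = vnf(packet)
        return packet
    return chain

# Hypothetical VNFs: each appends its name so we can see the traversal order.
def firewall(pkt):
    return {**pkt, "path": pkt["path"] + ["firewall"]}

def nat(pkt):
    return {**pkt, "path": pkt["path"] + ["nat"]}

def tcp_optimizer(pkt):
    return {**pkt, "path": pkt["path"] + ["tcp-opt"]}

# An eMBB slice might chain these in a specific order on demand:
embb_chain = make_chain([firewall, nat, tcp_optimizer])
result = embb_chain({"dst": "10.0.0.1", "path": []})
# result["path"] == ["firewall", "nat", "tcp-opt"]
```

The point of the sketch is that the chain itself is data (the list of VNFs), so an orchestrator can build different chains for different slices without changing the functions themselves.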

End-to-End Network Slicing


A network slice can span all the domains of the network, including [5GA, NGMN, 3GPP-2]:

  • RAN slicing: for example, a dedicated radio configuration (RAT settings) or specific RAT.
  • Transport network slicing: for example, a specific configuration of the transport network to support flexible placement of network functions. A network slice also includes the transport resources between the network functions.
  • Core network slicing: here the slice includes the control plane and user plane network functions and their resources, such as compute, storage, and network resources. In the core network, NFV and SDN virtualize the network elements and functions in each slice to meet the requirements of the user of the slice.
  • Edge cloud and data centers [IETF-3]: slicing also covers applications or VNFs (e.g., a V2X application) running on cloud nodes in an edge cloud (where the RAN works as an edge cloud [KCR]) or in a central data center.
  • Specific configuration of a 5G device and any dedicated resources that an end-user device allocates to a specific NSI [IETF-2].

More information about RAN, core network and transport network slicing is available in the sections below.

The details of how network slicing will work are still being nailed down. However, it is likely that slice selection will be controlled by the network [HW]. The UE will be able to provide network slice selection assistance information based on policy. The UE may belong to more than one slice on the same RAN and core network.

RAN Slicing


In the RAN, slicing can be built on physical radio resources or on logical resources abstracted from physical radio resources [5GA]. RAN slicing could be implemented by mapping a slice identifier to a set of configuration rules applied to the RAN control plane (i.e., RRC, Radio Resource Control in the case of LTE) and user plane functions, that is PDCP (Packet Data Convergence Protocol), RLC (Radio Link Control), MAC (Medium Access Control) in the case of LTE. Some network functions, such as mobility management, can be common to several slices. There will also be common control functions that coordinate RAN resource usage among the slices.
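The slice-identifier-to-configuration-rules mapping described above can be sketched as a simple lookup. All slice names and parameter values here are hypothetical, not 3GPP-defined; the point is only that a slice ID selects a bundle of control plane and user plane settings, with a best-effort fallback for unknown slices.

```python
# Hypothetical RAN configuration rules per slice identifier.
RAN_SLICE_CONFIG = {
    "embb":         {"rrc": "full", "pdcp_rohc": True,  "mac_scheduler": "proportional_fair"},
    "critical_iot": {"rrc": "full", "pdcp_rohc": False, "mac_scheduler": "strict_priority"},
    "massive_iot":  {"rrc": "lite", "pdcp_rohc": False, "mac_scheduler": "contention_based"},
}

DEFAULT_CONFIG = {"rrc": "full", "pdcp_rohc": False, "mac_scheduler": "round_robin"}

def ran_config_for(slice_id):
    """Map a slice identifier to its RAN configuration rules; fall back to a
    generic best-effort profile for unknown slices."""
    return RAN_SLICE_CONFIG.get(slice_id, DEFAULT_CONFIG)
```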

Radio slices in the RAN may share radio resources (time, frequency, space) and the corresponding communication hardware such as digital baseband processing components or analog radio components [5GA]. The sharing could be done in a dynamic or static manner depending on the network slice’s configuration rules. In case of dynamic sharing, each slice obtains resources based on its demand and priority using either scheduling (the slice requests resources from a central scheduler, which allocates them for example based on the overall traffic load or the priority of the slice) or contention. In the case of static sharing, a slice is pre-configured to operate in a dedicated resource throughout its operation time. Static sharing enables guaranteed resource allocation to the slice, whereas dynamic resource sharing allows overall resource usage optimization.
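A minimal sketch of the scheduled variant of dynamic sharing: a central scheduler hands out radio resource blocks to slices in priority order, each slice receiving at most its demand. Slice names, priorities, and resource counts are illustrative.

```python
def allocate(total_rbs, slices):
    """Allocate resource blocks to slices in priority order (lower number =
    higher priority), each slice getting at most its current demand.

    slices: list of (name, demand_rbs, priority) tuples.
    Returns a dict mapping slice name -> allocated resource blocks.
    """
    alloc = {}
    remaining = total_rbs
    for name, demand, _priority in sorted(slices, key=lambda s: s[2]):
        given = min(demand, remaining)
        alloc[name] = given
        remaining -= given
    return alloc

# The critical IoT slice (priority 1) gets its full demand of 30 blocks
# first; the eMBB slice receives the remaining 70 of its 80-block demand:
# allocate(100, [("embb", 80, 2), ("critical_iot", 30, 1)])
# -> {"critical_iot": 30, "embb": 70}
```

Static sharing would correspond to pre-assigning a fixed block of resources per slice regardless of demand, which is simpler but forgoes this kind of load-dependent optimization.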

Slice-specific configuration rules for the control plane functions tailor the RAN control plane functions for each slice. This is because not all slices will require all control plane functions. In the RAN Layer 2 (L2) user plane, each slice may have a configured user plane protocol stack. For example, a slice supporting different QoS classes could be configured to use separate packet segmentation and scheduling layers similar to the RLC (Radio Link Control) and MAC (Medium Access) layers in LTE.

There is also a need for network slice specific admission control so that the system can meet the initial access requirements of various network slices [5GA]. For example, a network slice for critical IoT must get guaranteed low-latency access - consider for instance a slice supporting a remote surgery use case.

The end-to-end network slice also includes the virtualized BaseBand Unit (BBU) - the waveform samples carried in the fronthaul network are processed by a VNF that implements the BBU functions instead of a physical BBU [5GPPP].

How much of the baseband processing is virtualized as VNFs and centralized depends on fronthaul availability [NOK] – in some cases, such as with extreme 5G data rates and massive beamforming, CPRI (Common Public Radio Interface, i.e., transmission links between central BBUs and distributed RRHs) fronthaul capacity requirements may be too high and prevent fully centralized baseband processing [ER-6]. Thus, the radio architecture needs to be flexible to cope with different fronthaul deployments. The RAN control plane is more straightforward to centralize since it does not have extreme bitrate requirements [ER-6]. To enable massive beamforming, the physical layer (PHY) could be distributed closer to the antenna while the upper layers of the stack, that is, MAC, RLC, and PDCP, could be centralized. In a fully centralized deployment, all baseband processing (RAN L1, L2, and L3 protocol layers) is located in a central data center.

Core Network Slicing


The 5G core network (a.k.a., Next-Generation Core, NGC) will be a key element for network slicing since it manages sessions, QoS, security, policy, and so forth [HW]. Network slicing allows core networks to be built in a flexible way [ER-2], tailored to the needs of the application using the slice. SDN and NFV allow traditional network architectures to be broken down into customizable elements (VNFs) that can be chained together programmatically using Service Function Chaining (SFC; a.k.a., service chaining) to provide the required level of connectivity and network functionality. An SDN controller configures service chaining to build a network slice for each user or service [ER-4]. SFC makes it possible to dynamically configure user plane traffic to be routed through a chain of network components which provide value added services [SS].

In LTE, service function chains live in the SGi-LAN [IETF-1], the part of the network between the PDN (Packet Data Network) Gateway of the Evolved Packet Core (EPC) and the external packet data network (e.g., the Internet) [Int-1]. It is where service providers deploy IP functions such as firewalls, NATs (Network Address Translators), policy and charging enforcement functions, traffic detection functions, CDN (Content Delivery Network) functions, video transparent caching functions, and video optimization functions. In coordination with SDN and NFV, service chaining in the SGi-LAN can be used to optimally steer traffic through selected SGi-LAN network functions.

5G networks are expected to remove the separation between the network service functions that are traditionally placed in the SGi-LAN and the connectivity service, that is, the core network and RAN [NOK]. This allows service chaining to take advantage of edge clouds. Thus, virtualized SGi-LAN functions can be chained together with other VNFs (including the 5G core network VNFs) into a network slice and placed wherever appropriate, including router-based compute blades, COTS x86 servers deployed at the edge, or in central data centers [AF]. The placement decisions are made by a central management and orchestration system.

For the interconnection of core network control plane functions, in addition to the traditional 3GPP architecture of point-to-point interfaces between defined functions, 5G is expected to also use a service-oriented (cloud-native) model in which components query a Network Function Repository Function (NRF) to discover and communicate with each other [3GPP-3, HW]. The NRF allows libraries of functions to be requested from a VNF catalog and composed into service chains on demand. Using such a service-oriented model, network functions on the control plane can access each other's services through APIs (Application Programming Interfaces) [3GPP-2].
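The register-and-discover pattern behind the NRF can be sketched as a toy registry. The class, method names, and endpoints below are illustrative and do not follow the actual 3GPP service-based interface definitions.

```python
class ToyNrf:
    """Toy Network Function Repository Function: network functions register
    their service endpoints and discover each other by NF type."""

    def __init__(self):
        self._registry = {}

    def register(self, nf_type, endpoint):
        """A network function announces an endpoint for the given NF type."""
        self._registry.setdefault(nf_type, []).append(endpoint)

    def discover(self, nf_type):
        """Return all known endpoints for an NF type (empty list if none)."""
        return list(self._registry.get(nf_type, []))

nrf = ToyNrf()
nrf.register("SMF", "http://smf-1.core.example/sbi")  # hypothetical endpoint

# An AMF needing session management can now look up an SMF instance:
smf_endpoints = nrf.discover("SMF")
# smf_endpoints == ["http://smf-1.core.example/sbi"]
```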

A 5G device will associate an application with one out of multiple parallel PDU (Packet Data Unit [5GA-2]) sessions, each PDU session corresponding to one core network slice and one RAN slice [IETF-5]. A PDU session is a 5G concept for an association between the device and a data network (e.g., IP or Ethernet). Part of the core network control plane may be common to several slices. One of the common control plane functions is the Slice Selection Function (SSF), which is in charge of selecting a core network slice instance. The SSF is not specific to any particular network slice [3GPP-3]. Other common control plane network functions can include the Access and Mobility Management Function (AMF), Authentication Function, and NAS (Non-Access Stratum) proxy function. Different slices may also have dedicated core network control plane functions, such as the Session Management Function (SMF), which manages PDU sessions. User plane functions are dedicated to each slice.
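The association described above, where each application is bound to a PDU session that in turn maps to one core network slice and one RAN slice, could be modeled as nested lookups. All identifiers (DNN names, slice names, session IDs) are made up for illustration.

```python
# Hypothetical device-side state: parallel PDU sessions, each tied to one
# core slice and one RAN slice.
pdu_sessions = {
    1: {"dnn": "internet", "core_slice": "embb-core", "ran_slice": "embb-ran"},
    2: {"dnn": "v2x",      "core_slice": "ciot-core", "ran_slice": "ciot-ran"},
}

# Hypothetical policy mapping applications to PDU sessions.
app_to_session = {"browser": 1, "video": 1, "autodrive": 2}

def slices_for_app(app):
    """Resolve an application to its (core slice, RAN slice) pair via its
    PDU session."""
    session = pdu_sessions[app_to_session[app]]
    return session["core_slice"], session["ran_slice"]

# e.g. slices_for_app("autodrive") -> ("ciot-core", "ciot-ran")
```

Note how two applications ("browser" and "video") can share one PDU session and hence one slice pair, while a latency-critical application uses a different session and slice.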

Transport Network Slicing


In addition to the RAN and core network, the transport network will also play a key role in 5G in enabling flexibility and the ability to address the requirements of the broad range of use cases that 5G will need to support [5GPPP]. To make the transport network programmable and capable of supporting network slicing, SDN and NFV need to be used to separate the control and data planes. This will allow the transport network to dynamically interconnect distributed 5G RAN and core network functions hosted on distributed cloud infrastructure (i.e., the “network cloud”). In general, the 5G transport network will consist of integrated optical and wireless (e.g., mmWave transmission for fronthaul and backhaul) network infrastructure. In 5G transport networks, the fronthaul and backhaul will become one unified IP network known as the 5G Xhaul [EOC].

Programmability will allow the transport network to support the requirements that network slicing brings: support for diverse use cases, network sharing, isolation, Service Level Agreement (SLA) fulfillment, short Time-To-Market (TTM), efficient scaling, and interconnection of distributed network functions over links that fulfill the required performance levels for bandwidth, delay, and availability. SDN-based resource and service orchestration will orchestrate not only the RAN, core, and cloud, but also the transport network, and a 5G network slice will also include the transport layer [ER-1].

Introducing a fully centralized control plane (i.e., full SDN) might be easier for optical transport than packet networks since in legacy optical transport networks, most control functions are already separated from the data plane nodes [ER-1]. In contrast, in legacy packet-switched transport nodes, the two planes are often tightly coupled.

As an example of transport network slicing, Deutsche Telekom (DT) has demonstrated bandwidth based transport network slicing (i.e., slicing based on the bandwidth requirements of applications [Wang]). DT’s transport network consists of T-SDN (Transport SDN) controllers and underlying DWDM (Dense Wavelength Division Multiplexing) nodes. The controllers generate a series of specific data forwarding paths based on slice topology and service requirements [TEL].

Network slicing needs to be configured both for the backhaul transport network (between the RAN/edge cloud and the core cloud) and for the fronthaul transport network (between the edge cloud and a RRH at the cell site) [KCR].

Between the edge and core clouds (i.e., backhaul), inter-VM networking is set up with IP/MPLS-SDN (Internet Protocol / Multi-Protocol Label Switching SDN) and T-SDN [KCR]. IP/MPLS networks, as some of the most widely deployed mobile transport networks, need to provide the functionality and capability required by network slicing [IETF-4]. The hypervisor of a virtualized server in the core cloud might run a vRouter/vSwitch. The SDN controller provisions the virtualized servers and the Datacenter (DC) Gateway (GW) routers (i.e., the Provider Edge (PE) routers of the MPLS L3 (Layer 3) VPNs (Virtual Private Networks) installed in the cloud DC) to create SDN tunnels (e.g., MPLS over GRE (Generic Routing Encapsulation) or VXLAN (Virtual Extensible Local Area Network)) between each VM in the core cloud (e.g., a 5G IoT core) and the DC GW router. The SDN controller then maps these tunnels to MPLS L3 VPNs (such as an IoT VPN). The process is the same in the edge cloud, creating an IoT slice connecting from the edge cloud to the IP/MPLS backbone, and all the way to the core cloud.
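The controller workflow just described, one tunnel per VM, each tunnel mapped into a per-slice L3 VPN, can be modeled as a toy data structure. The VNI and route-target values are made up; a real controller would of course also push the corresponding forwarding state to the vSwitches and DC GW routers.

```python
def build_slice_overlay(slice_name, vms, vni, vpn_rt):
    """Model the per-slice overlay: one VXLAN tunnel per VM toward the DC GW,
    all mapped to the same MPLS L3 VPN (identified by its route target)."""
    tunnels = [{"vm": vm, "encap": "vxlan", "vni": vni} for vm in vms]
    return {"slice": slice_name, "tunnels": tunnels, "l3vpn_rt": vpn_rt}

# Hypothetical IoT slice spanning two core cloud VMs:
iot_slice = build_slice_overlay(
    "iot", ["iot-core-vm1", "iot-core-vm2"], vni=5001, vpn_rt="65000:5001"
)
# iot_slice["tunnels"] now holds one tunnel descriptor per VM, and the shared
# route target ties both tunnels into the same IoT VPN.
```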

The fronthaul between the edge cloud and a 5G NR RRH also needs to be sliced [KCR]. It is envisioned that the 5G Cloud RAN will use packet-based T-SDN as the fronthaul network to transport both control plane and data plane traffic between RRHs and BBUs [ODL-1]. RRH slicing, which is a new type of RAN sharing, could segment a physical RRH into multiple slices. It could dynamically create a virtual RRH (vRRH) instance using a single RRH slice or multiple RRH slices, and then associate the vRRH with the BBU requesting it. This means that a novel RRH infrastructure is needed to provide virtualization and programmability of RRH networks for Cloud RAN.

The OpenDaylight open source SDN controller project is working on an OCP (Open Radio Equipment Interface Control & Management Protocol) plugin [ODL-2]. OCP is an ETSI standard protocol for control and management of RRH equipment. The OpenDaylight OCP project addresses the need for a southbound plugin that allows applications and controller services to interact with RRHs using OCP. The addition of the OCP plugin to OpenDaylight will make it possible to build an RRH controller on top of OpenDaylight to centrally manage deployed RRHs and integrate the RRH controller with T-SDN, achieving joint RRH and fronthaul network provisioning in Cloud RAN.

References


[3GPP-1] 3GPP TR 28.801 V1.0.0 (2017-03), Technical Report, 3rd Generation Partnership Project;
Technical Specification Group Services and System Aspects; Telecommunication management; Study on management and orchestration of network slicing for next generation network  (Release 14), http://www.3gpp.org/ftp//Specs/archive/28_series/28.801/28801-100.zip

[3GPP-2] 3GPP TS 23.501 V0.3.0 (2017-02), Technical Specification, 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; System Architecture for the 5G System; Stage 2 (Release 15), http://www.3gpp.org/ftp//Specs/archive/23_series/23.501/23501-030.zip

[3GPP-3] 3GPP TR 23.799, Technical Report, 3rd Generation Partnership Project; Study on Architecture for Next Generation System, Release 14, https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3008

[5GA] Network Slicing for 5G Networks & Services, http://www.5gamericas.org/files/3214/7975/0104/5G_Americas_Network_Slicing_11.21_Final.pdf

[5GA-2] Wireless Technology Evolution towards 5G: 3GPP Release 13 to Release 15 and beyond, http://www.5gamericas.org/files/6814/8718/2308/3GPP_Rel_13_15_Final_to_Upload_2.14.17_AB.pdf

[5GN] 5G NORMA Deliverable D3.1, Functional Network Architecture and Security Requirements,
https://5gnorma.5g-ppp.eu/wp-content/uploads/2016/11/5g_norma_d3-1.pdf

[5GPPP] 5G PPP Architecture Working Group, View on 5G Architecture, https://5g-ppp.eu/wp-content/uploads/2014/02/5G-PPP-5G-Architecture-WP-For-public-consultation.pdf

[AF] Designing 5G-Ready Mobile Core Networks, https://www.affirmednetworks.com/wp-content/themes/renden/pdf/5G_Whitepaper_Heavy_Reading.pdf

[BRO] The Path to 5G with Programmable Mobility Management, https://www.brocade.com/content/dam/common/documents/content-types/whitepaper/brocade-path-to-5g-with-programmable-mobility-management-wp.pdf

[EOC] What is 5G Xhaul, http://edge-of-cloud.blogspot.fi/2017/03/what-is-5g-xhaul.html

[ER-1] Flexibility in 5G Transport Networks: The Key to Meeting the Demand for Connectivity, https://www.ericsson.com/res/thecompany/docs/publications/ericsson_review/2015/etr-5g-transport-networks.pdf

[ER-2] A Vision of the 5G Core: Flexibility for New Business Opportunities, https://www.ericsson.com/res/thecompany/docs/publications/ericsson_review/2016/etr-5G-core-vision.pdf

[ER-3] X-haul, fronthaul and backhaul network research, https://www.ericsson.com/research-blog/5g/x-haul-fronthaul-and-backhaul-network-research/

[ER-4] Why 5G Network Slices? https://www.ericsson.com/spotlight/cloud/blog/2015/02/17/5g-network-slices/

[ER-5] The Programmable Network Cloud, https://www.ericsson.com/res/docs/whitepapers/wp-the-programmable-network-cloud.pdf

[ER-6] Cloud RAN, https://www.ericsson.com/res/docs/whitepapers/wp-cloud-ran.pdf

[ETSI] Next Generation Protocols (NGP); Scenarios Definitions, http://www.etsi.org/deliver/etsi_gs/NGP/001_099/001/01.01.01_60/gs_NGP001v010101p.pdf

[HW] Service-Oriented 5G Core Networks, http://www-file.huawei.com/~/media/CORPORATE/PDF/white%20paper/Heavy%20Reading%20Whitepaper-%20Service-Oriented%205G%20Core%20Networks.pdf

[IETF-1] Service Function Chaining Use Cases in Mobile Networks, https://tools.ietf.org/html/draft-ietf-sfc-use-case-mobility-07

[IETF-2] Network Slicing Architecture, draft-geng-netslices-architecture-00, https://tools.ietf.org/html/draft-geng-netslices-architecture-00

[IETF-3] Network Slicing - Introductory Document and Revised Problem Statement, draft-gdmb-netslices-intro-and-ps-02, https://tools.ietf.org/html/draft-gdmb-netslices-intro-and-ps-02

[IETF-4] Problem Statement of Network Slicing in IP/MPLS Networks, draft-dong-network-slicing-problem-statement-00, https://tools.ietf.org/html/draft-dong-network-slicing-problem-statement-00

[IETF-5] Network Slicing - 3GPP Use Case, draft-defoy-netslices-3gpp-network-slicing-00, https://tools.ietf.org/html/draft-defoy-netslices-3gpp-network-slicing-00

[IETF-6] Network Slicing Use Cases: Network Customization for Different Services, draft-qin-netslices-use-cases-00, https://tools.ietf.org/html/draft-qin-netslices-use-cases-00

[Int-1] Etisalat and Intel Virtualizing the Internet Gateway Gi-LAN for Service Flexibility, https://builders.intel.com/docs/networkbuilders/Etisalat-and-Intel-virtualizing-the-internet-gateway-Gi-LAN-for-service-flexibility.pdf

[KCR] Korea Communication Review Q4 2015, http://5g.itrc.ac.ir/sites/default/files/korea_communication_review.pdf

[NGMN] NGMN 5G White Paper, https://www.ngmn.org/uploads/media/NGMN_5G_White_Paper_V1_0.pdf

[NGMN-2] NGMN 5G P1 Requirements & Architecture Work Stream End-to-End Architecture Description of Network Slicing Concept by NGMN Alliance, https://www.ngmn.org/uploads/media/160113_Network_Slicing_v1_0.pdf

[NM] E2E Network Slicing - Key 5G technology : What is it? Why do we need it? How do we implement it? http://www.netmanias.com/en/post/blog/8325/5g-iot-network-slicing-sdn-nfv/e2e-network-slicing-key-5g-technology-what-is-it-why-do-we-need-it-how-do-we-implement-it

[NOK] 5G - a System of Systems for a programmable multi-service architecture, http://resources.alcatel-lucent.com/asset/200012

[ODL-1] Project Proposals: OCP Plugin, https://wiki.opendaylight.org/view/Project_Proposals:OCP_Plugin

[ODL-2] OCP Plugin: Main, https://wiki.opendaylight.org/view/OCP_Plugin:Main

[SS] The Road from EPC to 5G, https://www.slideshare.net/AlbertoDiez4/mobile-plots-from-epc-to-5g

[TEL] Deutsche Telekom and Huawei demonstrate world’s first 5G E2E autonomous network slicing, https://www.telekom.com/en/media/media-information/archive/5g-autonomous-network-slicing-demonstrated-444778

[TP] LTE Overview, https://www.tutorialspoint.com/lte/lte_quick_guide.htm

[Wang] An Optimal Slicing Strategy for SDN based Smart Home Network, http://ir.siat.ac.cn:8080/bitstream/172644/6290/1/%E5%8D%97%E6%B2%99%E6%89%80-%E6%99%BA%E6%8E%A72014008.pdf

[WW] Elements of a 5G Core Network, https://www.wirelessweek.com/article/2016/10/elements-5g-core-network

Tuesday, April 4, 2017

How Does a Jitter Buffer Work?

In an earlier blog post, I wrote about the components of motion-to-photon latency in Virtual Reality (VR) systems [EOC]. In most video streaming applications, the dominant latency contributor is buffering at the receiving (decoder) side. This blog post will take a closer look at buffering, focusing on jitter buffers.

In IP networks, jitter refers to the variation in latency on a packet flow between two systems [TT]. Jitter causes some packets to take longer to travel from the sender to the receiver. Jitter results from network congestion, timing drift and route changes. Due to jitter, packets may arrive at the destination late, they may arrive out of order, or may get completely lost if for example buffer overflows occur [Vocal]. In the case of VoIP and video conferencing, jitter can cause audio and video artifacts.
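In practice, receivers often quantify jitter with the smoothed interarrival estimator defined for RTP in RFC 3550 (Section 6.4.1): for each packet pair, the change in relative transit time D is folded into a running estimate as J += (|D| - J) / 16. A direct implementation:

```python
def rtp_jitter(send_times, recv_times):
    """Smoothed interarrival jitter per RFC 3550, Section 6.4.1.

    send_times[i] and recv_times[i] are the send and receive timestamps of
    packet i (same clock units). Returns the final jitter estimate J.
    """
    j = 0.0
    prev_transit = None
    for s, r in zip(send_times, recv_times):
        transit = r - s  # relative transit time of this packet
        if prev_transit is not None:
            d = abs(transit - prev_transit)  # change vs. previous packet
            j += (d - j) / 16.0              # exponential smoothing, gain 1/16
        prev_transit = transit
    return j

# Packets with identical transit times show zero jitter:
# rtp_jitter([0, 20, 40], [10, 30, 50]) -> 0.0
```

The 1/16 gain makes the estimate react to sustained jitter while damping single outliers, which is why it is a common input for adaptive jitter buffer sizing.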

When it comes to jitter, packet loss and latency, QoS requirements and recommendations for VoIP are as follows [Cisco-1]:

  • Average one-way jitter should be targeted at less than 30ms. Research has shown that voice quality degrades significantly when jitter consistently exceeds 30ms.
  • Maximum packet loss should be 1%.
  • One-way latency (a.k.a., mouth-to-ear latency) should be no more than 150ms.

The ability to compensate for network jitter is one of the key factors impacting the overall quality of VoIP and video conferencing [Vocal]. This compensation is achieved using a jitter buffer. The jitter buffer adds directly to the end-to-end (mouth-to-ear) delay. As an example, a static jitter buffer of 100ms reduces the end-to-end delay budget by 100ms. On the one hand, setting too large a size for the jitter buffer consumes so much of the delay budget that the rest of the network may be forced to meet a tighter delay target than would otherwise be necessary [Cisco-1]. On the other hand, a jitter buffer too small to accommodate the network jitter can result in buffer underflows (i.e., the buffer is empty when the codec needs to play out a sample) or overflows (i.e., the buffer becomes full and an arriving packet cannot be queued in it). If the jitter is so large that packets are received outside the range of the buffer, the out-of-range packets are discarded and dropouts (clipping) are heard in the audio.

As an example, let’s assume a static jitter buffer set to 100ms. This means that the first voice sample that is received when the jitter buffer is initially empty is held in the buffer for 100ms before it is sent to the codec for playout [Giralt]. A subsequent packet can be delayed as much as 100ms with respect to the first packet without loss of voice continuity. However, if a subsequent packet is delayed more than 100ms, there will be a dropout in the audio (unless packet loss concealment is performed – more about that below). If packets are received on average at a lower rate than the fixed interval at which they are fed to the codec, the codec will eventually starve (i.e., a buffer underflow occurs).  When packets arrive at a sufficient average rate, the jitter buffer will always have enough packets to play 100ms of audio before running out of packets. Thus, the variable delay (i.e., jitter) in the network can be up to 100ms without noticeable voice quality degradation.
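The playout and discard behavior in this example can be simulated in a few lines. This is a deliberately simplified model (fixed 20ms frames, no out-of-order reassembly, no packet loss concealment) just to show how the 100ms hold-back sets each packet's playout deadline.

```python
def late_packets(arrivals, frame_ms=20, buffer_ms=100):
    """Simulate a static jitter buffer.

    arrivals[i] is the arrival time (ms) of packet i, where packets are sent
    every frame_ms. The first packet is held for buffer_ms before playout
    starts; packet i must arrive by its playout time (start + i * frame_ms)
    or it is discarded. Returns the indices of discarded packets.
    """
    playout_start = arrivals[0] + buffer_ms
    return [i for i, t in enumerate(arrivals)
            if t > playout_start + i * frame_ms]

# Packet 3 arrives 190ms after packet 0 but its playout deadline is 160ms
# (100ms hold-back + 3 * 20ms), so it is discarded:
# late_packets([0, 20, 40, 190, 80]) -> [3]
```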

As another example, if we have a network with a low average delay of 20ms, average jitter of 8ms, and an occasional maximum jitter of 60ms, the size of the jitter buffer should be at least 60ms (or perhaps slightly more) to compensate for the network jitter, and the overall mouth-to-ear delay would be 80ms [Giralt, Kularatna].
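The delay-budget arithmetic in this example is worth making explicit. A minimal sketch, using the assumed figures from the example above:

```python
avg_delay_ms = 20   # average one-way network delay (assumed, from the example)
max_jitter_ms = 60  # occasional maximum jitter (assumed, from the example)

# Size the jitter buffer to absorb the maximum observed jitter.
jitter_buffer_ms = max_jitter_ms

# The buffer adds directly to the mouth-to-ear delay.
mouth_to_ear_ms = avg_delay_ms + jitter_buffer_ms
print(mouth_to_ear_ms)  # 80, well within the 150ms recommendation
```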

As a final example, Linksys devices use a minimum adaptive jitter buffer size of 30ms (or 10ms + current RTP frame size, whichever is larger) [Cisco-3].

A rule of thumb is that if the jitter level is over 100ms, increasing the size of the jitter buffer to avoid packet discards may introduce too large a delay and cause conversational problems (consider the above-mentioned recommended maximum mouth-to-ear latency of 150ms) [VTS-1]. According to [VTS-2], a typical jitter buffer configuration is 30-50ms in size. In the case of an adaptive jitter buffer, the maximum size may be set to 100-200ms.

Budgeting for jitter accurately (i.e., choosing an appropriate size for the jitter buffer) is difficult due to jitter’s dependency on the traffic mix, traffic burstiness, link utilization, and the nonadditive property of jitter (which means that jitter across a network path does not equal the sum of jitters across consecutive parts of the path) [Joseph]. The nonadditive property of jitter should not be confused with the impact of several jitter buffers between the sender and the receiver on the mouth-to-ear latency – if there are, for instance, two cascaded mixers [RFC 4353] each having a static jitter buffer of 50ms, the total latency introduced by the jitter buffers is 100ms.
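To spell out the distinction: buffering delay is additive along the path even though jitter itself is not. A trivial sketch with the two cascaded mixers from the example:

```python
# Each cascaded mixer holds the media in its own static jitter buffer,
# so the delays simply add up (unlike jitter, which is nonadditive).
cascaded_buffers_ms = [50, 50]  # two mixers, 50ms static buffer each
total_buffering_ms = sum(cascaded_buffers_ms)
print(total_buffering_ms)  # 100
```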

The discussion so far has covered static jitter buffers. Many systems instead use adaptive jitter buffers that dynamically tune the size of the jitter buffer to the lowest acceptable value [Cisco-1] by continuously estimating the network delay and adjusting the playout delay at the beginning of each talkspurt [SS]. Adaptive jitter buffers [Cisco-1]:

  • Increase the size of the buffer to the current measured jitter value following a buffer overflow
  • Slowly decrease the buffer size when the measured jitter is less than the current buffer size
  • Use Packet Loss Concealment (PLC) to conceal the loss of a packet on a buffer underflow. PLC is a technique used to mask the effects of lost or discarded VoIP packets. One simple method is to replay the latest received sample with increasing attenuation at each repeat. This can conceal the loss of up to 20ms of samples. More sophisticated PLC techniques can conceal up to 30-40ms of loss with tolerable quality.
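The three rules above can be sketched as follows. This is an illustrative model, not any vendor's actual algorithm; the shrink step, size limits, and attenuation factor are assumptions chosen for clarity.

```python
class AdaptiveJitterBuffer:
    """Toy model of the adaptive rules listed above."""

    SHRINK_STEP_MS = 5  # assumed "slow decrease" step per measurement

    def __init__(self, size_ms=50, min_ms=30, max_ms=200):
        self.size_ms, self.min_ms, self.max_ms = size_ms, min_ms, max_ms

    def on_overflow(self, measured_jitter_ms):
        # Rule 1: grow to the current measured jitter after an overflow.
        self.size_ms = min(self.max_ms, max(self.size_ms, measured_jitter_ms))

    def on_measurement(self, measured_jitter_ms):
        # Rule 2: slowly shrink while measured jitter stays below buffer size.
        if measured_jitter_ms < self.size_ms:
            self.size_ms = max(self.min_ms, self.size_ms - self.SHRINK_STEP_MS)

    def on_underflow(self, last_samples):
        # Rule 3: PLC - replay the latest received samples, attenuated.
        return [s * 0.5 for s in last_samples]

buf = AdaptiveJitterBuffer()
buf.on_overflow(measured_jitter_ms=120)
print(buf.size_ms)  # 120: grown to the measured jitter
buf.on_measurement(measured_jitter_ms=40)
print(buf.size_ms)  # 115: shrunk by one step
```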

An adaptive jitter buffer performs the playout adjustment during the silent periods between talkspurts [SS]. The adjustment is done on the first packet of the talkspurt. All packets in the same spurt are scheduled to play out at fixed intervals following the playout of the first packet.

The level at which jitter becomes noticeable depends on the media type. As an example, tolerable video jitter is larger than tolerable audio jitter [Jeffay]. This means that in video conferencing, the buffering delay for video is determined by the size of the audio jitter buffer (in video conferencing, the audio and video need to be synchronized to achieve lip sync). In the audio-video synchronization process, adaptive playout algorithms are performed first, and the video frames are played out on the playout times of their corresponding audio packets (the correspondence is determined by the timestamps of the video and audio packets) [SS]. To enable this, the video frames are stored in a video playout buffer and each frame is delayed until the corresponding audio packets are played out.
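The audio-driven synchronization described above can be sketched as a timestamp lookup: once the adaptive playout algorithm has fixed the playout times of the audio packets, each buffered video frame is released at the playout time of the audio packet carrying the matching timestamp. The timestamps and playout times below are illustrative assumptions.

```python
def video_playout_times(video_ts, audio_playout_by_ts):
    """Map each video frame to the playout time of its matching audio packet."""
    return [audio_playout_by_ts[ts] for ts in video_ts]

# Assumed result of the audio adaptive playout algorithm:
# media timestamp (ms) -> wall-clock playout time (ms).
audio_playout = {0: 100, 40: 140, 80: 180}

# Video frames with the same timestamps are held in the video playout
# buffer and released at the corresponding audio playout times.
print(video_playout_times([0, 40, 80], audio_playout))  # [100, 140, 180]
```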

References


[Cisco-1] Quality of Service Design Overview, http://www.ciscopress.com/articles/article.asp?p=357102

[Cisco-2] Understanding Jitter in Packet Voice Networks (Cisco IOS Platforms), http://www.cisco.com/c/en/us/support/docs/voice/voice-quality/18902-jitter-packet-voice.html

[Cisco-3] What is the jitter buffer value in ms of Linksys devices? https://supportforums.cisco.com/discussion/11128491/what-jitter-buffer-value-ms-linksys-devices

[EOC] The Components of Motion-to-Photon Latency in Virtual Reality Systems, http://edge-of-cloud.blogspot.fi/2016/11/the-components-of-motion-to-photon.html

[Giralt] Giralt, Hallmark and Smith: Troubleshooting Cisco IP Telephony

[Jeffay] Jeffay and Zhang: Readings in Multimedia Computing and Networking

[Joseph] V. Joseph and B. Chapman: Deploying QoS for Cisco IP and Next Generation Networks: The Definitive Guide

[Kularatna] N. Kularatna: Essentials of Modern Telecommunications Systems

[RFC 4353] A Framework for Conferencing with the Session Initiation Protocol (SIP), https://tools.ietf.org/html/rfc4353

[SS] Adaptive Playout Buffering for Audio/Video Transmission over the Internet, https://pdfs.semanticscholar.org/acc0/c0b01e6a49c619c550fee77dea7f1778c518.pdf

[TT] Jitter, http://searchunifiedcommunications.techtarget.com/definition/jitter

[Vocal] Jitter Buffer for Voice over IP, https://www.vocal.com/voip/jitter-buffer-for-voice-over-ip/

[VTS-1] Problem: Jitter, http://www.voiptroubleshooter.com/problems/jitter.html

[VTS-2] Problem: Jitter buffer, http://www.voiptroubleshooter.com/problems/jitterbuffer.html