Ensuring Performance for Real-time Media Packet Processing in OpenStack – Part 2
As I pointed out in my last blog, there are two aspects to address in order to deliver high-performance, real-time media packet processing in a virtualized, cloud deployment. This blog addresses the second aspect: ensuring deterministic behavior of real-time media traffic with no packet loss.
One of the benefits of a virtualized environment is the sharing of physical resources across all VMs. Yet this also becomes a drawback for compute-intensive VNFs, like those associated with high-performance, real-time media packet handling, because non-deterministic behavior can lead to packet loss or high latency, both of which degrade media quality. For compute-intensive VNFs, it is desirable to be platform “aware” in order to leverage the underlying hardware acceleration.
Based on our analysis and testing, we have concluded that a set of recommended OpenStack configurations positively contributes to ensuring that real-time media packets are handled in a deterministic manner. They are as follows:
Overcommit settings. Override OpenStack’s default overcommit settings, and ensure a 1:1 ratio for CPU and RAM.
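As a sketch of this recommendation (the file path and `crudini` tool are assumptions; any nova.conf editor works), the overcommit ratios can be forced to 1:1 on each compute node, overriding Nova's defaults of 16:1 for CPU and 1.5:1 for RAM:

```shell
# Assumed setup: set 1:1 CPU and RAM allocation ratios in nova.conf
# on each compute node (Nova's defaults overcommit CPU 16:1, RAM 1.5:1).
crudini --set /etc/nova/nova.conf DEFAULT cpu_allocation_ratio 1.0
crudini --set /etc/nova/nova.conf DEFAULT ram_allocation_ratio 1.0
# Restart the compute service so the new ratios take effect.
systemctl restart openstack-nova-compute
```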
NUMA placement. Although each NUMA node has a dedicated bus to the memory that is local to it, it is also possible to access remote RAM across a bus shared by all nodes. Accessing remote RAM over that shared bus, or using a NIC that is not local to the guest's NUMA node, can cause problems such as unintended cache synchronization among the NUMA nodes, I/O performance degradation, or wasted capacity on the shared inter-node bus. To avoid these drawbacks, place guests so that they are entirely confined within a single non-uniform memory access (NUMA) node, and ensure instances are spawned with I/O device locality awareness.
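A minimal sketch of the NUMA confinement, using Nova's flavor extra specs (the flavor name `media-vnf` is a hypothetical placeholder):

```shell
# Assumed flavor name. hw:numa_nodes=1 asks Nova to confine the guest's
# CPUs and RAM to a single host NUMA node.
openstack flavor set media-vnf --property hw:numa_nodes=1
```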
vCPU topology. By default, OpenStack allocates the virtual CPUs in a guest as standalone processors. This is not necessarily the best fit for every application; for our specific application, hypervisor overhead can be reduced if the host topology is also configured in the guest.
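As an illustration (the flavor name and the 1-socket, 8-core, 2-thread topology are assumptions; mirror your actual host), the guest topology can be expressed via flavor extra specs:

```shell
# Sketch: expose a host-like topology of 1 socket x 8 cores x 2 threads
# to the guest, instead of the default one-socket-per-vCPU layout.
openstack flavor set media-vnf \
  --property hw:cpu_sockets=1 \
  --property hw:cpu_cores=8 \
  --property hw:cpu_threads=2
```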
CPU pinning. From the hypervisor's perspective, a virtual machine appears as a single process that must be scheduled on the available CPUs. While the NUMA configuration above means that memory access from the processor will tend to be local, the hypervisor is still free to schedule the VM's next time slice on a different processor. Unfortunately, this reduces memory locality and increases the likelihood of cache misses, and a guest VM might be starved of scheduling even when it is ready to run.
Therefore, our recommendation is to pin the guest vCPUs to host CPUs that are dedicated to that guest, so other VMs on the same host cannot use them. Compute hosts should run either only pinned VMs or only unpinned VMs, never a mix of the two. And if hyper-threading is enabled on the host, retain the default thread policy.
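A sketch of the pinning recommendation via flavor extra specs (flavor and aggregate names are hypothetical). Retaining the default thread policy simply means not setting `hw:cpu_thread_policy`:

```shell
# Dedicate host pCPUs to this flavor's vCPUs (1:1 pinning).
openstack flavor set media-vnf --property hw:cpu_policy=dedicated
# One way to keep pinned and unpinned VMs on separate hosts is a
# host aggregate reserved for pinned workloads.
openstack aggregate create pinned-hosts
openstack aggregate set --property pinned=true pinned-hosts
```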
Specifying the use of large pages. Use the Linux kernel feature "transparent huge pages" (THP), and explicitly allocate huge pages to application RAM allocations when it is practical to do so. Allocating huge pages up front removes the dependency on runtime page allocation under memory pressure, reduces hypervisor overhead, and guarantees VMs the RAM allocation that boosts their performance.
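A sketch under assumed values (1 GB pages, 64 of them, a hypothetical `media-vnf` flavor): reserve huge pages on the host at boot, then ask Nova to back the guest's RAM with them:

```shell
# On the compute host, reserve huge pages via the kernel command line,
# e.g. in /etc/default/grub (values are illustrative):
#   GRUB_CMDLINE_LINUX="... default_hugepagesz=1G hugepagesz=1G hugepages=64"
# Then have Nova back the flavor's RAM with huge pages.
openstack flavor set media-vnf --property hw:mem_page_size=large
```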
Isolation of CPUs for host processes. To keep the pinned CPUs dedicated to the guest vCPUs, with no interference from host-level processes, it is best to isolate the CPUs that will be used by the VMs.
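As an example (the core range 2-23 is an assumption; adjust to your host topology), the same core set is isolated from the host scheduler and handed to Nova for guest placement:

```shell
# Keep host processes off the cores reserved for guests, via the kernel
# command line on the compute host (illustrative range):
#   isolcpus=2-23
# Then restrict Nova to pinning guest vCPUs onto those same cores.
crudini --set /etc/nova/nova.conf DEFAULT vcpu_pin_set 2-23
systemctl restart openstack-nova-compute
```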
Turn-off / non-usage of settings. There are several OpenStack settings that need to be turned off or otherwise not used:
- Do NOT use swap memory on the compute host; this provides better, low-latency memory access.
- When using the large page option, a RAM over-commit ratio no longer applies, because there must be a 1:1 mapping between guest RAM and host RAM, and the host OS will not consider any large pages allocated to the guest for swapping. In conjunction, we believe it makes sense to turn off over-commit for vCPUs too.
- Kernel same-page merging (KSM) can provide significant improvements in RAM utilization when many identical guest OSes run on the same host, or when guests otherwise have identical memory page contents. However, KSM costs additional CPU usage, and this overhead may cause applications to run more slowly, so we recommend turning off KSM.
- CPU model setting for the VM. A Nova configuration option allows the CPU model visible in the guest to be chosen from several alternatives. For our compute-intensive application, we recommend configuring the host BIOS power setting for maximum performance and setting the CPU mode to “host-passthrough” or “host-model”, so that all of the host's CPU flags are exposed to the guest.
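The turn-off items above can be sketched as host-side commands (paths are the standard Linux sysfs/Nova locations; this assumes a libvirt-based compute node):

```shell
# 1) Disable swap on the compute host for low-latency memory access.
swapoff -a
# 2) Disable kernel same-page merging (KSM).
echo 0 > /sys/kernel/mm/ksm/run
# 3) Expose the host CPU model/flags to guests via Nova's libvirt driver.
crudini --set /etc/nova/nova.conf libvirt cpu_mode host-passthrough
systemctl restart openstack-nova-compute
```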
As we continue testing in our labs and analyzing the results of these recommendations in real-world POCs, we look forward to sharing more information with the OpenStack community and to providing innovative solutions that ensure performance at scale for media packet handling in our customers' virtual, cloud networks.