Please refer to nfv-kvm-test for details.
The right configuration is critical for improving NFV performance and latency. Even on the same codebase, different configurations can produce completely different performance and latency results.
There are many combinations of configurations, from hardware configuration to operating system configuration and application-level configuration, and no single configuration works for every case. To tune a specific scenario, it is important to know how the different configurations behave and what impact they have.
Some hardware features can be configured through a firmware interface (such as the BIOS), but others may not be configurable (e.g. SMI on most platforms).
A softirq is raised even when the active timer queue is empty, which causes many context switches. In our case there is no timer user at present, so avoiding this unnecessary softirq helps cut down the latency. See code change.
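To see how often the timer softirq fires on each CPU, the TIMER row of `/proc/softirqs` can be sampled before and after a test run. The following is a minimal sketch, not part of the referenced code change; the function names `timer_softirq_counts` and `delta` are made up for illustration.

```python
def timer_softirq_counts(text):
    """Return {cpu_name: count} for the TIMER row of /proc/softirqs content."""
    lines = text.splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        if name.strip() == "TIMER":
            counts = [int(value) for value in rest.split()]
            return dict(zip(cpus, counts))
    raise ValueError("no TIMER row found")


def delta(before, after):
    """Per-CPU increase in TIMER softirqs between two snapshots."""
    return {cpu: after[cpu] - before[cpu] for cpu in before}
```

On a Linux host, the text can be taken from `open("/proc/softirqs").read()` at two points in time. A CPU whose TIMER count keeps growing while no timers are in use is a candidate for the situation described above.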
Threaded IRQs can help reduce interrupt latency because they avoid keeping interrupts blocked for too long in the interrupt handler. However, if the interrupt handler itself takes little time, as with vfio, where the only work is injecting the interrupt into the guest and that can be very fast, a threaded IRQ only adds the cost of a context switch between the interrupt handler and the IRQ thread. In addition, in the NFV scenario such realtime interrupts (like DPDK interrupts) have almost the highest priority, so making them non-threaded benefits the highest-priority application. See code change.
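One way to check which interrupts are currently threaded is to look for their kernel threads, which Linux names `irq/<number>-<name>`. The sketch below only parses thread names; the helper name `threaded_irqs` is hypothetical, and how the names are collected (for example from `/proc/<pid>/comm`) is left to the caller.

```python
import re

# Kernel IRQ threads are named "irq/<irq number>-<handler name>".
IRQ_THREAD = re.compile(r"^irq/(\d+)-(.+)$")


def threaded_irqs(comms):
    """Map IRQ number -> handler name for every IRQ thread name in comms."""
    found = {}
    for comm in comms:
        match = IRQ_THREAD.match(comm.strip())
        if match:
            found[int(match.group(1))] = match.group(2)
    return found
```

If the IRQ number of a vfio or DPDK interrupt shows up in the result, that interrupt is still handled in a thread and pays the context-switch cost described above.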
Last level cache (LLC) contention is a key source of resource contention for memory-intensive workloads running on the same socket. Intel CAT (Cache Allocation Technology) can be used to partition the LLC among realtime and non-realtime applications and VMs.
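CAT partitions are expressed as capacity bitmasks, where each bit stands for a portion of the cache and the set bits must be contiguous. The sketch below splits the LLC into a realtime and a non-realtime mask and formats them in the style of the Linux resctrl `schemata` file. The 20-bit mask width and the 4-bit realtime share used in the usage note are assumptions for illustration, and the helper names are hypothetical; real hardware may also impose a minimum mask width.

```python
def contiguous_mask(num_ways, offset):
    """Bitmask with num_ways consecutive bits set, starting at bit offset."""
    if num_ways < 1:
        raise ValueError("a CAT mask needs at least one bit set")
    return ((1 << num_ways) - 1) << offset


def split_llc(total_ways, realtime_ways):
    """Split the LLC into non-overlapping (realtime, other) bitmasks."""
    if not 0 < realtime_ways < total_ways:
        raise ValueError("both partitions need at least one bit")
    other_ways = total_ways - realtime_ways
    other = contiguous_mask(other_ways, 0)
    realtime = contiguous_mask(realtime_ways, other_ways)
    return realtime, other


def schemata_line(cache_id, mask):
    """Format one L3 entry in the style of a resctrl schemata line."""
    return f"L3:{cache_id}={mask:x}"
```

For example, `split_llc(20, 4)` gives the realtime group the top four bits and everything else the remaining sixteen, so the two groups no longer evict each other's cache lines. Assuming resctrl is mounted, each formatted line would be written to the `schemata` file of the corresponding resource group.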