Benchmarking Xen Virtualization

[Headline image]

Introduction to Xen Virtualization Types (PV, PVHVM, PVH)

Xen is an open-source bare-metal hypervisor that is widely used by commercial and non-commercial platforms to provide virtualization support. However, unlike most other hypervisors, Xen supports multiple ways of virtualizing guests. Below is a brief history of the development of these modes and their relationships with one another:

[Image: the history and relationships of the Xen virtualization modes. Image from the Xen Wiki.]

Naturally, there are performance implications when selecting a virtualization mode. For example, newer modes like PVH are better able to take advantage of newer hardware virtualization features. The following graphic shows a breakdown of the different system components that are emulated by Xen, along with whether this occurs in hardware or software:

[Image: the system components emulated in hardware vs. software for each mode. Image from the Xen Wiki.]

So we can see that PV mode does not make use of many hardware-accelerated virtualization features, while PVH can use features like VT-d Posted-Interrupts and VT-x (on Intel platforms) to improve performance. Other modes like PVHVM sit somewhere in the middle, using some hardware acceleration but relying on less-than-ideal software emulation for other components.

Note: PV can use some hardware features like EPT to improve paging performance (called HAP in Xen).

The goal of this research was to investigate how the different virtualization types affected performance under a number of benchmarks. Naturally, the expectation was that PVH would be the most performant; however, that mode requires a relatively recent Linux kernel version and does not support some Xen features (like PCI passthrough) that certain users may need. It seemed useful, therefore, to measure the actual difference in performance.

Historic Benchmarks

Of course, Xen is a widely used technology, and there have been previous efforts to run benchmarks like these. However, many of them are quite dated and do not cover more recent modes like PVH. We ultimately decided to model most of our tests on the work Rackspace did in 2013:

https://developer.rackspace.com/blog/welcome-to-performance-cloud-servers-have-some-benchmarks/

Their tests largely compared PV against HVM across different CPU configurations (i.e., under- and oversubscribed). The results were in line with what we would expect: (PV)HVM makes more use of hardware acceleration than PV, and so it showed generally higher performance across the various benchmarks Rackspace ran.

Test Setup

Our benchmarks are similar to Rackspace’s, but use more recent versions of Xen and the Linux kernel. This gives us the ability to test newer virtualization modes like PVH. The test environment is as follows:

  • Test platform: Supermicro 5018D-FN4T

    • Intel Xeon D-1541

    • 16GB RAM

  • Linux Distribution: Debian Buster (kernel version 4.19) for both dom0 and the guests

  • Xen version: 4.11 installed from the standard Debian packages

  • Other Xen configuration:

    • Cores are pinned and the null scheduler is used, for consistency of test results (see the example boot parameters after this list)

    • Cores 0 and 1 are reserved for dom0

    • 1GB of RAM is reserved for dom0, no ballooning

    • The guest disk is an SSD passed through as a raw device

  • All tests are performed 5 times to obtain an average and standard deviation
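
For reference, the dom0 and scheduler settings above roughly correspond to Xen boot parameters along the following lines (a sketch based on the description above, not a verbatim copy of our GRUB entry):

    GRUB_CMDLINE_XEN_DEFAULT="dom0_max_vcpus=2 dom0_vcpus_pin dom0_mem=1024M,max:1024M sched=null"

together with autoballoon="0" in /etc/xen/xl.conf to keep dom0's memory allocation fixed.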

Creating the guests

Guests were created using a standard xen-create-image command like:

xen-create-image --dist=buster --dir=/etc/xen \
--hostname=testvm --dhcp --password=password --noswap \
--memory=14G

This same guest configuration was used for the PV, PVHVM, and PVH tests, but the ‘type’ option was changed manually for each test to select the appropriate virtualization mode.
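
As a rough sketch, the resulting guest configuration looked something like the following, with only the type line changed between runs (the file path, vCPU count, and disk device name here are illustrative assumptions, not values copied from our config):

    # /etc/xen/testvm.cfg (hypothetical path)
    type   = "pvh"         # "pv", "pvh", or "hvm" (PVHVM is an HVM guest using PV drivers)
    name   = "testvm"
    memory = 14336         # 14G, matching the xen-create-image invocation
    vcpus  = 14            # assumed count, leaving cores 0 and 1 for dom0
    disk   = [ 'phy:/dev/sdb,xvda,w' ]   # raw SSD passthrough; the device name is an assumption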

Benchmarks

We used essentially the same benchmarks as the Rackspace test. Each benchmark was run 5 times. The average of the 5 iterations is reported in the results section.
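
As an illustration, each benchmark can be repeated with a simple loop along these lines (a sketch; run_benchmark.sh is a hypothetical stand-in for the command being measured, and the per-run logs are what feed the mean and standard deviation):

    for i in 1 2 3 4 5; do
        ./run_benchmark.sh > run_${i}.log   # hypothetical wrapper around the benchmark under test
    done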

  • To test disk I/O, fio was used as follows:

    fio --name fio_test_file --direct=1 --rw=randwrite \
    --bs=16k --size=1G --numjobs=16 --time_based --runtime=180 --group_reporting
    
  • Most general system performance was measured using UnixBench, which has several benchmarks that test the performance of syscalls, floating-point operations, pipe throughput, and more. It can be set up by cloning the UnixBench repository and running make && ./Run from the UnixBench directory (see the setup commands after this list).

  • Linux kernel (4.19) compilation time was also benchmarked to test combined CPU and I/O performance. This compilation was executed with: make -j$(cat /proc/cpuinfo | grep processor | wc -l)
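
For reference, the UnixBench setup amounts to roughly the following (the repository URL is the commonly used byte-unixbench GitHub mirror, an assumption on our part rather than a record of the exact source used):

    git clone https://github.com/kdlucas/byte-unixbench.git
    cd byte-unixbench/UnixBench
    make && ./Run    # runs the full default benchmark suite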

Test Results

Disk I/O

[Chart: disk I/O (fio) results]

Unfortunately, these first results don’t provide much insight. It seems very unlikely that native performance is genuinely lower than the virtualized cases; more likely, we are completely saturating the disk with every virtualization method.

Unixbench

[Chart: UnixBench scores]

Here we see some other strange results. The first few benchmarks are as expected, with the virtualization methods all producing almost identical scores. This is reasonable because ‘Whetstone’ and ‘Dhrystone’ measure numerical performance, which should be unaffected by the virtualization method used. In the later benchmarks, though, we see that PVHVM gets slightly better scores than PVH almost across the board. This is the opposite of what we would expect.

There are a few possible explanations for these results. One candidate is where the emulation happens when using PVH versus PVHVM. In the latter mode, the device model (QEMU in this case) executes in dom0, while the former handles emulation in the hypervisor itself. Because we are using the ‘null’ scheduler, dom0 always has cores available to execute on, so in PVHVM mode the hypervisor has relatively little work to do. PVH mode, on the other hand, requires more time spent in the hypervisor (and therefore less in the guests), which could account for its comparatively worse performance.

Kernel Compilation

[Chart: kernel compilation times]

The kernel compilation results are in line with the previous ones: PV performs significantly worse than PVH and PVHVM. This is a more realistic workload, however, and it shows the significant benefit of moving from PV to PVHVM (or PVH).

Conclusions

This has been a quick look at the performance of the various Xen virtualization modes. Of course, these results may not apply to your specific workload. Xen makes it very easy to switch between modes, so it is typically best to test each mode and use the one that performs best for you.

Headline image courtesy of the Xen Project.