The graphs below show the result of some MariaDB benchmarks. These are the questions I wanted to answer:
- What is the performance penalty (if any) for running in a VM or LXC container instead of on bare metal?
- How does ZFS perform compared to a standard ext4 filesystem?
- With ZFS, does having a SLOG help (given that the pool is all SSDs)?
- Does an NVMe SLOG perform better than a SATA SLOG?
I ran the tests using sysbench. There are 10 different tests that perform a variety of SQL operations, and I ran each of them with a table size of 1 million rows. The exact same MariaDB config file was used in every run (copied from a server tuned to host hundreds of WordPress databases). There were 10 different hardware/OS configurations (A-J):
- Config A: Debian on bare metal, ext4 filesystem
- Config B: Debian VM, ext4 on ZFS zvol, no SLOG
- Config C: Debian VM, ext4 on ZFS zvol, SATA SLOG
- Config D: Debian VM, ext4 on ZFS zvol, NVMe SLOG
- Config E: Debian LXC container, ZFS filesystem, no SLOG
- Config F: Debian LXC container, ZFS filesystem, SATA SLOG
- Config G: Debian LXC container, ZFS filesystem, NVMe SLOG
- Config H: Debian LXC, ZFS, no SLOG, LSI HBA
- Config I: Debian LXC, ZFS, SATA SLOG, LSI HBA
- Config J: Debian LXC, ZFS, NVMe SLOG, LSI HBA
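To make the sysbench runs described above more concrete, here is a minimal sketch of how the command lines could be assembled for each configuration. It is not the exact invocation used for these benchmarks: the database name and user (`sbtest`) are assumptions, and only the two test names mentioned in this post are listed, since the other eight are not named here.

```python
import shlex

# Only these two test names appear in this post; the remaining eight are not listed.
TESTS = ["oltp_read_only", "select_random_points"]


def sysbench_cmd(test, phase, table_size=1_000_000, db="sbtest", user="sbtest"):
    """Build a sysbench argv list for one phase: prepare, run, or cleanup.

    db and user are hypothetical placeholders, not values from the benchmark.
    """
    if phase not in ("prepare", "run", "cleanup"):
        raise ValueError(f"unknown phase: {phase}")
    return [
        "sysbench", test,
        "--db-driver=mysql",
        f"--mysql-db={db}",
        f"--mysql-user={user}",
        f"--table-size={table_size}",  # 1 million rows, as in the tests above
        phase,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for test in TESTS:
        for phase in ("prepare", "run", "cleanup"):
            print(shlex.join(sysbench_cmd(test, phase)))
```

Running the same generated commands on every configuration keeps the workload identical, so that only the hardware/OS setup varies between runs.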
For each test, there are 2 results of interest: queries per second (QPS) and average latency. The 10 tests are split across 2 charts (5 tests per chart) for each of QPS and latency, for a total of 4 charts.
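As a rough sketch of how those two numbers can be pulled out of a sysbench report, the snippet below parses the "queries ... per sec." line and the "avg:" latency line. The sample text is invented purely to illustrate the report layout as I understand it from sysbench 1.0; it is not a measurement from these benchmarks.

```python
import re

# Invented sample in the style of a sysbench 1.0 report (NOT real benchmark data).
SAMPLE = """\
SQL statistics:
    transactions:                        1000   (100.00 per sec.)
    queries:                             16000  (1600.00 per sec.)

Latency (ms):
         min:                                    1.00
         avg:                                    9.99
         max:                                   50.00
"""


def parse_sysbench(text):
    """Return (queries_per_second, average_latency_ms) from a sysbench report."""
    qps = re.search(r"queries:\s+\d+\s+\(([\d.]+) per sec\.\)", text)
    avg = re.search(r"^\s*avg:\s+([\d.]+)", text, re.MULTILINE)
    if not qps or not avg:
        raise ValueError("could not find QPS or average latency in output")
    return float(qps.group(1)), float(avg.group(1))


if __name__ == "__main__":
    print(parse_sysbench(SAMPLE))
```

Collecting one such pair per test and per configuration gives exactly the data needed for the QPS and latency charts.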
The same physical machine was used for all the tests: an HP Gen8 server with two 8-core E5 CPUs and 128GB of RAM. The OS and mysql drives are Samsung 860 500GB. The SATA SLOG drives are 64GB partitions of a different pair of Samsung 860 500GB drives. The NVMe SLOG drives are 64GB partitions of a pair of Samsung 970 250GB drives.
For the bare-metal test, mysqld had access to all 16 cores and 128GB of RAM, and the boot/mysql disks were in a RAID-1 config using the HP 420i RAID controller. The VM and LXC tests were run on the same Proxmox install. The guest OS was assigned 12 cores and 96GB of RAM, and the boot/mysql disks were in a ZFS mirror, with the RAID card passing through disk devices as JBODs.
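For readers unfamiliar with SLOG devices, the sketch below shows how a log device can be attached to an existing pool with `zpool add <pool> log ...`. The pool name and partition paths are hypothetical, and mirroring the two log partitions is my assumption based on the drives being described as pairs; the post does not state how the SLOG was actually laid out.

```python
import shlex


def add_slog_cmd(pool, devices, mirror=True):
    """Build the zpool argv that attaches log (SLOG) devices to a pool.

    With two or more devices and mirror=True, the log vdev is mirrored.
    """
    if not devices:
        raise ValueError("need at least one log device")
    prefix = ["mirror"] if mirror and len(devices) > 1 else []
    return ["zpool", "add", pool, "log", *prefix, *devices]


if __name__ == "__main__":
    # Hypothetical pool name and partition paths, for illustration only.
    cmd = add_slog_cmd("tank", ["/dev/nvme0n1p1", "/dev/nvme1n1p1"])
    print(shlex.join(cmd))
```

Printing the command rather than executing it is deliberate: changes to a pool layout are worth double-checking by hand before they are run.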
Conclusions are below, after the charts.
Looking at QPS, configs A and G are generally the best: A is bare metal and G is an LXC container with the NVMe SLOG. Third place is usually config D, a full VM using the NVMe SLOG. My take is that running a full VM with ext4 on top of a ZFS zvol adds a lot of overhead, enough to cut performance in half.
In some of the tests, the 3 LXC container configs all performed about the same. But in others, the NVMe SLOG makes a huge difference. My assumption is that the latter tests are the ones that require many synchronous writes.
The latency results are a little less clear. There are some tests where the bare-metal config performed worse (oltp_read_only, select_random_points), and I don’t have any theories on why. Overall, the LXC container configs performed better than, or only a little worse than, the others.
The bottom line is that the bare-metal config and the LXC/NVMe config perform the best, and we should use one of them.
The main advantage of the bare-metal config is simplicity. There are fewer moving parts, and we don’t need the extra (non-hot-swappable!) NVMe drives.
The advantage of the LXC/NVMe solution is flexibility. Using Proxmox, we can easily roll back a failed OS or MariaDB update, which is very valuable. We just have to decide if it’s worth it to have the non-hot-swappable SLOG drives: if one fails, replacing it will require a service outage.
I’m leaning toward the LXC/NVMe config.