Vertica on EC2 performance optimizations

Disabling hyperthreading, upgrading the kernel, and setting the noop scheduler for SSDs on EC2 instances running CentOS/RHEL 7.

Performance

Here are some basic things you can do to improve EC2 performance, based on performance-tuning experiments done for Vertica on AWS.

Upgrade kernel on CentOS or RHEL 7

You can put this in your Packer shell scripts somewhere. Run it as root:

# add the EPEL and ELRepo repositories (ELRepo provides mainline kernel builds)
yum -y install epel-release
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm

# install the mainline kernel (kernel-ml) and make it the default boot entry
yum -y --enablerepo=elrepo-kernel install kernel-ml
grub2-set-default 0

This installs the mainline Linux kernel (4.14 as of this writing), which brings some performance benefits.
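
After a reboot you can confirm that the new kernel is running (the exact version string will differ on your instance):

$ uname -r
4.14.0-1.el7.elrepo.x86_64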

Disable hyperthreading

Here’s an article from AWS on how to disable hyperthreading on EC2.

The motivation comes from this Localytics article, which recommends disabling HT for Vertica.

The code is as follows:

#!/usr/bin/sh

# For each physical core, thread_siblings_list holds a comma-separated list of
# its logical CPUs; fields 2 onward are the hyperthread siblings, which we
# take offline.
for cpunum in $(cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | cut -s -d, -f2- | tr ',' '\n' | sort -un)
do
    echo 0 > /sys/devices/system/cpu/cpu$cpunum/online
done
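
To see what the pipeline feeds the loop, you can inspect a CPU’s sibling list directly. On a hypothetical instance where cpu0 and cpu16 are hyperthread siblings, the second field (16) is what gets offlined:

$ cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0,16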

Now if you want this to happen at every boot, you can put it in cloud-init’s per-boot directory.

Here’s an excerpt from the EC2 user data that creates the per-boot script disable_ht.sh and executes it:

echo '#!/usr/bin/env bash

for cpunum in $(cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | cut -s -d, -f2- | tr ',' '\n' | sort -un)
do
    echo 0 > /sys/devices/system/cpu/cpu$cpunum/online
done
' > /var/lib/cloud/scripts/per-boot/disable_ht.sh

chmod u+x /var/lib/cloud/scripts/per-boot/disable_ht.sh
/var/lib/cloud/scripts/per-boot/disable_ht.sh
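
After the next boot, the kernel’s online/offline CPU lists are an easy way to confirm the script ran (the ranges below assume a hypothetical 32-vCPU instance whose siblings 16-31 were offlined):

$ cat /sys/devices/system/cpu/online
0-15
$ cat /sys/devices/system/cpu/offline
16-31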

Noop scheduler

The Vertica installer was complaining about my disks not being set to the noop scheduler. From the Vertica docs:

Vertica requires that I/O Scheduling be set to deadline or noop.

Here’s the error from the installer:

HINT (S0151): https://my.vertica.com/docs/7.2.x/HTML/index.htm#cshid=S0151
These disks do not have known IO schedulers: '/dev/md0' ('md0') = 'none'

I set out to check the scheduler of my attached EBS SSD volumes (a mix of gp2 and io1):

cat /sys/block/<device>/queue/scheduler
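
To check every block device at once, a small loop does the trick (a minimal sketch; device names vary by instance type):

for f in /sys/block/*/queue/scheduler; do
    printf '%s: %s\n' "$f" "$(cat "$f")"
done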

According to most documentation and resources you’ll find online (e.g. on Stack Overflow or Server Fault), if you check the scheduler of your SSDs, you should see the following options:

$ cat /sys/block/xvda/queue/scheduler
[noop] deadline cfq

Here’s what I actually see on an EC2 instance with SSDs and a 4.14 kernel:

$ cat /sys/block/xvda/queue/scheduler
[none]

Now, here’s some reading describing how none used to be an alias for noop, but that this is no longer the case on modern (3.13+) kernels. The classic way to set the scheduler at runtime is:

echo noop > /sys/block/<device>/queue/scheduler

I figured, full steam ahead, why not try it anyway, right?

$ echo noop > /sys/block/xvda/queue/scheduler
$ cat /sys/block/xvda/queue/scheduler
[none]

Didn’t help.

elevator=noop

This is a boot-level parameter for GRUB which sets noop to be the default scheduler for all disks:

sed -i 's/GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200 net.ifnames=0"/GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200 net.ifnames=0 elevator=noop"/g' /etc/default/grub

grub2-mkconfig -o /boot/grub2/grub.cfg
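
Note that the sed above matches one exact GRUB_CMDLINE_LINUX value, so it silently does nothing if your image’s line differs. A less brittle variant (a sketch, assuming the line exists and doesn’t already contain an elevator= parameter) appends to whatever is there:

# append elevator=noop to the existing kernel command line
sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"$/GRUB_CMDLINE_LINUX="\1 elevator=noop"/' /etc/default/grub
grub2-mkconfig -o /boot/grub2/grub.cfg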

Here’s the dmesg output showing that the parameter took effect:

$ dmesg | grep noop
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.14.0-1.el7.elrepo.x86_64 root=UUID=0a84de8e-5bfe-43e7-992b-5bfff8cdce43 ro console=tty0 crashkernel=auto console=ttyS0,115200 net.ifnames=0 elevator=noop
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.14.0-1.el7.elrepo.x86_64 root=UUID=0a84de8e-5bfe-43e7-992b-5bfff8cdce43 ro console=tty0 crashkernel=auto console=ttyS0,115200 net.ifnames=0 elevator=noop
[    1.831891] io scheduler noop registered (default)

Still no dice:

$ cat /sys/block/xvda/queue/scheduler
[none]

blk-mq

I think the right answer I eventually stumbled upon is that newer kernels use the multi-queue block layer, blk-mq. Under blk-mq, the old single-queue schedulers (i.e. noop, deadline, cfq) are no longer valid.
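
If you want to poke at this yourself: a device handled by blk-mq has an mq directory under /sys/block, and depending on how your kernel was built, multi-queue schedulers such as mq-deadline may show up as selectable options (they didn’t on my build, hence the lone [none] above):

# the mq directory only exists for blk-mq devices
ls /sys/block/xvda/mq

# on builds that include them, multi-queue schedulers appear alongside none
cat /sys/block/xvda/queue/scheduler    # e.g. '[none] mq-deadline kyber'
echo mq-deadline > /sys/block/xvda/queue/scheduler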

Performance benchmarks showed no issues, so from now on I’ll just ignore that Vertica warning.

Final note on mdadm

As you may have noticed, the first warning pasted above was about a RAID array created by mdadm, /dev/md0. It turns out that to set the scheduler for a RAID array, you set it on the underlying member disks; that’s where I ran into the problems and solutions above.
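
For example, to find the members of the array and set each one (a sketch; the member device names are hypothetical, and this only applies on kernels where the legacy schedulers are still available):

# list the array's member devices
mdadm --detail /dev/md0

# set the scheduler on each member (names are hypothetical)
for dev in xvdb xvdc; do
    echo noop > /sys/block/$dev/queue/scheduler
done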