Those of you who’ve been following in-koherence know that I always keep power consumption in mind and frequently look into what’s driving my electricity bill. I recently went on another round of power misering at Koherence. After a bit of head-scratching, it turns out that running operating systems without tickless kernel support was an unexpected source of power drain.
This time the focus was the ESXi server. The machine is based on a Core i5-660, which is now about five years old. In that time significant strides have been made in reducing the power consumption of the processor (and the system board in general). So is it time to upgrade?
Few of the VMs that I run are particularly demanding, and all spend a majority of the day idling. As a result the Intel NUC 5I3RYH looks particularly attractive. It has low power consumption yet enough horsepower to run the VMs I have planned. I’ve also come to the conclusion that I’ve been attracted to rack-mounted systems for No Good Reason, and the NUC’s diminutive size could free up quite a bit of space in the server closet. These systems tend to go for around $280, plus you need to add an HDD and memory. So call it around $500 when all’s said and done.
The Hidden Cost of Timer Interrupts
Is that $500 justified? Looking at the WattsUp watt meter, with all VMs powered off, the 660-based ESXi server consumes roughly 31 watts. I powered on a couple of VMs and the consumption jumped to 44 watts. Hmm. This seemed a bit high. So I powered on the machines independently and monitored the power usage. One of the VMs (CentOS 6 with 4 vCPUs assigned to it) raised consumption by around 3-4 watts. This seemed reasonable. Then I powered on the second VM (CentOS 5 with one vCPU assigned to it). Power consumption spiked by 9 watts. I double-checked the VM, and yes, it was idle. Odd. One would think that this machine would be less demanding than the 4 vCPU one. Checking the virtual hardware configs showed no obvious differences.
To figure out what was going on, I took a look at the Performance tab of each VM via vSphere Client. The 4 vCPU CentOS 6 VM was humming along at around 8 MHz or so. The single vCPU CentOS 5 VM in contrast was cranking at 57 MHz. This seemed a bit strange. But clearly something about the CentOS 5 VM was preventing the i5-660 from idling long enough to enter the low power “C” states.
Some web browsing suggested that enabling ESXi’s more aggressive power management might help, so I switched from Balanced to Low Power mode. This had no appreciable effect. There are some excellent blog posts and papers on ESXi power management at https://blogs.vmware.com/performance/tag/low-power and https://www.vmware.com/files/pdf/hpm-perf-vsphere5.pdf. (At the moment I’ve left the power configuration at Low Power, since power consumption in this particular ESXi instance is more important than performance.)
Tickless Kernel Power Savings
After further experimentation in making the virtual hardware identical between the CentOS 5 and 6 VMs, disabling services in the CentOS 5 VM, and upgrading the kernel, it occurred to me that one of the features that CentOS 6 brought with it was the tickless kernel. In a nutshell, in CentOS 5 the kernel’s timer fires 1000 times a second, just to see if there’s anything to do. In many cases there may not be…but the CPU needs to wake up and check anyway. In CentOS 6, this periodic tick is skipped while the system is idle, and the CPU wakes up only as needed in response to interrupts (i.e. when there is something to do). So the CPU can nap for longer periods. This has the side benefit of allowing the CPU to enter deeper sleep modes, which are much more power efficient.
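One way to see the tick from inside a guest is to sample the local timer interrupt count twice and divide by the elapsed time. The Python sketch below is an illustration, not something from my original troubleshooting. It assumes an x86 Linux guest whose /proc/interrupts has a "LOC:" line (local timer interrupts), and the function names are made up for this example.

```python
import time


def local_timer_count(interrupts_text: str) -> int:
    """Sum the per-CPU counters on the LOC (local timer interrupts) line."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            total = 0
            for field in fields[1:]:
                # Counters come first; the trailing description is not numeric.
                if not field.isdigit():
                    break
                total += int(field)
            return total
    raise ValueError("no LOC: line found (assumed format not present)")


def ticks_per_second(before: str, after: str, elapsed_s: float) -> float:
    """Timer interrupts per second between two /proc/interrupts snapshots."""
    return (local_timer_count(after) - local_timer_count(before)) / elapsed_s


def sample_tick_rate(interval_s: float = 5.0, path: str = "/proc/interrupts") -> float:
    """Take two snapshots interval_s apart and return the interrupt rate."""
    with open(path) as f:
        before = f.read()
    start = time.monotonic()
    time.sleep(interval_s)
    with open(path) as f:
        after = f.read()
    return ticks_per_second(before, after, time.monotonic() - start)
```

Calling sample_tick_rate() on an idle guest gives a rough wakeups-per-second figure. With a 1000 Hz periodic tick I would expect something near 1000 per vCPU, and far less on a tickless kernel, though I have not included measured numbers here.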
Now upgrading the CentOS 5 VM to CentOS 6 wasn’t something I had planned. In-place upgrades aren’t supported by CentOS (or their upstream provider Red Hat), so upgrading involves wiping the OS and reinstalling from scratch. But given a power savings of 5 watts 24/7 (120 watt-hours a day), it’s certainly worth it. (I should add that I had at one time built a custom CentOS 5 kernel with a 100 Hz timer tick rather than 1000 Hz. While not as good as a tickless kernel, the 10x reduction in timer rate may have allowed the C1E and C2 sleep states. However I really didn’t want to go back to maintaining custom kernels.)
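A constant saving in watts turns into energy once you multiply by hours. Here is that arithmetic as a small Python sketch. The 5 W input is the figure quoted above, and the function names are just for illustration.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def daily_watt_hours(watts_saved: float) -> float:
    """Energy saved per day (Wh) for a constant power saving in watts."""
    return watts_saved * HOURS_PER_DAY


def annual_kilowatt_hours(watts_saved: float) -> float:
    """Energy saved per year (kWh) for a constant power saving in watts."""
    return daily_watt_hours(watts_saved) * DAYS_PER_YEAR / 1000.0


print(daily_watt_hours(5))       # Wh per day
print(annual_kilowatt_hours(5))  # kWh per year
```

For a 5 W saving this works out to 120 Wh a day, or about 43.8 kWh a year. Multiply by your own electricity rate to put a dollar figure on it.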
With the CentOS 5 VM now running CentOS 6, the Performance tab in vSphere Client shows CPU utilization at 4-5 MHz. Recall that this was on the order of 57 MHz when running CentOS 5. The i5-660 ESXi server with two VMs powered on (but idle) now consumes roughly 34 watts.
So for the moment, I’ll focus on upgrading the CentOS 5 VMs to CentOS 6 so they’ll play better with ESXi power management. (As to why I don’t go to CentOS 7…well, my experience there hasn’t been so great…) And at least for the time being Koherence will find some other use for that $500.