Dell servers power management settings in BIOS and performance impact

March 1, 2011

OK, so it all started like this:

A bunch of web servers, all the same model (Dell R710), all functioning properly – however, two of them were faster than the others (40% lower response time AND lower load average).
The average number of connections to each machine was identical.

  • Hardware: Identical
  • Software: Identical (to eliminate this as an issue, a disk was removed from the “fast” server RAID and placed in the “slow” server as a primary drive, and the array was rebuilt with the exact same data)

So it wasn't a hardware issue (as far as the specs and components could show), and it wasn't a software issue (the applications and code were identical on both machines).

Just a 40% performance discrepancy…

After we had exhausted many other possible causes, the fast server had to be rebooted for some reason or another, and lo and behold: after the restart, the fast server became slow like the others!
As a result, we started to surmise that the issue was related to some daemon loaded at boot (even though the output of chkconfig --list was, of course, identical on both machines).

To verify this assumption, I ran the following on each server:

for i in /etc/init.d/*; do echo "$i status:"; "$i" status; done > ~/daemon_status_list

then pulled the slow server's output over to the fast server:

scp slowserver.com:/root/daemon_status_list /root/slowserver_daemon_status_list

Diffed the output:

diff daemon_status_list slowserver_daemon_status_list | less
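
In hindsight, the intermediate copy isn't strictly necessary: assuming root SSH access from the fast box to slowserver.com, the whole comparison can be done in one shot with bash process substitution. A sketch:

# One-shot variant: run the status loop locally, run it remotely over ssh,
# and diff the two streams. PID lines will always differ between machines;
# the interesting lines are running/stopped discrepancies.
diff <(for i in /etc/init.d/*; do echo "$i status:"; "$i" status; done) \
     <(ssh slowserver.com 'for i in /etc/init.d/*; do echo "$i status:"; "$i" status; done') \
  | less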

Either way, the diff came back with something like this:

< acpid (pid 4316) is running...
---
> acpid (pid 4257) is running...
6c6
< atd (pid 4658) is running...
---
> atd (pid 4597) is running...
8c8
< auditd (pid 4032) is running...
---
> auditd (pid 3976) is running...
10c10
< automount (pid 4282) is running...
---
> automount (pid 4228) is running...
18c18
< Frequency scaling enabled using ondemand governor
---
> cpuspeed is stopped
20c20
< crond (pid 4601) is running...
---
> crond (pid 4540) is running...
22,24c22,24
< dsm_om_connsvcd (pid 29810 29809) is running...
---
> dsm_om_connsvcd (pid 13244 13243) is running...
31c31
< dsm_om_shrsvcd (pid 29773) is running...
---
> dsm_om_shrsvcd (pid 13190) is running...
39c39
< gpm (pid 4586) is running...
---
> gpm (pid 4525) is running...
...
... *snip*
...
< saslauthd (pid 4677 4676 4675 4674 4673) is running...
---
> saslauthd (pid 4616 4615 4614 4613 4612) is running...
165c165
< sfcbd (pid 4471 4468 4384 4239 4236 4221) is running...
---
> sfcbd (pid 4328 4325 4189 4184 4182 4165) is running...
168c168
< smartd (pid 5780) is running...
---
> smartd (pid 4855) is running...
170c170
< snmpd (pid 4332) is running...
---
> snmpd (pid 4273) is running...
174c174
< openssh-daemon (pid  4353) is running...
---
> openssh-daemon (pid  4294) is running...
176,177c176,177
< syslogd (pid 13113) is running...
< klogd (pid 13116) is running...
---
> syslogd (pid 4011) is running...
> klogd (pid 4014) is running...
180c180
< xfs (pid 4629) is running...
---
> xfs (pid 4566) is running...

In case you missed it:

< Frequency scaling enabled using ondemand governor
---
> cpuspeed is stopped

!!!

This was strange indeed. Not only should there not have been any difference between the machines, but I would have expected such a process to hurt performance rather than help it, since cpuspeed is a power-saving daemon made to throttle the CPU speed, fans, and such.
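
For reference, when frequency scaling is healthy, each core exposes its governor and current clock through sysfs (the standard cpufreq interface; these are the same paths that show up in the error output below):

# On a machine where frequency scaling works, every core has a cpufreq directory:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor             # e.g. "ondemand"
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq             # current clock, in KHz
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors  # what you can switch to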

Just to try it, however, I attempted to start the daemon, and nothing happened (i.e. /etc/init.d/cpuspeed start simply did nothing; nothing in the logs either).

Upon running the command manually, this was the output:

shell$>cpuspeed
Error: Could not open file for writing: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Error: No such file or directory
Error: Could not open file for writing: /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
Error: No such file or directory
Error: Could not open file for writing: /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
Error: No such file or directory
...
... *snip* (the same two errors repeat, interleaved, for every core through cpu15) ...
...

This I did not like.

From here, a little googling and some intuition led me to:

lsmod | grep cpu

Which gave me nothing, whereas on the fast server I got:

cpufreq_ondemand       42449  16
acpi_cpufreq           47937  1
freq_table             40889  2 cpufreq_ondemand,acpi_cpufreq

cpufreq_ondemand loaded OK with modprobe, but acpi_cpufreq got this:

shell$>modprobe acpi_cpufreq
FATAL: Error inserting acpi_cpufreq (/lib/modules/2.6.18-164.15.1.el5/kernel/arch/x86_64/kernel/cpufreq/acpi-cpufreq.ko): No such device
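
"No such device" from modprobe means the module's code loaded fine but it found no hardware to bind to: acpi_cpufreq attaches to the ACPI P-state tables that the BIOS publishes, and here there were none. Two quick checks one could run (a sketch; the exact CPU flags vary by model):

# Does the CPU itself advertise frequency scaling? "est" = Enhanced Intel SpeedStep.
grep -m1 -ow est /proc/cpuinfo

# Did the BIOS hand P-state control to the OS? If not, no cpufreq tree appears:
ls /sys/devices/system/cpu/cpu0/cpufreq/ 2>/dev/null \
    || echo "no cpufreq interface - the BIOS is not exposing P-states to the OS"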

Now I was getting somewhere…

Long story short (too late?)…

Turns out some Dell servers ship with “green” config settings in the BIOS. The relevant setting in this case is simply called “Power Management” and it has four possible options:

  • Static MAX Performance
  • OS Control
  • Active Power Controller
  • Custom

Instead of detailing the meaning of each (Dell's documentation explains them in depth), I will reiterate the list in a different fashion:

  • Static MAX Performance – OK to use, though the kernel module still won't load, and Linux will have no way to ever control the CPU speed or fans
  • OS Control – This, it turns out, was the setting enabled on the "fast" server; hence, OK
  • Active Power Controller – NEVER USE THIS
  • Custom – Can be good, can be bad; depends on you

Unless you host your servers in a DC powered by gerbils, I do not see any reason to purposely make your servers run beneath their maximum capacity (yes, less power consumption in off-peak hours is very sexy, but it is not worth the trade-off if the lovely BIOS algorithms decide that your peak hours don't justify giving the CPUs some extra juice).
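
If you do flip the BIOS setting to "OS Control" and reboot, a quick sanity check would look something like this (module and service names as seen above on these RHEL 5 boxes; adjust to taste):

# After setting BIOS Power Management to "OS Control" and rebooting:
modprobe acpi_cpufreq      # should now load without "No such device"
modprobe cpufreq_ondemand  # the governor module from the fast server's lsmod
service cpuspeed start     # should come up and enable the ondemand governor
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor   # expect "ondemand"
grep MHz /proc/cpuinfo     # per-core clocks should now vary with load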

FYI

4 Comments to "Dell servers power management settings in BIOS and performance impact"

  1. Robert wrote:

    Thanks for confirming this much overlooked aspect of server performance.

    After 2 weeks of research and testing, we have found the same, running Ubuntu 11.04 on a Dell R815 with 4 x AMD 6168s and 64GB of RAM.

    Benchmarking this, the difference between the “Green” setting and max performance is huge, a factor of 2.5 to 3X!!

    We have found that even the “OS Control” setting provides significantly less performance. The static “Max Performance” setting is by far the best, about 2x faster than the OS Control.

    We have now selected the “Custom” setting, with the CPU and the Memory set to “Max Performance” and the Fan control set to “Minimum Performance”, in order to keep the noisy fans from spinning so fast. We have found there is no performance degradation whatsoever running it this way. We do have this server in a temperature-controlled room, so overheating should not be a problem.

    Again, thanks for this article. We have found precious little about this, and yet it can make the difference between a server running at full speed and one running 3x slower than it should. And that is no small change!!

  2. tom wrote:

    Robert Hi!

    So glad you found the post helpful – good to hear that you were able to improve your performance thanks to the BIOS setting.
    It really is ridiculous that that should be the default setting :)

    Thanks!

  3. Benjamin wrote:

    Thanks for the excellent article; we observed the same behavior on our M1000e blade chassis stacked with 16 M610s, which were all performing at half their capacity (CPU-intensive applications on all of them).

    I then hit this article after the fact when I was searching for a good reason why Dell would believe this default is sensible.

    Your tldr explained it nicely :-)

    Cheers.

  4. tom wrote:

    Thanks! Glad it could be of help!
    The whole thing is wonky to begin with, I agree.
