
CPU scaling governor audit
Open, High, Public

Description

The problem

As discovered by @faidon in T210723: Address recurrent service check time out for "HP RAID" on swift backend hosts, the ondemand scaling governor isn't performing very well on at least the ms-be HP hosts: load average is high and reported CPU utilization % is also high.

On further research, the problem is that the default BIOS setting for power control on HP Gen9 ("dynamic") leads Linux to load the pcc-cpufreq driver, which doesn't scale with > 4 CPUs under the ondemand governor. Setting power control to "os control" gives Linux full control of scaling; the end result is that the intel_pstate driver is loaded and powersave becomes the default governor. This configuration also matches what happens on both Dell and HP Gen10 hosts across the rest of the fleet (see below for a full audit).

The fix

Issuing set /system1/oemhp_power1 oemhp_powerreg=os from the iLO SSH console on HP Gen9 hosts and then rebooting will switch them to the intel_pstate driver + powersave governor.

When a reboot is invasive or time consuming (e.g. database hosts), a temporary fix is to set the governor to performance (setting powersave isn't possible; the governors available without a reboot are ondemand, performance, and schedutil) and change the iLO settings. On the next reboot, powersave will then be loaded. While temporary, this fix should give a pretty close preview of what will happen to CPU utilization after the next reboot.

performance vs powersave

We are forcing some hosts to use the performance governor via the puppet class cpufrequtils (e.g. lvs/cp). Choosing between performance and powersave for a particular class of hosts is outside the scope of this task, though; the goal here is to get the fleet to a standard baseline (i.e. intel_pstate + powersave).

Audit

Fleetwide audit below (Dell + powersave + intel_pstate hosts skipped, since that's already the desired/default state).

Dell

cumin -b100 'F:virtual ~ physical and F:manufacturer ~ Dell' 'cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor || true'

ondemand

eeden.wikimedia.org: old host in esams, unused
labsdb[1006-1007].eqiad.wmnet: the acpi_cpufreq module has been loaded, presumably depending on BIOS settings. These hosts are being decommissioned in T220144 so we can let them be.

No governor

bast3002.wikimedia.org,cp1008.wikimedia.org,db2114.codfw.wmnet,db1138.eqiad.wmnet,dbproxy2001.codfw.wmnet,dbproxy[1001-1011].eqiad.wmnet,dns1002.wikimedia.org,es[2001-2004].codfw.wmnet,helium.eqiad.wmnet,iron.wikimedia.org,labstore[2001-2004].codfw.wmnet,lvs[1001-1006].wikimedia.org,maerlant.wikimedia.org,multatuli.wikimedia.org,nescio.wikimedia.org,rhenium.wikimedia.org,rhodium.eqiad.wmnet,tungsten.eqiad.wmnet

Perhaps disabled via BIOS settings; these will need to be audited.

performance

cp[2001-2002,2004-2008,2010-2014,2016-2020,2022-2026].codfw.wmnet,cp[1075-1090].eqiad.wmnet,cp[5001-5012].eqsin.wmnet,cp[3030,3032-3036,3038-3047,3049].esams.wmnet,cp[4021-4032].ulsfo.wmnet
lvs[1013-1016].eqiad.wmnet,lvs[5001-5003].eqsin.wmnet,lvs[3001-3004].esams.wmnet,lvs[4005-4007].ulsfo.wmnet

expected

analytics1070.eqiad.wmnet,kafka-main[2001-2003].codfw.wmnet,labstore[1004-1005].eqiad.wmnet
manually set for tests or due to bios settings

HP

cumin -b100 'F:virtual ~ physical and F:manufacturer ~ HP' 'cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor || true'

powersave

db[2097-2102].codfw.wmnet,db[1139-1140].eqiad.wmnet: DL360 Gen10; it looks like this generation already works out of the box (i.e. intel_pstate is the driver) even when power control is set to dynamic in the BIOS.

labsdb1012.eqiad.wmnet: ditto as above, though the host is a DL380 rather than a DL360 Gen10 (and on default settings, i.e. power control is dynamic).

ms-be2037.codfw.wmnet: DL380 Gen9, but its BIOS settings were changed to "os control" as part of this task.

No governor

mc[1022,1031].eqiad.wmnet: likely due to BIOS settings?

performance

lvs[2001-2006].codfw.wmnet expected

ms-be[2016,2031,2033,2034-2035,2038].codfw.wmnet,ms-be1036.eqiad.wmnet: set during tests; will be fixed with BIOS settings + reboot.

ondemand

These will need to be fixed via BIOS settings (i.e. set /system1/oemhp_power1 oemhp_powerreg=os from iLO over SSH) and a reboot.

If a reboot is problematic or requires coordination (e.g. databases), then setting the governor to performance via for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i ; done will yield similar (higher) performance until the next reboot, when powersave will be used instead.
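The temporary-fix one-liner above can be written as a small function for clarity. This is a sketch: it assumes a root shell on the target host, and the sysfs root is parameterized here only so it can be exercised against a test directory (on a real host it is /sys/devices/system/cpu).

```shell
# Sketch of the temporary fix above: write the requested governor to every
# CPU's scaling_governor file. Parameterizing the sysfs root is an
# assumption made here for testability.
set_governor() {
    sys="$1"
    gov="$2"
    for f in "$sys"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -w "$f" ] || continue   # skip CPUs without a writable cpufreq node
        echo "$gov" > "$f"
    done
}

# On a live host (as root), the equivalent of the loop above would be:
#   set_governor /sys/devices/system/cpu performance
```

Note this only persists until the next reboot, matching the behaviour described above.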

  • aqs[1004-1006].eqiad.wmnet
  • cloudcontrol2003-dev.wikimedia.org,cloudcontrol1004.wikimedia.org,cloudweb2001-dev.wikimedia.org
  • cloudcontrol1003.wikimedia.org
  • clouddb2001-dev.codfw.wmnet
  • cloudnet2002-dev.codfw.wmnet,cloudnet[1003-1004].eqiad.wmnet
  • cloudservices2002-dev.wikimedia.org
  • cloudservices1003.wikimedia.org
  • cloudvirt[1001-1009,1012-1013,1019-1020].eqiad.wmnet
  • cloudvirt1014.eqiad.wmnet
  • conf[1004-1006].eqiad.wmnet
  • db[1074-1095].eqiad.wmnet,db[2043-2063,2065-2070].codfw.wmnet (db[2034-2038,2040-2042].codfw.wmnet T221533 and dbstore[2001-2002].codfw.wmnet T220002 are to be decommissioned; do that instead)
  • druid[1001-1003].eqiad.wmnet
  • elastic1041.eqiad.wmnet,elastic[1032-1040,1042-1052].eqiad.wmnet,elastic[2025-2036].codfw.wmnet
  • labmon[1001-1002].eqiad.wmnet
  • labpuppetmaster[1001-1002].wikimedia.org
  • labsdb[1009-1011].eqiad.wmnet
  • labstore[1006-1007].wikimedia.org
  • labtestpuppetmaster2001.wikimedia.org,labtestservices2003.wikimedia.org,labtestvirt2003.codfw.wmnet
  • maps2002.codfw.wmnet,maps[1001-1004].eqiad.wmnet,maps[2001,2003-2004].codfw.wmnet
  • mc[1019-1021,1023-1030,1032-1036].eqiad.wmnet,mc[2019-2036].codfw.wmnet
  • ms-be[1016-1035,1037-1039].eqiad.wmnet
  • mwmaint2001.codfw.wmnet
  • netmon2001.wikimedia.org
  • oresrdb2002.codfw.wmnet
  • rdb[2005-2006].codfw.wmnet
  • relforge[1001-1002].eqiad.wmnet
  • restbase2009.codfw.wmnet,restbase[1010-1015].eqiad.wmnet
  • restbase-dev[1004-1006].eqiad.wmnet
  • snapshot[1005-1007].eqiad.wmnet
  • stat1006.eqiad.wmnet
  • wdqs2003.codfw.wmnet,wdqs1003.eqiad.wmnet
  • wezen.codfw.wmnet

Event Timeline

faidon renamed this task from CPU scaling governor on ms-be hosts to CPU scaling governor on HP Gen9 hosts.Jun 13 2019, 12:21 PM
ArielGlenn triaged this task as High priority.Jun 14 2019, 7:09 AM
faidon renamed this task from CPU scaling governor on HP Gen9 hosts to CPU scaling governor audit.Jun 15 2019, 1:09 PM

So, I think there are two distinct problems discovered in the past few days:

  • ondemand results in some really poor performance on the ms-be boxes. Going from 50% CPU util to 5% with an ondemand->performance switch probably means that this CPU scaling is not really scaling... on demand :) This may be specific to the workload of the ms-bes, potentially affected by Meltdown/Spectre firmware updates, and/or it could be specific to HP hardware (or a subgeneration of it, like HP Gen9). These things generally depend on the firmware, but note also that the HPs use the pcc_cpufreq Linux module, unlike all other systems.
  • A lot of systems seem to have the governor set to powersave, which may result in poor performance, depending on the workload.

Our choice of governor seems to be entirely inconsistent, and with the exception of the cp/lvs hosts (which are set to performance in Puppet), almost random:

                           Dell  HP
ondemand                     32  40
powersave                    86  69
performance                  96  14
(not set to "OS control")    39   6
jijiki added a subscriber: jijiki.Jun 16 2019, 9:11 AM

Mentioned in SAL (#wikimedia-operations) [2019-06-17T09:13:18Z] <_joe_> setting cpufreq governor to "ondemand" on mw1348, T225713

Joe added a subscriber: Joe.Jun 17 2019, 9:16 AM

Moving from "powersave" to "performance" slightly reduced the CPU load on one api application server (a 10-20% reduction in cpu usage) at the cost of significantly higher temperatures.

Joe added a comment.Jun 17 2019, 9:17 AM

Also please note that newer Intel CPUs don't have the ondemand governor on newer kernels, so for all intents and purposes the powersave governor is what ondemand used to be.

elukey added a subscriber: elukey.Jun 17 2019, 9:40 AM

Just executed echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor (hope it is the right way) on analytics1070 (hadoop worker). Since the workload varies a lot on these nodes, I'll wait a day before reporting results. I am super interested in trying to see if this could bring any positive effects to the analytics nodes.

herron added a subscriber: herron.Jun 20 2019, 1:56 PM
fgiunchedi edited projects, added User-fgiunchedi; removed media-storage.
fgiunchedi moved this task from Backlog to Doing on the User-fgiunchedi board.Jun 25 2019, 12:49 PM
fgiunchedi added a comment.EditedJul 4 2019, 10:36 AM

A few notes I gathered while comparing a Dell system (ms-be2049) that runs powersave and AFAICT has no performance issues (i.e. low reported CPU load, ~10%) with an HP system (ms-be2037), which boots with the ondemand governor by default and which I switched to the performance governor on June 14th at 9:46.

Dashboards (during the governor switch of ms-be2037)

Both hosts run the same kernel (4.9.0-9-amd64) but have different non-performance governors available (on the HP host by virtue of having loaded pcc_cpufreq, I am assuming):

ms-be2049:~$ cat /sys/devices/system/cpu/cpufreq/policy0/scaling_available_governors
performance powersave
ms-be2037:~$ cat /sys/devices/system/cpu/cpufreq/policy0/scaling_available_governors
ondemand performance schedutil
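The per-host checks above can be bundled into a tiny helper. This is a sketch: the sysfs root is parameterized purely so it can be exercised against a test directory, and the "(none)" output is my own convention, not anything the kernel prints.

```shell
# Sketch: print the active cpufreq scaling driver and governor for policy0,
# mirroring the manual cat commands above. $1 is the sysfs cpu directory;
# parameterizing it is an assumption made here for testability (the real
# path is /sys/devices/system/cpu).
cpufreq_state() {
    sys="${1:-/sys/devices/system/cpu}"
    for attr in scaling_driver scaling_governor; do
        f="$sys/cpufreq/policy0/$attr"
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$f")"
        else
            printf '%s: (none)\n' "$attr"
        fi
    done
}
```

Run via cumin it would give the driver and governor in one pass instead of two separate cat invocations.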

In terms of vendor firmware/bios:

System Information
        Manufacturer: HP
        Product Name: ProLiant DL380 Gen9
        Version: Not Specified
BIOS Information
        Vendor: HP
        Version: P89
        Release Date: 09/13/2016
        BIOS Revision: 2.30
        Firmware Revision: 2.50
BIOS Information
        Vendor: Dell Inc.
        Version: 1.5.4
        Release Date: 07/30/2018
        BIOS Revision: 1.5
System Information
        Manufacturer: Dell Inc.
        Product Name: PowerEdge R740xd

Mentioned in SAL (#wikimedia-operations) [2019-07-04T13:15:03Z] <godog> reboot ms-be2037 after setting "os control" for power regulator mode - T225713

According to this patch https://patchwork.kernel.org/patch/10530095/ the interface used by pcc-cpufreq isn't scalable with many (>4) CPUs and shouldn't be used. AFAICT that patch isn't included in the kernel we're using. Further down the page, in this message, there's this mention of BIOS settings:

"Dynamic Power Savings Mode" allows pcc-cpufreq to load and "OS
Control Mode" allows intel-pstate to be loaded. We now change it such
that also with "Dynamic Power Savings Mode" intel-pstate is loaded
(if available; if not, pcc-cpufreq will still be loaded but it now
emits a warning and disallows use of ondemand governor if too many
CPUs are in use).

I'm testing this on ms-be2037 after setting "os control" via ilo:

</system1>hpiLO-> show oemhp_power1
                                   
status=0
status_tag=COMMAND COMPLETED
Thu Jul  4 13:11:12 2019
                        


/system1/oemhp_power1
  Targets
  Properties
    oemhp_powerreg=dynamic
    iLO 4 license is required.
    oemhp_PresentPower=262 Watts
    oemhp_power_micro_ver=1.0.9
    oemhp_auto_pwr=Restore
  Verbs
    cd version exit show set

</system1>hpiLO-> cd ..
                       
status=0
status_tag=COMMAND COMPLETED
Thu Jul  4 13:11:52 2019

</>hpiLO-> set /system1/oemhp_power1 oemhp_powerreg=os
                                    
status=0
status_tag=COMMAND COMPLETED
Thu Jul  4 13:12:23 2019

Setting "os control" does indeed disable loading of pcc-cpufreq and governors now are the same as the dell host (i.e. linux drives the CPU p-states autonomously)

ms-be2037:~$ cat /sys/devices/system/cpu/cpufreq/policy0/scaling_available_governors
performance powersave
ms-be2037:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
powersave

After ~20h of running with "os control" set and the powersave governor, ms-be2037 seems to behave fine. Compared to performance, CPU load is slightly higher as expected and temperature slightly lower. dashboard

This comment was removed by fgiunchedi.
hashar added a subscriber: hashar.EditedJul 5 2019, 9:14 AM

I am not sure whether it is related, but a month or so ago I noticed that the old cloudvirt machines have poor CPU performance for a yet unknown reason. We ran a benchmark on labtestvirt2003.codfw.wmnet (ProLiant DL360 Gen9) via T225067, which shows that setting the CPU regulator in the BIOS to minimum dramatically affects the benchmark, while setting it to maximum or dynamic (the default) does not produce a significant change. I have no idea about the regulator setting at the kernel level.

T223971 is about the old cloudvirts having very poor CPU performance. The beefy Intel Xeon they have runs a busy loop in 15 seconds, compared to 11 seconds on my old Intel NUC, which has a less powerful CPU. There is a table listing a few benchmarks.

It might really be a different issue, but it might be related as well. So maybe you could use one of the cloudvirts as a candidate for experimenting with different CPU scaling governors.

In terms of scaling drivers, here's the list of hosts that don't have intel_pstate (which AIUI is what we want to use):

cumin -p99 -b100 'F:virtual ~ physical' 'cat /sys/devices/system/cpu/cpufreq/policy0/scaling_driver || true'

...
===== NODE GROUP =====
(3) eeden.wikimedia.org,labsdb[1006-1007].eqiad.wmnet
----- OUTPUT of 'cat /sys/devices...g_driver || true' -----
acpi-cpufreq      
===== NODE GROUP =====
(44) bast3002.wikimedia.org,cp1008.wikimedia.org,db2114.codfw.wmnet,db1138.eqiad.wmnet,dbproxy2001.codfw.wmnet,dbproxy[1001-1011].eqiad.wmnet,dns1002.wikimedia.org,elastic1041.eqiad.wmnet,es[2001-2004].codfw.wmnet,helium.eqiad.wmnet,iron.wikimedia.org,labstore[2001-2004].codfw.wmnet,lvs[1001-1006].wikimedia.org,maerlant.wikimedia.org,maps2002.codfw.wmnet,mc[1022,1031].eqiad.wmnet,ms-be2033.codfw.wmnet,multatuli.wikimedia.org,nescio.wikimedia.org,rhenium.wikimedia.org,rhodium.eqiad.wmnet,tungsten.eqiad.wmnet
----- OUTPUT of 'cat /sys/devices...g_driver || true' -----
cat: /sys/devices/system/cpu/cpufreq/policy0/scaling_driver: No such file or directory
===== NODE GROUP =====
(253) aqs[1004-1006].eqiad.wmnet,cloudcontrol2003-dev.wikimedia.org,cloudcontrol[1003-1004].wikimedia.org,clouddb2001-dev.codfw.wmnet,cloudnet2002-dev.codfw.wmnet,cloudnet[1003-1004].eqiad.wmnet,cloudservices2002-dev.wikimedia.org,cloudservices1003.wikimedia.org,cloudvirt[1001-1009,1012-1014,1019-1020].eqiad.wmnet,cloudweb2001-dev.wikimedia.org,conf[1004-1006].eqiad.wmnet,db[2034-2038,2040-2063,2065-2070].codfw.wmnet,db[1074-1095].eqiad.wmnet,dbstore[2001-2002].codfw.wmnet,druid[1001-1003].eqiad.wmnet,elastic[2025-2036].codfw.wmnet,elastic[1032-1040,1042-1052].eqiad.wmnet,labmon[1001-1002].eqiad.wmnet,labpuppetmaster[1001-1002].wikimedia.org,labsdb[1009-1011].eqiad.wmnet,labstore[1006-1007].wikimedia.org,labtestpuppetmaster2001.wikimedia.org,labtestservices2003.wikimedia.org,labtestvirt2003.codfw.wmnet,lvs[2001-2006].codfw.wmnet,maps[2001,2003-2004].codfw.wmnet,maps[1001-1004].eqiad.wmnet,mc[2019-2036].codfw.wmnet,mc[1019-1021,1023-1030,1032-1036].eqiad.wmnet,ms-be[2016-2032,2034-2036,2038-2039].codfw.wmnet,ms-be[1016-1039].eqiad.wmnet,mwmaint2001.codfw.wmnet,netmon2001.wikimedia.org,oresrdb2002.codfw.wmnet,rdb[2005-2006].codfw.wmnet,relforge[1001-1002].eqiad.wmnet,restbase2009.codfw.wmnet,restbase[1010-1015].eqiad.wmnet,restbase-dev[1004-1006].eqiad.wmnet,snapshot[1005-1007].eqiad.wmnet,stat1006.eqiad.wmnet,wdqs2003.codfw.wmnet,wdqs1003.eqiad.wmnet,wezen.codfw.wmnet
----- OUTPUT of 'cat /sys/devices...g_driver || true' -----
pcc-cpufreq

I am not sure whether it is related, but a month or so ago I noticed that the old cloudvirt machines have poor CPU performance for a yet unknown reason. We ran a benchmark on labtestvirt2003.codfw.wmnet (ProLiant DL360 Gen9) via T225067, which shows that setting the CPU regulator in the BIOS to minimum dramatically affects the benchmark, while setting it to maximum or dynamic (the default) does not produce a significant change. I have no idea about the regulator setting at the kernel level.
T223971 is about the old cloudvirts having very poor CPU performance. The beefy Intel Xeon they have runs a busy loop in 15 seconds, compared to 11 seconds on my old Intel NUC, which has a less powerful CPU. There is a table listing a few benchmarks.
It might really be a different issue, but it might be related as well. So maybe you could use one of the cloudvirts as a candidate for experimenting with different CPU scaling governors.

Thanks for pointing to that task, indeed seems related to this investigation! I'll reach out to WMCS folks to coordinate.

Change 520885 had a related patch set uploaded (by Jbond; owner: John Bond):
[operations/puppet@production] facter - cpu_details: add governor fact

https://gerrit.wikimedia.org/r/520885

Change 520885 merged by Jbond:
[operations/puppet@production] facter - cpu_details: add governor and scaling_driver facts

https://gerrit.wikimedia.org/r/520885

CDanis added a subscriber: CDanis.Jul 6 2019, 12:22 AM
fgiunchedi updated the task description. (Show Details)Jul 8 2019, 8:57 AM

Mentioned in SAL (#wikimedia-operations) [2019-07-09T15:22:41Z] <godog> reboot ms-be2023 with oemhp_powerreg=os - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-09T15:27:47Z] <godog> reboot ms-be2024 with oemhp_powerreg=os - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-09T16:29:28Z] <godog> reboot ms-be2025 with oemhp_powerreg=os - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-09T16:42:37Z] <godog> reboot ms-be2026 with oemhp_powerreg=os - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-09T16:54:03Z] <godog> reboot ms-be2027 with oemhp_powerreg=os - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-09T16:59:58Z] <godog> reboot ms-be2039 with oemhp_powerreg=os - T225713

For some reason I don't seem to be able to set oemhp_powerreg on ms-be2022; I'll try rebooting:

</>hpiLO-> show /system1/oemhp_power1
                                     
status=0
status_tag=COMMAND COMPLETED
Wed Jul 10 12:45:14 2019
                        


/system1/oemhp_power1
  Targets
  Properties
    oemhp_powerreg=unavailable
    iLO 4 license is required.
    oemhp_PresentPower=257 Watts
    oemhp_power_micro_ver=1.0.9
    oemhp_auto_pwr=ON (Minimum delay)
  Verbs
    cd version exit show set

Mentioned in SAL (#wikimedia-operations) [2019-07-10T12:49:27Z] <godog> reboot ms-be2022 - T225713

Ok, all of codfw row D for ms-be hosts is now running with powersave. I will leave it like that for a little while; no adverse effects observed so far. If the trend continues, I'll do all ms-be hosts in codfw.

I've set oemhp_power1 for all ms-be hosts in codfw now, and will start a rolling reboot of those:

ms-be2016 ms-be2017 ms-be2018 ms-be2019 ms-be2020 ms-be2021 ms-be2028 ms-be2029 ms-be2030 ms-be2031 ms-be2032 ms-be2033 ms-be2034 ms-be2035 ms-be2036

Mentioned in SAL (#wikimedia-operations) [2019-07-11T13:40:49Z] <godog> roll restart ms-be2016 ms-be2017 ms-be2018 ms-be2019 ms-be2020 ms-be2021 ms-be2028 ms-be2029 ms-be2030 ms-be2031 ms-be2032 ms-be2033 ms-be2034 ms-be2035 ms-be2036 - T225713

fgiunchedi updated the task description. (Show Details)Jul 11 2019, 2:45 PM
fgiunchedi updated the task description. (Show Details)Jul 11 2019, 2:54 PM

Mentioned in SAL (#wikimedia-operations) [2019-07-11T14:55:45Z] <gehel> setting CPU governor to performance for elastic1052 - T225713

Gehel added a subscriber: Gehel.Jul 11 2019, 2:59 PM

Mentioned in SAL (#wikimedia-operations) [2019-07-11T14:55:45Z] <gehel> setting CPU governor to performance for elastic1052 - T225713

This is just setting the governor to performance via /sys/... Once testing shows this works well, I'll go through the full operation (BIOS + restart).

fgiunchedi updated the task description. (Show Details)Jul 11 2019, 3:06 PM
Gehel added a comment.Jul 11 2019, 3:21 PM

I observe a pretty significant drop in CPU usage on elastic1052 (>50% to ~25%), so that looks good. I'll wait until Monday to apply to the whole cluster.

Mentioned in SAL (#wikimedia-operations) [2019-07-11T15:28:35Z] <gehel> setting CPU governor to performance for wdqs1004 - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-12T18:49:43Z] <gehel> setting CPU governor to performance for wdqs1010 - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-15T08:22:34Z] <godog> set oemhp_powerreg=os on ms-be10[16-39] - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-15T08:48:50Z] <gehel> set oemhp_powerreg=os + reboot for elastic1054 - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-15T08:49:42Z] <gehel> correction: set oemhp_powerreg=os + reboot for elastic1052 (NOT elastic1054) - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-15T12:55:05Z] <gehel> shutting down tilerator on maps eqiad to free some CPU - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-15T12:59:19Z] <gehel> re-enabling kartotherian codfw - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-15T12:59:48Z] <gehel> depooling kartotherian eqiad - T225713

fgiunchedi updated the task description. (Show Details)Jul 15 2019, 3:01 PM

Have you planned the cloudvirts yet? I guess that is a bit more challenging since instances would have to be moved ahead of time, but I am genuinely interested in seeing whether it improves the poor CPU performance I noticed.

Gehel added a comment.Jul 15 2019, 4:28 PM

Oops, the 3 logs above about maps should have been on T218097.

jcrespo updated the task description. (Show Details)Jul 15 2019, 4:44 PM

Mentioned in SAL (#wikimedia-operations) [2019-07-15T16:58:20Z] <jynus> setting labsdb1009/10/11 to performance scaling_governor T225713

FYI, after applying the above change I expected a huge shift in reported load (even if performance didn't change) or in temperatures, given that these (the wikireplicas on labs) are our busiest databases in terms of CPU due to long-running queries. However, unlike other reporters, I didn't see much difference: only the temperatures of labsdb1011 changed, and neither the load nor the temperatures of the others did. Maybe CPU scaling was already a problem for database load, or something else? https://grafana.wikimedia.org/d/000000607/cluster-overview?orgId=1&from=1563087571551&to=1563260371552&var-datasource=eqiad%20prometheus%2Fops&var-cluster=mysql&var-instance=labsdb1009&var-instance=labsdb1010&var-instance=labsdb1011

Andrew updated the task description. (Show Details)Jul 16 2019, 3:44 PM

Mentioned in SAL (#wikimedia-operations) [2019-07-16T18:02:51Z] <andrewbogott> rebooting cloudcontrol2003-dev, cloudweb2001-dev, cloudcontrol1004 for T225713

Andrew updated the task description. (Show Details)Jul 16 2019, 6:04 PM
Andrew updated the task description. (Show Details)Jul 16 2019, 6:19 PM

Mentioned in SAL (#wikimedia-operations) [2019-07-17T10:30:55Z] <godog> start rolling reboot of ms-be eqiad hosts - T225713

Mentioned in SAL (#wikimedia-operations) [2019-07-18T09:09:08Z] <godog> resume swift ms-be rolling restarts - T225713

fgiunchedi updated the task description. (Show Details)Jul 18 2019, 10:26 AM

Mentioned in SAL (#wikimedia-operations) [2019-07-18T10:29:12Z] <godog> reboot wezen.codfw.wmnet - T225713

fgiunchedi updated the task description. (Show Details)Jul 18 2019, 10:33 AM
fgiunchedi added a comment.EditedJul 19 2019, 9:38 AM

FYI, after applying the above change I expected a huge shift in reported load (even if performance didn't change) or in temperatures, given that these (the wikireplicas on labs) are our busiest databases in terms of CPU due to long-running queries. However, unlike other reporters, I didn't see much difference: only the temperatures of labsdb1011 changed, and neither the load nor the temperatures of the others did. Maybe CPU scaling was already a problem for database load, or something else? https://grafana.wikimedia.org/d/000000607/cluster-overview?orgId=1&from=1563087571551&to=1563260371552&var-datasource=eqiad%20prometheus%2Fops&var-cluster=mysql&var-instance=labsdb1009&var-instance=labsdb1010&var-instance=labsdb1011

It is indeed possible that the CPU was already heavily utilized; I can see a small decrease in system CPU %, but other than that things seem unchanged. I'm curious to see what powersave will do at the next reboot!

Mentioned in SAL (#wikimedia-operations) [2019-07-26T13:41:43Z] <jeh> updated labstore100[67].wikimedia.org performance scaling_governor T225713

JHedden updated the task description. (Show Details)Jul 26 2019, 1:42 PM

FYI, after applying the above change I expected a huge shift in reported load (even if performance didn't change) or in temperatures, given that these (the wikireplicas on labs) are our busiest databases in terms of CPU due to long-running queries. However, unlike other reporters, I didn't see much difference: only the temperatures of labsdb1011 changed, and neither the load nor the temperatures of the others did. Maybe CPU scaling was already a problem for database load, or something else? https://grafana.wikimedia.org/d/000000607/cluster-overview?orgId=1&from=1563087571551&to=1563260371552&var-datasource=eqiad%20prometheus%2Fops&var-cluster=mysql&var-instance=labsdb1009&var-instance=labsdb1010&var-instance=labsdb1011

I did the same test during the offsite in Dublin with labsdb1009, and also didn't see any major changes.

Gehel added a comment.Aug 12 2019, 1:15 PM

elastic[1032-1052].eqiad.wmnet,elastic[2025-2036].codfw.wmnet have been configured with set /system1/oemhp_power1 oemhp_powerreg=os. This will take effect after the next rolling restart.

fgiunchedi moved this task from Doing to Radar on the User-fgiunchedi board.Aug 13 2019, 1:16 PM
Gehel updated the task description. (Show Details)Aug 15 2019, 2:18 PM
Gehel updated the task description. (Show Details)Aug 15 2019, 7:30 PM

+ cloud-services-team for the hosts: cloudvirt[1001-1009,1012-1013,1019-1020].eqiad.wmnet

cloudvirt1014 has already been updated and cloudvirt1013 has the same CPU. From T223971, testing with a busy loop: time $(i=1; while (( i < 2000000 )); do (( i ++ )); done):

Host   mw2139    cloudvirt1006  cloudvirt1013  cloudvirt1014 (updated)
CPU    E5-2450   E5-2697 v2     E5-2697 v3     E5-2697 v3
Speed  2.10 GHz  2.70 GHz       2.60 GHz       2.60 GHz
Turbo  2.9 GHz   3.5 GHz        3.6 GHz        3.6 GHz
Time   10s       17s            12s            7s

If this task audit hold true, when doing the change to cloudvirt1013 it should be faster as a result and the setting should be applied on all the other affected cloudvirt hosts.