
New analytic hosts with BBU learning cycle enabled
Closed, Resolved · Public · 3 Story Points

Description

Hello,

Today analytics1062 and analytics1063 (and possibly analytics1067) started a learning cycle on their BBUs, meaning that the cache policy goes from write-back (WB) to write-through (WT) while the cycle runs, which will probably affect performance.

root@analytics1062:~# megacli -AdpBbuCmd -aAll -nolog | tail -n9

BBU Properties for Adapter: 0

  Auto Learn Period: 90 Days
  Next Learn time: Mon Sep 11 15:27:48 2017
  Learn Delay Interval:0 Hours
  Auto-Learn Mode: Transparent
root@analytics1063:~# megacli -AdpBbuCmd -aAll -nolog

BBU status for Adapter: 0

BatteryType: BBU
Voltage: 3766 mV
Current: -95 mA
Temperature: 51 C
Battery State: Optimal
BBU Firmware Status:

  Charging Status              : Discharging
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested	                  : Yes
  Learn Cycle Active                      : Yes
  Learn Cycle Status                      : OK
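
The `Learn Cycle Active` field in the status output above is what shows a cycle in progress. A minimal sketch of checking it in a script, assuming the output keeps the `name : value` layout shown above; the function name `bbu_learn_state` is hypothetical and not part of megacli:

```shell
# Read `megacli -AdpBbuCmd -aAll -nolog` output on stdin and print
# "active" if any adapter reports a running learn cycle, "idle" otherwise.
# Assumes the "Learn Cycle Active : Yes/No" line format shown in this task.
bbu_learn_state() {
    awk -F':' '
        /Learn Cycle Active/ {
            # Strip the padding around the value before comparing.
            gsub(/[[:space:]]/, "", $2)
            if ($2 == "Yes") active = 1
        }
        END { print (active ? "active" : "idle") }
    '
}
```

Possible usage on a worker: `megacli -AdpBbuCmd -aAll -nolog | bbu_learn_state`.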

analytics1067 reported an error (T167797). It might have been temporary, occurring during the battery discharge and recharge process, but the host does have auto-learn enabled like the others:

root@analytics1067:~# megacli -AdpBbuCmd -aAll -nolog | tail -n9

BBU Properties for Adapter: 0

  Auto Learn Period: 90 Days
  Next Learn time: Mon Sep 11 13:19:54 2017
  Learn Delay Interval:0 Hours
  Auto-Learn Mode: Transparent

The main goal of this task is to let Analytics know about this situation; it might be okay for them. If so, feel free to close the task!

Event Timeline

Restricted Application added a subscriber: Aklapper. Jun 13 2017, 3:40 PM
Marostegui triaged this task as Medium priority. Jun 13 2017, 3:40 PM
Marostegui updated the task description.
elukey added a subscriber: Ottomata.
elukey added a subscriber: elukey.
Nuria edited projects, added Analytics-Kanban; removed Analytics. Jun 15 2017, 3:50 PM
Nuria added a subscriber: Nuria.

Putting this on the kanban board for @elukey to look at.

Mentioned in SAL (#wikimedia-operations) [2017-06-19T12:04:42Z] <elukey> run 'echo "autoLearnMode=1" > /tmp/disable_learn && megacli -AdpBbuCmd -SetBbuProperties -f /tmp/disable_learn -a0' on all the analytics workers to disable BBU Auto learn - T167809
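
The one-liner from the SAL entry above could be wrapped in a small helper so it can be previewed before it touches a controller. This is only a sketch: the function name `disable_bbu_autolearn` and the `MEGACLI` override variable are assumptions made for illustration, while the megacli flags and the `autoLearnMode=1` value are taken verbatim from the SAL entry.

```shell
# Disable BBU auto-learn on one adapter (default: adapter 0).
# MEGACLI may be set to another command to preview or stub the call;
# that override is an assumption of this sketch, not a megacli feature.
disable_bbu_autolearn() {
    adapter="${1:-0}"
    props_file="$(mktemp)" || return 1
    # autoLearnMode=1 is the value the SAL entry used to disable auto-learn.
    echo "autoLearnMode=1" > "$props_file"
    ${MEGACLI:-megacli} -AdpBbuCmd -SetBbuProperties -f "$props_file" -a"$adapter"
    status=$?
    rm -f "$props_file"
    return "$status"
}
```

Compared with the fixed `/tmp/disable_learn` path, `mktemp` avoids clobbering an existing file and the temporary file is removed afterwards.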

Updated https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration with instructions on how to set the BBU auto-learn functionality.

elukey set the point value for this task to 3. Jun 19 2017, 12:17 PM
elukey moved this task from Next Up to Done on the Analytics-Kanban board.
elukey@neodymium:~$ sudo cumin 'R:class = role::analytics_cluster::hadoop::worker' 'megacli -AdpBbuCmd -GetBbuProperties -aALL -nolog | grep "Auto-Learn Mode"'

===== NODE GROUP =====
(42) analytics[1028-1069].eqiad.wmnet
----- OUTPUT of 'megacli -AdpBbuC...Auto-Learn Mode"' -----
  Auto-Learn Mode: Disabled
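
The same check could be turned into an exit status for use in a script or monitoring probe. A minimal sketch, assuming the `Auto-Learn Mode: <value>` line format shown above; the function name `bbu_autolearn_disabled` is hypothetical:

```shell
# Read `megacli -AdpBbuCmd -GetBbuProperties -aALL -nolog` output on stdin.
# Succeeds only if at least one "Auto-Learn Mode" line is present and
# every such line reports "Disabled".
bbu_autolearn_disabled() {
    awk -F':' '
        /Auto-Learn Mode/ {
            seen = 1
            gsub(/[[:space:]]/, "", $2)
            if ($2 != "Disabled") bad = 1
        }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}
```

Possible usage: `megacli -AdpBbuCmd -GetBbuProperties -aALL -nolog | bbu_autolearn_disabled || echo "auto-learn still enabled"`.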
elukey moved this task from Backlog to Done on the User-Elukey board. Jun 23 2017, 10:23 AM
Nuria closed this task as Resolved. Jun 27 2017, 3:43 PM