
db1089: update RAID controller firmware
Closed, Resolved · Public

Description

Hello Chris,

db1089 had a controller issue (T166933) and during the startup it said:

Slot 1 Port 1 : Smart Array P840 Controller - (4096 MB, V3.56) 1 Logical
Drive(s) - Operation Failed
 - 1719-Slot 1 Drive Array - A controller failure event occurred prior
   to this power-up.  (Previous lock up code = 0x13) Action: Install the
   latest controller firmware. If the problem persists, replace the
   controller.


Important information available or errors detected
 Press 'ESC+1' to continue, or 'ESC+2' for more information

It would be good to upgrade its firmware to the latest version, so that if it crashes again we can at least tell HP that the firmware was upgraded after the first crash.

Details

Related Gerrit Patches:
operations/mediawiki-config : master : db-eqiad.php: Depool db1089
operations/mediawiki-config : master : db-eqiad.php: Depool db1089
operations/mediawiki-config : master : db-eqiad.php: Add comment about db1089 status

Event Timeline

Restricted Application added a project: Operations. Jun 3 2017, 5:53 AM
Restricted Application added a subscriber: Aklapper.
Marostegui triaged this task as High priority. Jun 3 2017, 5:53 AM
Marostegui moved this task from Triage to Blocked external/Not db team on the DBA board.
root@db1089:~# hpssacli controller all show detail | grep Firmware
   Firmware Version: 3.56
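
For scripted checks, the version number can be pulled out of that output on its own. The following is only a minimal sketch: `firmware_version` is a hypothetical helper name, not part of `hpssacli`, and it assumes the `Firmware Version: <number>` line format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of hpssacli): reads the output of
# `hpssacli controller all show detail` on stdin and prints the value of
# every "Firmware Version" line, one per controller.
firmware_version() {
    # Split each line on ": " and print the part after the label.
    awk -F': *' '/Firmware Version/ { print $2 }'
}

# Sample input copied from the output shown in this task.
printf '   Firmware Version: 3.56\n' | firmware_version
```

On the host itself this would presumably be used as `hpssacli controller all show detail | firmware_version`.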

I have started to slowly repool this server, as I don't want to leave it depooled much longer.
@Cmjohnson once you have time for the firmware upgrade, let us know and we will depool it.
Thanks!

Mentioned in SAL (#wikimedia-operations) [2017-06-05T14:45:12Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1089 with low weight - T166935 (duration: 00m 39s)

Mentioned in SAL (#wikimedia-operations) [2017-06-05T15:03:04Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Increase db1089 weight - T166935 (duration: 00m 38s)

Mentioned in SAL (#wikimedia-operations) [2017-06-05T15:16:20Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Increase db1089 weight - T166935 (duration: 00m 39s)

Mentioned in SAL (#wikimedia-operations) [2017-06-05T16:25:02Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Restore db1089 original weight - T166935 (duration: 00m 38s)

Change 357341 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Add comment about db1089 status

https://gerrit.wikimedia.org/r/357341

Change 357341 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Add comment about db1089 status

https://gerrit.wikimedia.org/r/357341

Mentioned in SAL (#wikimedia-operations) [2017-06-06T06:40:08Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Add comments about current status of db1089 - T166935 (duration: 00m 39s)

@Cmjohnson do you think you will have time for this sometime next week? Thanks!

As per my chat with @Cmjohnson on Friday, this will be done today (Monday).

Change 358319 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1089

https://gerrit.wikimedia.org/r/358319

Change 358319 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1089

https://gerrit.wikimedia.org/r/358319

Mentioned in SAL (#wikimedia-operations) [2017-06-12T12:01:36Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1089 for maintenance - T166935 (duration: 00m 41s)

Mentioned in SAL (#wikimedia-operations) [2017-06-12T13:53:07Z] <marostegui> Shutdown db1089 for maintenance - T166935

Change 358518 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1089

https://gerrit.wikimedia.org/r/358518

Change 358518 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1089

https://gerrit.wikimedia.org/r/358518

Mentioned in SAL (#wikimedia-operations) [2017-06-13T06:43:23Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1089 - T166935 (duration: 00m 42s)

Mentioned in SAL (#wikimedia-operations) [2017-06-13T06:43:48Z] <marostegui> Stop MySQL on db1089 to upgrade its raid controller firmware - T166935

Marostegui closed this task as Resolved. Jun 13 2017, 6:58 AM
Marostegui claimed this task.
Marostegui added a subscriber: Cmjohnson.

As Chris was having issues yesterday with the HP bundle, we decided that I would try to upgrade the firmware from the OS today, and I have done so:

root@db1089:~# hpssacli controller all show detail | grep Firmware
   Firmware Version: 5.04
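
A check like this could also be automated by comparing the reported version against the one seen before the upgrade. This is a sketch under assumptions: `version_ge` is a hypothetical helper, and it relies on the `-V` (version sort) option of GNU `sort`; the two version strings are the ones reported in this task.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is greater than or equal to
# version $2. Assumes GNU sort, whose -V option orders version strings.
version_ge() {
    # After a version sort, the smaller of the two comes first; $1 >= $2
    # exactly when that smallest entry is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Versions reported in this task: 3.56 before the upgrade, 5.04 after.
if version_ge "5.04" "3.56"; then
    echo "firmware is at or above the previous version"
fi
```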

Mentioned in SAL (#wikimedia-operations) [2017-06-13T08:38:51Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Restore db1089 original weight - T166935 (duration: 00m 42s)