
Replace disk on mw1230
Closed, Resolved · Public

Description

Please replace the /dev/sda disk on mw1230, as SMART is reporting it as failed.

mw1230:~# smartctl -H /dev/sda
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   001   001   051    Pre-fail  Always   FAILING_NOW 8127

I will depool the server. You can then power it off, swap the disk for a new one, and start it again; the RAID will rebuild itself.
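
For anyone triaging similar reports, the smartctl output above can be checked mechanically. The following is a minimal sketch, not part of any Wikimedia tooling: the function names `health_failed` and `failing_attributes` are hypothetical, and the sample text is copied from this task's description. It assumes the attribute rows follow the ten-column layout shown above. In the row reported here, the normalized VALUE (001) is below THRESH (051), which is why smartctl marks the attribute FAILING_NOW.

```python
import re

# One row of smartctl's attribute table, in the column order of the header:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flag>0x[0-9a-fA-F]+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<when_failed>\S+)\s+(?P<raw>.+)$"
)


def health_failed(smartctl_output):
    """Return True if the overall self-assessment line reports FAILED."""
    return "self-assessment test result: FAILED" in smartctl_output


def failing_attributes(smartctl_output):
    """Return the attribute rows whose WHEN_FAILED column is FAILING_NOW."""
    failing = []
    for line in smartctl_output.splitlines():
        match = ROW.match(line)
        # The header line starts with "ID#", so it never matches ROW.
        if match and match.group("when_failed") == "FAILING_NOW":
            failing.append({
                "id": int(match.group("id")),
                "name": match.group("name"),
                "value": int(match.group("value")),
                "thresh": int(match.group("thresh")),
                "raw": match.group("raw").strip(),
            })
    return failing


# Sample input: the output quoted in this task's description.
SAMPLE = """\
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   001   001   051    Pre-fail  Always   FAILING_NOW 8127
"""

if __name__ == "__main__":
    if health_failed(SAMPLE):
        for attr in failing_attributes(SAMPLE):
            print(attr["name"], "value", attr["value"], "threshold", attr["thresh"])
```

In practice the text would come from running `smartctl -H` on the host rather than from an embedded string; parsing is kept separate from command execution so the check can be exercised without a failing disk.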

Event Timeline

Joe created this task. Jun 11 2018, 10:28 AM
Restricted Application added a project: Operations. Jun 11 2018, 10:28 AM
Restricted Application added a subscriber: Aklapper.

Mentioned in SAL (#wikimedia-operations) [2018-06-11T10:29:06Z] <_joe_> depooling permanently mw1230 for disk replacement, T196881

@Joe mw1230 disks replaced, needs reinstall

Script wmf-auto-reimage was launched by oblivian on neodymium.eqiad.wmnet for hosts:

mw1230.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/201806111557_oblivian_2100_mw1230_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['mw1230.eqiad.wmnet']

Of which those FAILED:

['mw1230.eqiad.wmnet']

Script wmf-auto-reimage was launched by oblivian on neodymium.eqiad.wmnet for hosts:

mw1230.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/201806120546_oblivian_6165_mw1230_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['mw1230.eqiad.wmnet']

Of which those FAILED:

['mw1230.eqiad.wmnet']

Script wmf-auto-reimage was launched by oblivian on neodymium.eqiad.wmnet for hosts:

mw1230.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/201806120547_oblivian_6324_mw1230_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['mw1230.eqiad.wmnet']

Of which those FAILED:

['mw1230.eqiad.wmnet']

Script wmf-auto-reimage was launched by oblivian on neodymium.eqiad.wmnet for hosts:

mw1230.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/201806120548_oblivian_6468_mw1230_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['mw1230.eqiad.wmnet']

Of which those FAILED:

['mw1230.eqiad.wmnet']

Script wmf-auto-reimage was launched by oblivian on neodymium.eqiad.wmnet for hosts:

mw1230.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/201806120549_oblivian_6643_mw1230_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['mw1230.eqiad.wmnet']

Of which those FAILED:

['mw1230.eqiad.wmnet']

Mentioned in SAL (#wikimedia-operations) [2018-06-12T12:34:06Z] <_joe_> repooling mw1230 after reimaging T196881

Joe closed this task as Resolved. Jun 12 2018, 12:34 PM
Joe claimed this task.
Vvjjkkii renamed this task from Replace disk on mw1230 to eabaaaaaaa. Jul 1 2018, 1:05 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Joe as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from eabaaaaaaa to Replace disk on mw1230. Jul 2 2018, 2:56 PM
CommunityTechBot closed this task as Resolved.
CommunityTechBot assigned this task to Joe.
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.