cp3065 crashed
Closed, Resolved · Public

Description

cp3065 went down yesterday (2019-11-11) at 21:54:24, showing the same symptoms as described in T237348 for cp3057.

The server has been power-cycled by @ema and is currently reachable but depooled.

Event Timeline

Restricted Application added a subscriber: Aklapper. Tue, Nov 12, 2:29 AM
ema triaged this task as Medium priority. Tue, Nov 12, 10:05 AM

Mentioned in SAL (#wikimedia-operations) [2019-11-12T10:06:59Z] <ema> repool cp3065, nothing interesting in kern.log and SEL T238032

ema moved this task from Triage to Hardware on the Traffic board. Tue, Nov 12, 2:45 PM
ema added a comment. Tue, Nov 12, 3:28 PM

Perhaps interestingly, or maybe entirely unrelated: a couple of hours before crashing the host had a spike in cache write errors:

Vgutierrez closed this task as Resolved. Thu, Nov 14, 11:32 AM
Vgutierrez claimed this task.

Tracking the issue on the parent task: T238305