Survey and correct issues caused by CODFW switch failure
Closed, Resolved (Public)

Description

We lost a switch in CODFW today (T327001). Creating this ticket to survey/correct any related problems with Search Platform-owned services, including but not limited to:

  • Reindex Elastic indices in CODFW
  • Verify Elastic clusters are healthy in CODFW (sans the 4 red indices we will correct separately)
  • Verify WDQS streaming updater is healthy
  • Repool WDQS clusters (internal + external) in CODFW
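
The Elastic health check in the list above could be scripted roughly as follows. This is only a sketch: the ticket does not say how the check was actually performed, and the base URL and index names used here are hypothetical placeholders. It relies on the Elasticsearch `_cluster/health` endpoint with `level=indices`, which reports a per-index `status`, and treats the cluster as acceptable when every red index is one already known about.

```python
import json
import urllib.request


def red_indices(health: dict) -> list[str]:
    """Return the sorted names of red indices from a
    `_cluster/health?level=indices` response body."""
    indices = health.get("indices", {})
    return sorted(
        name for name, info in indices.items() if info.get("status") == "red"
    )


def is_acceptable(health: dict, known_red: set[str]) -> bool:
    """True if every red index is in the set we already plan to fix."""
    return set(red_indices(health)) <= known_red


def fetch_health(base_url: str) -> dict:
    """Fetch per-index cluster health. `base_url` is an assumption,
    e.g. "https://search.example.internal:9243" (hypothetical)."""
    url = f"{base_url}/_cluster/health?level=indices"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Hypothetical response fragment, for illustration only.
sample = {
    "status": "red",
    "indices": {
        "index_a": {"status": "green"},
        "index_b": {"status": "red"},
        "index_c": {"status": "yellow"},
    },
}
```

With the sample above, `is_acceptable(sample, {"index_b"})` holds because the only red index is a known one, while an empty `known_red` set would flag the cluster for follow-up.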

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2023-01-17T19:50:15Z] <ryankemper> T327175 Reprocessing last several hours of updates (2023-01-17T12:00:00Z -> 2023-01-17T17:30:00Z) on codfw elasticsearch, running on ryankemper@mwmaint2002 tmux session reindex

Gehel claimed this task.