Move toolsdb and wikilabels cluster servers for datacenter reconfiguration
Closed, Resolved · Public

Description

Per T187962, the toolsdb and wikilabels servers (labsdb1005/labsdb1004) are moving. This task is to coordinate the considerable user impact likely for services that depend on toolsdb, since some tables are not replicated and so are not available after a failover (see https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#ToolsDB_Backups_and_Replication). wikilabels is not replicated and will thus take an outage during this move.

Proposed dates are July 10th for labsdb1004 and July 11th for labsdb1005.

Event Timeline

Bstorm created this task. Jun 14 2018, 4:07 PM
Restricted Application added a project: Scoring-platform-team. Jun 14 2018, 4:07 PM
Restricted Application added a subscriber: Aklapper.

For what it's worth: in production the impact was just a few seconds of unavailability, less than 10 seconds even for the primary db master that we moved (db1061).

Great!

Same as T197246: should we take the chance to upgrade to stretch/MariaDB 10.1? If not, when?

Ladsgroup added a subscriber: Ladsgroup.

I don't think this even needs announcement for wikilabels.

Vvjjkkii renamed this task from Move toolsdb and wikilabels cluster servers for datacenter reconfiguration to a0aaaaaaaa. Jul 1 2018, 1:04 AM
Vvjjkkii removed Bstorm as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
Marostegui renamed this task from a0aaaaaaaa to Move toolsdb and wikilabels cluster servers for datacenter reconfiguration. Jul 1 2018, 8:10 PM
Marostegui assigned this task to Bstorm.
Marostegui lowered the priority of this task from High to Normal.
Marostegui updated the task description.
Bstorm added a comment. Jul 5 2018, 6:26 PM

@jcrespo -- With some issues around the RAID still giving me trouble, we could perhaps do that stretch upgrade when we move to VMs. Otherwise, would that draw out the service impact a lot? Databases would need to come down during it, I presume, and perhaps we can do that.

CommunityTechBot raised the priority of this task from Normal to Needs Triage. Jul 5 2018, 7:02 PM
Harej moved this task from Backlog to Radar on the Wikilabels board. Jul 9 2018, 5:15 PM

labsdb1004 is moved; labsdb1005 will be tomorrow.

This is done.

Bstorm closed this task as Resolved. Jul 11 2018, 3:47 PM