
clouddb1002 low on space -- move wikilabelsdb
Closed, Resolved | Public | 0 Estimated Story Points


Because the wikilabels DB shares a server with the toolsdb secondary, a setup inherited from the original physical pair of servers, the volume is no longer large enough to keep up with the primary. The toolsdb primary is at 60%, but the split volume on the instance housing the replica is at 86%.

The sensible thing to do for wikilabels and everything else is to separate the wikilabels DB onto its own instance (and ideally add a secondary instance to replicate it to).

Event Timeline

Bstorm triaged this task as High priority. May 21 2019, 9:09 PM
Bstorm created this task.

I built clouddb-wikilabels-01.clouddb-services.eqiad.wmflabs and clouddb-wikilabels-02.clouddb-services.eqiad.wmflabs on cloudvirt1028 and cloudvirt1029. They're huge, but at the moment there's plenty of room to grow on those hosts.

Bstorm added a subscriber: Halfak.

Adding the wikilabels tag just as a heads up. When we have a location set up, I'll start replicating to the new location and eventually change the primary. The actual changes should have very little service impact since I'll be moving the DNS alias, but I'll make sure to coordinate with @Halfak so things can be restarted if needed when the time comes.

Change 511786 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikilabels: add secondary db role for replication

Change 511786 merged by Bstorm:
[operations/puppet@production] wikilabels: add secondary db role for replication

Mentioned in SAL (#wikimedia-cloud) [2019-05-22T00:14:45Z] <bstorm_> T224062 wikilabels postgres is now replicating to clouddb-wikilabels-01

It should be safe to swap it over at any time now. The only quirk is that it would be read-only for a bit after DNS jumps over to the secondary with a possible blip when promoting that to primary. @Halfak I just need to know when that might be ok (like when someone is around to catch the service if it falls). Hopefully that's tomorrow because I really want to reclaim this space for toolsdb soon (not that wikilabels is big or anything, rather I need the volume it's on for the big database next door). This will turn the wikilabels db into something on a separate, replicated pair of its own.

Hi @Bstorm, I just got back from the Wikimedia Hackathon and I'm catching up on other things. I don't think we can schedule maintenance and make the switch today. Friday seems more likely. Could that work?

@Halfak if you are cool with deploy-on-friday, I'll do it! :)
The replica is replicating now, so it should be pretty smooth. You'll be read-only for a few when DNS is shifted over to the new server, and then it'll go r/w as soon as I've got the promotion command in there. Since I did the procedure for OSMdb to migrate that much bigger DB, this should be easy :)
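As a rough illustration of the read-only window described above, the sketch below polls a server until it stops reporting that it is in recovery, which is how PostgreSQL reports that a standby has been promoted. The function name `wait_until_writable`, the `probe` callable, and the timeout values are hypothetical and not part of the actual procedure used here; `probe` is assumed to wrap a query such as `SELECT pg_is_in_recovery()`.

```python
import time


def wait_until_writable(probe, timeout=60.0, interval=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the server leaves recovery (read-only) mode.

    probe: hypothetical callable returning True while the server is still a
           read-only standby, e.g. a wrapper around
           "SELECT pg_is_in_recovery()".
    Returns True once probe() reports False, or False if the timeout elapses.
    """
    deadline = clock() + timeout
    while True:
        if not probe():
            return True  # promoted: the server now accepts writes
        if clock() >= deadline:
            return False  # still read-only when we gave up
        sleep(interval)


if __name__ == "__main__":
    # Simulated standby that is promoted on the third check.
    answers = iter([True, True, False])
    print(wait_until_writable(lambda: next(answers), sleep=lambda s: None))
```

A service owner could run something like this after the DNS change to know when it is safe to restart anything that needs a read/write connection.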

I'll put up the patch to change the DNS alias so it is ready.

Note to self: downtime the toolschecker for this before starting since it expects r/w on the db for its checks.

The new server, FYI, is clouddb-wikilabels-01.clouddb-services.eqiad.wmflabs, so you can validate that you can make a read-only connection at any time if you are so inclined. I'll move DNS to that and then promote it to primary.
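One way to do the read-only validation mentioned above is to ask the server whether it is still in recovery. This is only a sketch: it assumes the `psycopg2` client library is available, and the database name and user passed to `check_host` are placeholders, since the task does not state them. `pg_is_in_recovery()` is a standard PostgreSQL function that returns true on a standby.

```python
def server_role(cursor):
    """Return 'replica' if the server is a read-only standby, else 'primary'.

    Works with any DB-API cursor connected to a PostgreSQL server.
    """
    cursor.execute("SELECT pg_is_in_recovery()")
    (in_recovery,) = cursor.fetchone()
    return "replica" if in_recovery else "primary"


def check_host(host, dbname, user):
    """Connect to host and report its role.

    Assumption: psycopg2 is installed and credentials come from the usual
    libpq sources (e.g. ~/.pgpass); dbname and user are placeholders.
    """
    import psycopg2  # imported lazily so server_role() has no hard dependency

    conn = psycopg2.connect(host=host, dbname=dbname, user=user)
    try:
        with conn.cursor() as cur:
            return server_role(cur)
    finally:
        conn.close()
```

Before the promotion, `check_host("clouddb-wikilabels-01.clouddb-services.eqiad.wmflabs", ...)` would be expected to report `replica`; after the promotion it should report `primary`.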

Change 512406 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikilabels: move wikilabels DB to its own server

Change 512406 merged by Bstorm:
[operations/puppet@production] wikilabels: move wikilabels DB to its own server

Change 512428 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikilabels: change DNS to a new server

Change 512428 merged by Bstorm:
[operations/puppet@production] wikilabels: change DNS to a new server

Ok, the database is moved to a new location. I have some docs to update, things to clean up on clouddb1002 and a replica to stand up.

Mentioned in SAL (#wikimedia-cloud) [2019-05-24T21:00:39Z] <bstorm_> T224062 Moved wikilabels postgres db to clouddb-wikilabels-01

Mentioned in SAL (#wikimedia-cloud) [2019-05-24T23:38:04Z] <bstorm_> T224062 clouddb1002 is now free of postgresql services and the volume is reclaimed for toolsdb use

clouddb1002 is at 44% after cleanup and moving wikilabels.