
Please optimize image table in commonswiki
Open, Medium, Public

Description

The script just finished.

Progress

  • eqiad
  • codfw

Event Timeline

Marostegui triaged this task as Medium priority.
Marostegui moved this task from Triage to Ready on the DBA board.

<3

Mentioned in SAL (#wikimedia-operations) [2021-08-06T05:44:35Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1160 T288273', diff saved to https://phabricator.wikimedia.org/P16965 and previous config saved to /var/cache/conftool/dbconfig/20210806-054433-marostegui.json

Change 710422 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1160: Disable notifications

https://gerrit.wikimedia.org/r/710422

Change 710422 merged by Marostegui:

[operations/puppet@production] db1160: Disable notifications

https://gerrit.wikimedia.org/r/710422

Going to optimize a host in eqiad to see how much space it saves and how long the optimize takes to run.
Current size (compressed):

root@db1160:/srv/sqldata/commonswiki# ls -lh image.ibd
-rw-rw---- 1 mysql mysql 361G Aug  6 05:46 image.ibd

Mentioned in SAL (#wikimedia-operations) [2021-08-06T05:47:31Z] <marostegui> Optimize commonswiki.image on db1160 T288273

It took around 5 hours for the optimize to finish and this is the new size:

root@db1160:/srv/sqldata/commonswiki# ls -lh image.ibd
-rw-rw---- 1 mysql mysql 79G Aug  6 10:32 image.ibd
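
For a rough sense of scale, the two `ls -lh` readings above can be compared directly. This is only a minimal sketch using the rounded sizes as reported (361G before, 79G after); the function name `reduction` is made up for illustration.

```python
def reduction(before_g, after_g):
    """Return (space saved, percentage saved) for two sizes in the same unit."""
    saved = before_g - after_g
    return saved, 100.0 * saved / before_g

# Rounded sizes taken from the ls -lh output before and after the optimize.
saved, pct = reduction(361, 79)
print(f"saved ~{saved}G ({pct:.1f}% smaller)")  # saved ~282G (78.1% smaller)
```

Since `ls -lh` rounds its output, the real figures may differ slightly from these.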

This is an awesome reduction, thanks for all the work @Ladsgroup!!

On Monday I will deploy the change to s4 eqiad with replication enabled.

Hey @Marostegui, sorry to ask, but just to be super cautious, could you do one of the following before applying it on Monday:

  • Check that the eqiad s4 snapshot has finished by the time you start the optimize. It should say something like "taken on 2021-08-08 2X:XX:XX" or "taken on 2021-08-09 XX:XX:XX" (the night before)

or

  • Stop replication on db1139:s4 (so I am not a blocker) and I can later resume it when I make sure it is idle

Either of them would be fine.

Nothing should normally happen and this is likely not needed, but for monster tables like image, I think all precautions are worth it (e.g. to avoid having a long-running alter conflict with a long-running backup or dump). A backup dashboard would make things easier in the long term 0:-).
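
The first option (checking the "taken on" timestamp before starting) could be sketched roughly as below. This is an illustration only, not an existing tool: the function name, the 12-hour window, and both sample timestamps are assumptions, and it presumes the timestamp has already been copied out of whatever reports the snapshot status.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"  # matches the "taken on YYYY-MM-DD HH:MM:SS" form

def snapshot_finished_recently(taken_on, planned_start, max_age=timedelta(hours=12)):
    """True if the snapshot completed before planned_start and within max_age of it.

    max_age is an assumed threshold standing in for "the night before".
    """
    taken = datetime.strptime(taken_on, FMT)
    start = datetime.strptime(planned_start, FMT)
    return timedelta(0) <= start - taken <= max_age

# Hypothetical values: a snapshot from the night before a Monday-morning run.
print(snapshot_finished_recently("2021-08-08 23:10:00", "2021-08-09 05:30:00"))  # True
# A week-old snapshot means last night's run has not finished (or not started).
print(snapshot_finished_recently("2021-08-01 23:10:00", "2021-08-09 05:30:00"))  # False
```

If the check fails, the second option (stopping replication on db1139:s4) remains the fallback.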

Will do! Thanks for the heads up!

@jcrespo eqiad s4 snapshot is done - codfw is running, but that one isn't affected as the change will only be deployed in eqiad.
I am going to go ahead and run it on the master with replication.

Mentioned in SAL (#wikimedia-operations) [2021-08-09T05:22:58Z] <marostegui> Optimize commonswiki.image on eqiad, lag will appear - T288273

Mentioned in SAL (#wikimedia-operations) [2021-08-09T05:23:28Z] <marostegui> Lag in s4 (commonswiki) will appear on clouddb* hosts (wiki replicas) T288273

Mentioned in SAL (#wikimedia-operations) [2021-08-09T07:52:13Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1160 T288273', diff saved to https://phabricator.wikimedia.org/P16971 and previous config saved to /var/cache/conftool/dbconfig/20210809-075212-marostegui.json

eqiad is done (still being processed on clouddb* hosts). Once we switch back, we can finish up codfw!

Mentioned in SAL (#wikimedia-operations) [2021-08-26T06:46:55Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1160 T288273', diff saved to https://phabricator.wikimedia.org/P17085 and previous config saved to /var/cache/conftool/dbconfig/20210826-064655-marostegui.json