Create less overhead on bacula jobs when dumping production databases
Closed, Resolved · Public

Description

Potentially this means:

  • Bacula should only collect already processed files, and not do dumps synchronously
  • A cron should do the dumps in advance
  • Some kind of synchronization should be done to abort still-in-progress dumps, or to back up only completed jobs
  • Extra monitoring should be in place to detect when something goes wrong
  • Other improvements could be done at the same time (see sister tickets)

Event Timeline

jcrespo renamed this task from Create less overhead on bacula jobs to Create less overhead on bacula jobs when dumping production databases. Apr 12 2017, 10:34 AM
jcrespo created this task.

For context: fixing this would also alleviate a current problem where long-running backup jobs stall other backup jobs as well as restore jobs (e.g. the dbstore1001 backup job takes several hours to complete ATM).

I am testing mydumper (and myloader) on db1015. That is one of our slowest hosts, although s3 should be favorable for an efficient dump. Taking a full s3 backup took 1h15m; the recovery is still in progress, but it will take around 14 hours https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&var-server=db1015&var-network=eth0&from=1493710954434&to=1493788684000 . That is a lot, but it is a logical remote restoration; it should be faster locally and on a larger server.
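
The exact dump command is not recorded here, but it was a plain mydumper run; a minimal sketch, assuming the socket path, user, thread count and output directory (all of which are guesses, not what was actually used):

# Hypothetical invocation for a full logical dump of the local instance;
# -c compresses each output file, -t sets the number of dumper threads.
mydumper -h localhost -u root -S /tmp/mysql.sock \
         -t 4 -c \
         -o /srv/tmp/s3-$(date +%Y%m%d)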

The main advantage is the flexibility: we can recover single tables or even single rows. The result is also very compact, because even if the same compression method is used, a logical backup does not store the extra space for the indexes; s3 only takes 88GB.

Looks like db1015 ran out of space in the middle of the night; check the emails from Icinga and: https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&panelId=17&fullscreen&orgId=1&from=now-12h&to=now&var-server=db1015&var-network=eth0

[06:31:34] marostegui@db1015:~$ df -hT /srv/
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   1.6T  1.6T   28K 100% /srv

I will not touch it as it is depooled and I don't want to mess up your tests :)

I am recovering db1015 again, minus cebwiki. Better than leaving db1015 broken and doing nothing. I have disabled notifications on db1015 just in case.

Creating a full backup of enwiki from db2071 to dbstore2001 (despite the extra load from the ongoing alter) took 53m28.379s, generating a 93G set of compressed files. That would reduce the backup time on all shards from >15-24h to 8 hours (although more load would be created with local backups).

The next step is to puppetize a cron + script to emulate our current backup policy and deploy it on dbstore2001 only (independent of bacula). This is useful not just as a test, but also to have backups being created on both DCs (avoiding costly inter-DC transfers and adding redundancy).
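
A minimal sketch of what that cron + script could look like (the paths, schedule, retention and thread count below are assumptions, not the final puppetized version):

#!/bin/bash
# Hypothetical wrapper run from cron on dbstore2001, one invocation per
# section; the real puppetized script may look different.
# crontab example: 0 22 * * * /usr/local/bin/dump_section.sh s3
set -e
section="$1"                                     # e.g. s1, s3, x1
dest="/srv/backups/${section}.$(date +%Y%m%d%H%M%S)"
mydumper -h localhost -S /tmp/mysql.sock -t 4 -c -o "$dest"
# Keep only the two most recent dumps of this section to avoid filling /srv.
ls -d /srv/backups/${section}.* | head -n -2 | xargs -r rm -rf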

jcrespo moved this task from Triage to In progress on the DBA board.

It took a bit more than 11 hours to logically reload db1015 (minus cebwiki), that is, 1.3TB (out of a total of 1.5TB for all of s3). It is now replicating again, but with:

Replicate_Wild_Ignore_Table: cebwiki.%
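
For the record, a filter like that can be set in my.cnf (replicate-wild-ignore-table = cebwiki.%) or, assuming a MariaDB 10 version where the replication filter variables are dynamic, with the slave stopped; a sketch only:

# Sketch; requires the filter variable to be settable at runtime on this version.
mysql <<'SQL'
STOP SLAVE;
SET GLOBAL replicate_wild_ignore_table = 'cebwiki.%';
START SLAVE;
SQL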

@jcrespo if you want to test with s4 feel free to pick any host from: T161088 (only db1081 was done)

We need to test myloader using compressed tables, as during T153743 it took almost 24 hours to load:

root@labsdb1009:/srv/tmp/dewiki# du -sh *
37G	 export-20170608-064122

Those 37G once loaded would be around 140G.

It was exported using mydumper from a host that was running Barracuda with ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8.

To load it (on labsdb1009 and labsdb1010) I used:

myloader -h localhost -t 8 -u root -S /tmp/mysql.sock -d

I tried loading wikidatawiki (around 400G) using Barracuda on labsdb1009 and Antelope on labsdb1010 (so it avoids COMPRESSED and goes for COMPACT), and it did make a difference:

Antelope host: 12 hours to load 282G
Barracuda host: 12 hours to load 45G

The commands to load the data were the same ones I used to load the data on db1095, which obviously never took this long.

Checking the RAID controller settings on db1095 and labsdb1009 reveals no differences.

I am loading the same dataset on labsdb1011, but without using compression (I have edited the table definition schemas to remove the compression clauses), to see if it has any effect on the loading speed of dewiki.
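
The schema edit boils down to stripping the compression attributes from the *-schema.sql files that mydumper wrote, before running myloader; roughly something like this (a sketch, assuming uncompressed schema files, and the path below is hypothetical):

# Remove the InnoDB compression attributes from every table definition so
# the tables are created as uncompressed (COMPACT) on load.
cd /path/to/mydumper/export            # hypothetical export directory
sed -i 's/ ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8//' *-schema.sql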

Looks like not having compression makes a big difference.
On labsdb1011: 139G in 5 hours, which is way faster than the other two hosts loading dewiki (which are still importing it: more than 24 hours to do 152G).

It is inserting around 6k rows every 10 seconds for the dewiki revision table:

root@labsdb1009:/srv/sqldata/dewiki# strace -p 27245 -tT -f -s100000 -o /tmp/strace.out^C
root@labsdb1009:/srv/sqldata/dewiki# ls -lh /tmp/strace.out
-rw-r--r-- 1 root root 9.8M Jun  9 19:50 /tmp/strace.out
root@labsdb1009:/srv/sqldata/dewiki# cat /tmp/strace.out | grep Records
27253 19:49:07 <... read resumed> ";\0\0\1\0\374\203\26\376af!\t\0\0\0\0\1\0\0\0)Records: 5763  Duplicates: 0  Warnings: 0", 16384) = 63
27253 19:49:07 read(10, ";\0\0\1\0\374\316\26\376@~!\t\0\0\0\0\1\10\0\0)Records: 5838  Duplicates: 0  Warnings: 0", 16384) = 63
27253 19:49:18 read(10, ";\0\0\1\0\374\22\27\376\214\226!\t\0\0\0\0\1\10\0\0)Records: 5906  Duplicates: 0  Warnings: 0", 16384) = 63
27253 19:49:29 read(10, ";\0\0\1\0\374\207\27\376\376\256!\t\0\0\0\0\1\10\0\0)Records: 6023  Duplicates: 0  Warnings: 0", 16384) = 63
27253 19:49:40 read(10, ";\0\0\1\0\374\336\26\376\5\306!\t\0\0\0\0\1\10\0\0)Records: 5854  Duplicates: 0  Warnings: 0", 16384) = 63
27253 19:49:51 read(10, ";\0\0\1\0\374\301\27\376\342\336!\t\0\0\0\0\1\10\0\0)Records: 6081  Duplicates: 0  Warnings: 0", 16384) = 63
27253 19:50:03 read(10, ";\0\0\1\0\374\322\25\376\212\365!\t\0\0\0\0\1\10\0\0)Records: 5586  Duplicates: 0  Warnings: 0", 16384) = 63
27253 19:50:15 read(10, ";\0\0\1\0\374\2\27\376\227\r\"\t\0\0\0\0\1\0\0\0)Records: 5890  Duplicates: 0  Warnings: 0", 16384) = 63
27253 19:50:24 read(10, ";\0\0\1\0\374-\27\376\31&\"\t\0\0\0\0\1\0\0\0)Records: 5933  Duplicates: 0  Warnings: 0", 16384) = 63

I realised that the revision table on db1095 is around 150M rows, and the original mydumper command would only split tables at 100M rows, so in this case there were only two files, and it could not take much advantage of the multiple threads for this table.
Maybe next time we should generate more, smaller files rather than bigger ones, so it can use the multiple threads better.
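
mydumper can do that with its --rows option, which chunks each table into files of roughly that many rows; a sketch (the chunk size and output path are arbitrary choices here, not tested values):

# Split big tables such as revision into ~20M-row files so that myloader's
# parallel threads all have work to do.
mydumper -h localhost -S /tmp/mysql.sock \
         -t 8 -c -r 20000000 \
         -B enwiki \
         -o /srv/backups/enwiki.$(date +%Y%m%d%H%M%S)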

Part of the work (the skeleton) has been done in T169516, but that is far from perfect, with more FIXMEs than good code. That should now be fixed and converted into better code/functionality. cc @Marostegui @Volans @akosiaris.

Clarification: the CC is not intended as a "you should do that", but as an apology for the current state, and a note that it will be done better soon. Thanks to everybody.

Change 382144 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mydumper: Expand saved projects to all local datasets

https://gerrit.wikimedia.org/r/382144

Change 382144 merged by Jcrespo:
[operations/puppet@production] mydumper: Expand saved projects to all local datasets

https://gerrit.wikimedia.org/r/382144

Backups of local hosts worked, but x1 only backed up %wik% databases; we have to get a full list of the databases there and add them to the dump grants, or provide grants on *.*.
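
Getting that list is straightforward from information_schema on the x1 host, for example (a sketch):

# Databases whose names do not match %wik% (and so were skipped by the
# current grants), excluding system schemas.
mysql -BN -e "SELECT schema_name FROM information_schema.schemata
              WHERE schema_name NOT LIKE '%wik%'
                AND schema_name NOT IN ('mysql','information_schema','performance_schema','sys')"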

To try to solve the previous issue, the following grants have been executed:

root@dbstore2001> set sql_log_bin=0; GRANT SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER ON `flowdb`.* TO 'dump'@'localhost'; 
root@dbstore2001> GRANT SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER ON `wikishared`.* TO 'dump'@'localhost';
root@dbstore2001> GRANT SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER ON `cognate_wiktionary`.* TO 'dump'@'localhost';

I think those 3 are the only databases that are not part of all.dblist (even if 2 of them already match %wik%). That should solve the issue of not having a full backup on dbstore2001. I have not changed dbstore2002 because we do not know how that will end up (remote or local permissions).
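
As a quick sanity check, the effective grants for the dump user can be listed directly:

# Verify the new grants are in place on dbstore2001.
mysql -e "SHOW GRANTS FOR 'dump'@'localhost'"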

The latest run dumps flowdb and the others too, correctly:

root@dbstore2001:/srv/backups/x1.20171017220041$ ls -la | grep flowdb
-rw-r--r--  1 dump dump        249 Oct 18 14:07 flowdb.flow_definition-schema.sql.gz
-rw-r--r--  1 dump dump        553 Oct 18 13:45 flowdb.flow_definition.sql.gz
-rw-r--r--  1 dump dump        319 Oct 18 14:07 flowdb.flow_ext_ref-schema.sql.gz
-rw-r--r--  1 dump dump    5563578 Oct 18 13:45 flowdb.flow_ext_ref.sql.gz
-rw-r--r--  1 dump dump        442 Oct 18 14:07 flowdb.flow_revision-schema.sql.gz
-rw-r--r--  1 dump dump   17656454 Oct 18 13:46 flowdb.flow_revision.sql.gz
-rw-r--r--  1 dump dump        301 Oct 18 14:07 flowdb.flow_subscription-schema.sql.gz
-rw-r--r--  1 dump dump        212 Oct 18 14:07 flowdb.flow_topic_list-schema.sql.gz
-rw-r--r--  1 dump dump    1126757 Oct 18 13:45 flowdb.flow_topic_list.sql.gz
-rw-r--r--  1 dump dump        250 Oct 18 14:07 flowdb.flow_tree_node-schema.sql.gz
-rw-r--r--  1 dump dump    6540385 Oct 18 13:45 flowdb.flow_tree_node.sql.gz
-rw-r--r--  1 dump dump        273 Oct 18 14:07 flowdb.flow_tree_revision-schema.sql.gz
-rw-r--r--  1 dump dump    8491048 Oct 18 13:46 flowdb.flow_tree_revision.sql.gz
-rw-r--r--  1 dump dump        322 Oct 18 14:07 flowdb.flow_wiki_ref-schema.sql.gz
-rw-r--r--  1 dump dump    7747542 Oct 18 13:47 flowdb.flow_wiki_ref.sql.gz
-rw-r--r--  1 dump dump        363 Oct 18 14:07 flowdb.flow_workflow-schema.sql.gz
-rw-r--r--  1 dump dump    3395872 Oct 18 13:45 flowdb.flow_workflow.sql.gz
-rw-r--r--  1 dump dump         87 Oct 18 13:33 flowdb-schema-create.sql.gz
root@dbstore2001:/srv/backups/x1.20171017220041$ ls -la | grep wikishared
-rw-r--r--  1 dump dump        282 Oct 18 14:07 wikishared.bounce_records-schema.sql.gz
-rw-r--r--  1 dump dump     165060 Oct 18 14:05 wikishared.bounce_records.sql.gz
-rw-r--r--  1 dump dump        319 Oct 18 14:07 wikishared.cx_corpora-schema.sql.gz
-rw-r--r--  1 dump dump 1789353984 Oct 18 14:36 wikishared.cx_corpora.sql.gz
-rw-r--r--  1 dump dump        313 Oct 18 14:07 wikishared.cx_lists-schema.sql.gz
-rw-r--r--  1 dump dump      46449 Oct 18 14:05 wikishared.cx_lists.sql.gz
-rw-r--r--  1 dump dump        241 Oct 18 14:07 wikishared.cx_suggestions-schema.sql.gz
-rw-r--r--  1 dump dump    4404354 Oct 18 14:06 wikishared.cx_suggestions.sql.gz
-rw-r--r--  1 dump dump        427 Oct 18 14:07 wikishared.cx_translations-schema.sql.gz
-rw-r--r--  1 dump dump   32543822 Oct 18 14:09 wikishared.cx_translations.sql.gz
-rw-r--r--  1 dump dump        204 Oct 18 14:07 wikishared.cx_translators-schema.sql.gz
-rw-r--r--  1 dump dump    1327959 Oct 18 14:06 wikishared.cx_translators.sql.gz
-rw-r--r--  1 dump dump        293 Oct 18 14:07 wikishared.echo_unread_wikis-schema.sql.gz
-rw-r--r--  1 dump dump  202307715 Oct 18 14:11 wikishared.echo_unread_wikis.sql.gz
-rw-r--r--  1 dump dump         91 Oct 18 13:34 wikishared-schema-create.sql.gz
-rw-r--r--  1 dump dump        235 Oct 18 14:07 wikishared.urlshortcodes-schema.sql.gz
-rw-r--r--  1 dump dump        198 Oct 18 14:06 wikishared.urlshortcodes.sql.gz

To investigate if we need to convert some of these away from uncompressed InnoDB on dbstore1001:

+---------------------------------------+----------+---------+---------+------------+---------+
| CONCAT(table_schema, '.', table_name) | rows     | DATA    | idx     | total_size | idxfrac |
+---------------------------------------+----------+---------+---------+------------+---------+
| wikidatawiki.wb_terms                 | 1264.24M | 128.83G | 264.32G | 393.15G    |    2.05 |
| commonswiki.recentchanges             | 630.76M  | 332.60G | 5.31G   | 337.92G    |    0.02 |
| commonswiki.templatelinks             | 1661.23M | 37.78G  | 199.39G | 237.17G    |    5.28 |
| wikidatawiki.logging                  | 580.59M  | 83.08G  | 147.38G | 230.46G    |    1.77 |
| enwiki.revision                       | 745.79M  | 95.49G  | 91.96G  | 187.44G    |    0.96 |
| cebwiki.templatelinks                 | 1317.61M | 59.47G  | 107.94G | 167.41G    |    1.82 |
| wikidatawiki.revision                 | 576.88M  | 84.11G  | 69.03G  | 153.13G    |    0.82 |
| enwiki.pagelinks                      | 1100.59M | 33.01G  | 119.57G | 152.58G    |    3.62 |
| metawiki.pagelinks                    | 905.48M  | 19.98G  | 105.98G | 125.96G    |    5.30 |
| ruwiki.recentchanges                  | 32.86M   | 79.13G  | 32.11G  | 111.24G    |    0.41 |
| commonswiki.logging                   | 236.27M  | 46.74G  | 63.68G  | 110.42G    |    1.36 |
| commonswiki.image                     | 42.81M   | 89.31G  | 14.77G  | 104.08G    |    0.17 |
| commonswiki.externallinks             | 283.16M  | 44.62G  | 58.71G  | 103.33G    |    1.32 |
| commonswiki.categorylinks             | 280.26M  | 35.03G  | 66.13G  | 101.16G    |    1.89 |
| commonswiki.globalimagelinks          | 350.91M  | 31.79G  | 63.48G  | 95.27G     |    2.00 |
| frwiki.recentchanges                  | 162.60M  | 88.57G  | 3.47G   | 92.03G     |    0.04 |
| mysql.gtid_slave_pos                  | 0.00M    | 84.98G  | 0.00G   | 84.98G     |    0.00 |
| enwiki.templatelinks                  | 694.31M  | 17.94G  | 64.52G  | 82.46G     |    3.60 |
| commonswiki.pagelinks                 | 421.13M  | 12.65G  | 68.01G  | 80.66G     |    5.38 |
| commonswiki.revision                  | 247.39M  | 41.15G  | 30.67G  | 71.82G     |    0.75 |
+---------------------------------------+----------+---------+---------+------------+---------+
20 rows in set (1 hour 5 min 38.49 sec)
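
For reference, a report like the one above can be produced from information_schema with a query along these lines (an approximation; not necessarily the exact query that was run):

mysql --table <<'SQL'
SELECT CONCAT(table_schema, '.', table_name),
       CONCAT(ROUND(table_rows / 1000000, 2), 'M') AS `rows`,
       CONCAT(ROUND(data_length / POW(1024, 3), 2), 'G') AS DATA,
       CONCAT(ROUND(index_length / POW(1024, 3), 2), 'G') AS idx,
       CONCAT(ROUND((data_length + index_length) / POW(1024, 3), 2), 'G') AS total_size,
       ROUND(index_length / data_length, 2) AS idxfrac
FROM information_schema.TABLES
ORDER BY data_length + index_length DESC
LIMIT 20;
SQL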

On screen at dbstore1001 CC @Marostegui :

root@dbstore1001[cebwiki]> ALTER TABLE templatelinks ENGINE=TokuDB;

Probably commonswiki.recentchanges and commonswiki.templatelinks will need an optimize, but I am not doing that for now.

Mentioned in SAL (#wikimedia-operations) [2017-10-26T17:09:06Z] <marostegui> Optimize ruwiki.recentchanges on dbstore1001 - T162789

Mentioned in SAL (#wikimedia-operations) [2017-10-27T05:55:03Z] <marostegui> dbstore1001: convert itwiki.pagelinks to TokuDB - T162789

I have optimized ruwiki.recentchanges on dbstore1001.

itwiki.pagelinks has been converted to TokuDB.

Mentioned in SAL (#wikimedia-operations) [2017-10-27T08:56:21Z] <marostegui> Optimize table commonswiki.recentchanges on dbstore1001 - T162789

commonswiki.recentchanges has been optimized on dbstore1001.

Mentioned in SAL (#wikimedia-operations) [2017-10-27T12:10:12Z] <marostegui> Optimize commonswiki.templatelinks on dbstore1001 - T162789

And commonswiki.templatelinks also optimized on dbstore1001

jcrespo moved this task from Pending comment to In progress on the DBA board.

This is happening at the same time as T184696.

The previous ticket was closed; other things previously mentioned, like coordination and monitoring, will be handled in separate tickets.