
Data check es2020 after replication broke
Closed, Resolved (Public)


Follow-up to T327001

Event Timeline

To be done once the dumps process finishes (in approximately 24 hours), to avoid overloading the production DBs.

If you want me to do it, just assign it to me!

It's a single command plus waiting, so not much overhead. I'd prefer to keep it mostly blocked for now, until the backup finishes.

Sounds good, let me know if you need me :)

Marostegui moved this task from Done to In progress on the DBA board.

I "documented" how I did it, in case it is useful and as a sanity check:

# gather the current last id from a "good" db
root@cumin2002:~$ -BN -h es2021 information_schema -e "SELECT table_schema FROM tables WHERE table_name='blobs_cluster26'" | while read db; do echo -n "$db "; -BN -h es2021 $db -e "SELECT max(blob_id) FROM blobs_cluster26"; done > tables_to_check.txt

# check growth on the largest wikis (e.g. Wikidata grew 3.8 million records since 14th Jan, enwiki 1 million)
root@es2021:/srv/sqldata$ mysqlbinlog es2021-bin.008050 | less

mysql:root@localhost [wikidatawiki]> select count(*) FROM blobs_cluster26 WHERE blob_id >= 354204267;
+----------+
| count(*) |
+----------+
|  3800281 |
+----------+
1 row in set (1.668 sec)

mysql:root@localhost [enwiki]> select count(*) FROM blobs_cluster26 WHERE blob_id >= 105902404;
+----------+
| count(*) |
+----------+
|   999381 |
+----------+
1 row in set (0.424 sec)

# check a buffer on all tables (this is all rows on all wikis except on the largest ones) of the latest 4 million ids
root@cumin2002:~$ grep -v NULL tables_to_check.txt | while read db rows; do echo -e "\n== $db ==\n"; db-compare $db blobs_cluster26 blob_id es1021 es2021 es2020 --step=100 --from-value=$(($rows - 4000000)) || break ; done
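For anyone wanting to review the plan before touching production, the last step can be sketched as a dry run that only prints the db-compare commands. This is a rough sketch, not what was actually run: plan_checks is a hypothetical helper name, the input is assumed to be "db max_blob_id" lines as in tables_to_check.txt, and clamping the start id to 0 for small wikis is an assumption on my side (the command above passes the raw difference).

```shell
# Hypothetical dry-run helper: prints one db-compare command per wiki
# instead of executing it. Reads "<db> <max_blob_id>" lines from stdin.
# The body runs in a subshell, so its variables do not leak to the caller.
plan_checks() (
    window="${1:-4000000}"    # how many of the latest ids to re-check
    while read -r db max_id; do
        # Skip blank or incomplete lines.
        [ -z "$db" ] && continue
        [ -z "$max_id" ] && continue
        # Wikis with an empty blobs_cluster26 report NULL as max(blob_id).
        [ "$max_id" = "NULL" ] && continue
        from=$((max_id - window))
        # Assumption: start at 0 when the table has fewer ids than the window.
        [ "$from" -lt 0 ] && from=0
        echo "db-compare $db blobs_cluster26 blob_id es1021 es2021 es2020 --step=100 --from-value=$from"
    done
)
```

Usage would be something like `plan_checks 4000000 < tables_to_check.txt`, then checking the printed lines by eye before running them for real.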

jcrespo changed the task status from Open to In Progress. Jan 25 2023, 10:47 AM

All tables passed the check, which compared es2020 against eqiad and its codfw primary over the last 4 million rows of every table.