Page MenuHomePhabricator

Provision with data the new labsdb servers and provide replica service with at least 1 shard from a sanitized copy from production
Closed, ResolvedPublic

Event Timeline

To actually have users connect to new labsdb servers we're going to need
views created by maintain-replicas... Grants will need to be dealt with too.

jcrespo moved this task from Triage to Backlog on the DBA board.Nov 17 2016, 3:20 PM

Change 324386 had a related patch set uploaded (by Jcrespo):
mariadb: change check_private_data to print DROP statements

https://gerrit.wikimedia.org/r/324386

Change 324386 merged by Jcrespo:
mariadb: change check_private_data to print DROP statements

https://gerrit.wikimedia.org/r/324386

Change 324386 merged by Jcrespo:
mariadb: change check_private_data to print DROP statements
https://gerrit.wikimedia.org/r/324386

As discussed, can we get the DROP TABLE to be DROP TABLE IF EXISTS in order not to break replication in the future if for whatever reason the table do not exist already on the slaves?

Some auditing.

I have checked S3 in db1069 and db1095.

Database-wise a quick diff reveals that the following databases must be deleted from db1095 as they are present there but not in db1069:

This is a diff of mysql -e "show databases" between db1095 (single MySQL instance running S1 and S3) and db1069 (connecting to the S3 instance) removing enwikidatabase from the output.
So the following 76 databases ARE present in db1095 but not in db1069 so it should be removed probably

root@db1095:/home/marostegui# diff -u db1069_databases.txt db1095_databases.txt  | grep "^+"
+++ db1095_databases.txt	2016-11-30 09:33:32.528182772 +0000
+affcomwiki
+arbcom_dewiki
+arbcom_fiwiki
+arbcom_nlwiki
+auditcomwiki
+bawiktionary
+blocker
+boardgovcomwiki
+boards
+boardvote
+boardvote2005
+boardvote2006
+boardvote2007_test
+boardvotetest
+boardwiki
+chairwiki
+chapcomwiki
+checkuserwiki
+chwikimedia
+closed_zh_twwiki
+collabwiki
+comcomwiki
+de_labswikimedia
+defoundation
+devwikiinternal
+dkwiki
+dkwikibooks
+dkwiktionary
+ecwikimedia
+en_labswikimedia
+execwiki
+fdcwiki
+flaggedrevs_labswikimedia
+grantswiki
+iegcomwiki
+ilwikimedia
+internalwiki
+jamestemp
+katesdb
+l10nwiki
+langcomwiki
+legalteamwiki
+liquidthreads_labswikimedia
+movementroleswiki
+noboard_chapterswikimedia
+noboardwiki
+oai
+officewiki
+otrs_wikiwiki
+projectcomwiki
+ptwikimedia
+readerfeedback_labswikimedia
+rel13testwiki
+ru_sibwiki
+searchcomwiki
+sep11wiki
+spcomwiki
+steward
+stewardwiki
+strategyappswiki
+tlhwiki
+tlhwiktionary
+tokiponawiki
+tokiponawikibooks
+tokiponawikiquote
+tokiponawiktionary
+transitionteamwiki
+vewikimedia
+webshop
+wikiconfig
+wikimania
+wikimaniateamwiki
+zerowiki
+zh_cnwiki
+zh_twwiki

The check_private_data script suggest removing 79 databases and the diff between the ones I reported above and the ones that the script suggests removing is:

root@db1095:/home/marostegui# diff -u script_suggestions.txt normalized_diff_db1069_and_db1095.txt  | grep "^-"
--- script_suggestions.txt	2016-11-30 09:37:57.035624465 +0000
-arbcom_enwiki
-ombudsmenwiki
-test
-wg_enwiki

So looks like the script is catching all the correct data plus the new databases that for some reason are in s3 and shouldn't be in the new sanitarium (reminder: db1095 was cloned from an existing s3 slave in eqiad)

Change 324390 had a related patch set uploaded (by Jcrespo):
mariadb: fix bugs with check_private_data regarding DROP and NULL

https://gerrit.wikimedia.org/r/324390

Change 324390 merged by Jcrespo:
mariadb: fix bugs with check_private_data regarding DROP and NULL

https://gerrit.wikimedia.org/r/324390

Change 324402 had a related patch set uploaded (by Jcrespo):
mariadb: More bugfixes for check_private_data.py

https://gerrit.wikimedia.org/r/324402

Change 324402 merged by Jcrespo:
mariadb: More bugfixes for check_private_data.py

https://gerrit.wikimedia.org/r/324402

Mentioned in SAL (#wikimedia-operations) [2016-11-30T11:16:20Z] <marostegui> Stop replication s3 - db1095 - maintenance - T147052

I have deleted all the databases that the script suggest on db1095.
The only one I have left is test as it has some writes and it also exists on db1069.

I have started replication again. There is a bkup of the dbs deleted at: db1095:/srv/tmp/check_private_data/bkup just in case repl breaks.

A second run of the script now shows:

root@db1095:/srv/tmp/check_private_data/bkup# check_private_data.py | grep DATABASE
DROP DATABASE IF EXISTS `test`

Change 324412 had a related patch set uploaded (by Jcrespo):
mariadb: Add semicolon after each SQL query output (private data check)

https://gerrit.wikimedia.org/r/324412

Change 324412 merged by Jcrespo:
mariadb: Add semicolon after each SQL query output (private data check)

https://gerrit.wikimedia.org/r/324412

I have been taking a look at the tables the script suggest to drop from each DB. It suggest a bunch of tables, but they are pretty much the same:

root@db1095:/home/marostegui# cat check_private_data_output.txt  | grep TABLE | awk -F "." '{print $2}' | tr -d "\`" | sort | uniq -c
      6 accountaudit_login
      1 arbcom1_vote
    725 blob_orphans
    725 blob_tracking
      4 bot_passwords
    731 bv2009_edits
    807 click_tracking
    831 cu_changes
    831 cu_log
    568 cur
    520 echo_email_batch
    520 echo_event
    520 echo_notification
      2 echo_target_page
    778 edit_page_tracking
      1 exarchive
      1 exrevision
    831 filejournal
    822 hidden
      1 image_old
    831 job
    568 linkscc
      1 logging_old
    831 log_search
      1 long_run_profiling
    779 moodbar_feedback
    782 moodbar_feedback_response
    826 msg_resource
      6 oathauth_users
    831 objectcache
      1 old_growth
      1 oldimage_old
    789 prefstats
    746 prefswitch_survey
    798 pr_index
      3 profiling
    831 querycache
    831 querycache_info
    831 querycachetwo
    831 securepoll_cookie_match
    831 securepoll_elections
    831 securepoll_entity
    831 securepoll_lists
    831 securepoll_msgs
    831 securepoll_options
    831 securepoll_properties
    831 securepoll_questions
    831 securepoll_strike
    831 securepoll_voters
    831 securepoll_votes
    831 spoofuser
    831 text
    822 titlekey
    831 transcache
    831 uploadstash
    831 user_newtalk
      1 vote_log
    831 watchlist

So getting the list of unique tables, we can see that all of them appear on the replication filters:

root@db1095:/home/marostegui# for i in `cat check_private_data_output.txt  | grep TABLE | awk -F "." '{print $2}' | tr -d '\`' | sort | uniq`; do echo $i; grep -iwE "%.$i" /etc/my.cnf | grep replicate;done
accountaudit_login
replicate-wild-ignore-table = %.accountaudit_login
arbcom1_vote
replicate-wild-ignore-table = %.arbcom1_vote
blob_orphans
replicate-wild-ignore-table = %.blob_orphans
blob_tracking
replicate-wild-ignore-table = %.blob_tracking
bot_passwords
replicate-wild-ignore-table = %.bot_passwords
bv2009_edits
replicate-wild-ignore-table = %.bv2009_edits
click_tracking
replicate-wild-ignore-table = %.click_tracking
cu_changes
replicate-wild-ignore-table = %.cu_changes
cu_log
replicate-wild-ignore-table = %.cu_log
cur
replicate-wild-ignore-table = %.cur
echo_email_batch
replicate-wild-ignore-table = %.echo_email_batch
echo_event
replicate-wild-ignore-table = %.echo_event
echo_notification
replicate-wild-ignore-table = %.echo_notification
echo_target_page
replicate-wild-ignore-table = %.echo_target_page
edit_page_tracking
replicate-wild-ignore-table = %.edit_page_tracking
exarchive
replicate-wild-ignore-table = %.exarchive
exrevision
replicate-wild-ignore-table = %.exrevision
filejournal
replicate-wild-ignore-table = %.filejournal
hidden
replicate-wild-ignore-table = %.hidden
image_old
replicate-wild-ignore-table = %.image_old
job
replicate-wild-ignore-table = %.job
linkscc
replicate-wild-ignore-table = %.linkscc
logging_old
replicate-wild-ignore-table = %.logging_old
log_search
replicate-wild-ignore-table = %.log_search
long_run_profiling
replicate-wild-ignore-table = %.long_run_profiling
moodbar_feedback
replicate-wild-ignore-table = %.moodbar_feedback
moodbar_feedback_response
replicate-wild-ignore-table = %.moodbar_feedback_response
msg_resource
replicate-wild-ignore-table = %.msg_resource
oathauth_users
replicate-wild-ignore-table = %.oathauth_users
objectcache
replicate-wild-ignore-table = %.objectcache
old_growth
replicate-wild-ignore-table = %.old_growth
oldimage_old
replicate-wild-ignore-table = %.oldimage_old
prefstats
replicate-wild-ignore-table = %.prefstats
prefswitch_survey
replicate-wild-ignore-table = %.prefswitch_survey
pr_index
replicate-wild-ignore-table = %.pr_index
profiling
replicate-wild-ignore-table = %.profiling
querycache
replicate-wild-ignore-table = %.querycache
querycache_info
replicate-wild-ignore-table = %.querycache_info
querycachetwo
replicate-wild-ignore-table = %.querycachetwo
securepoll_cookie_match
replicate-wild-ignore-table = %.securepoll_cookie_match
securepoll_elections
replicate-wild-ignore-table = %.securepoll_elections
securepoll_entity
replicate-wild-ignore-table = %.securepoll_entity
securepoll_lists
replicate-wild-ignore-table = %.securepoll_lists
securepoll_msgs
replicate-wild-ignore-table = %.securepoll_msgs
securepoll_options
replicate-wild-ignore-table = %.securepoll_options
securepoll_properties
replicate-wild-ignore-table = %.securepoll_properties
securepoll_questions
replicate-wild-ignore-table = %.securepoll_questions
securepoll_strike
replicate-wild-ignore-table = %.securepoll_strike
securepoll_voters
replicate-wild-ignore-table = %.securepoll_voters
securepoll_votes
replicate-wild-ignore-table = %.securepoll_votes
spoofuser
replicate-wild-ignore-table = %.spoofuser
text
replicate-wild-ignore-table = %.text
titlekey
replicate-wild-ignore-table = %.titlekey
transcache
replicate-wild-ignore-table = %.transcache
uploadstash
replicate-wild-ignore-table = %.uploadstash
user_newtalk
replicate-wild-ignore-table = %.user_newtalk
vote_log
replicate-wild-ignore-table = %.vote_log
watchlist
replicate-wild-ignore-table = %.watchlist

So it looks like the script is doing the right thing, or at least isn't suggesting to delete anything that doesn't appear on the replication filters.

I have dropped all the tables suggested by the script and started s3 replication again.

Change 328168 had a related patch set uploaded (by Jcrespo):
labsdb: Enable view creation on labsdb1009/10/11

https://gerrit.wikimedia.org/r/328168

Change 328168 merged by Jcrespo:
labsdb: Enable view creation on labsdb1009/10/11

https://gerrit.wikimedia.org/r/328168

chasemp closed this task as Resolved.Dec 21 2016, 4:40 PM
chasemp claimed this task.

enwiki_p is now functionally available on the new labsdb servers