Details
Event Timeline
To actually have users connect to new labsdb servers we're going to need
views created by maintain-replicas... Grants will need to be dealt with too.
Change 324386 had a related patch set uploaded (by Jcrespo):
mariadb: change check_private_data to print DROP statements
Change 324386 merged by Jcrespo:
mariadb: change check_private_data to print DROP statements
As discussed, can we get the DROP TABLE to be DROP TABLE IF EXISTS in order not to break replication in the future if for whatever reason the table do not exist already on the slaves?
Some auditing.
I have checked S3 in db1069 and db1095.
Database-wise a quick diff reveals that the following databases must be deleted from db1095 as they are present there but not in db1069:
This is a diff of mysql -e "show databases" between db1095 (single MySQL instance running S1 and S3) and db1069 (connecting to the S3 instance) removing enwikidatabase from the output.
So the following 76 databases ARE present in db1095 but not in db1069 so it should be removed probably
root@db1095:/home/marostegui# diff -u db1069_databases.txt db1095_databases.txt | grep "^+" +++ db1095_databases.txt 2016-11-30 09:33:32.528182772 +0000 +affcomwiki +arbcom_dewiki +arbcom_fiwiki +arbcom_nlwiki +auditcomwiki +bawiktionary +blocker +boardgovcomwiki +boards +boardvote +boardvote2005 +boardvote2006 +boardvote2007_test +boardvotetest +boardwiki +chairwiki +chapcomwiki +checkuserwiki +chwikimedia +closed_zh_twwiki +collabwiki +comcomwiki +de_labswikimedia +defoundation +devwikiinternal +dkwiki +dkwikibooks +dkwiktionary +ecwikimedia +en_labswikimedia +execwiki +fdcwiki +flaggedrevs_labswikimedia +grantswiki +iegcomwiki +ilwikimedia +internalwiki +jamestemp +katesdb +l10nwiki +langcomwiki +legalteamwiki +liquidthreads_labswikimedia +movementroleswiki +noboard_chapterswikimedia +noboardwiki +oai +officewiki +otrs_wikiwiki +projectcomwiki +ptwikimedia +readerfeedback_labswikimedia +rel13testwiki +ru_sibwiki +searchcomwiki +sep11wiki +spcomwiki +steward +stewardwiki +strategyappswiki +tlhwiki +tlhwiktionary +tokiponawiki +tokiponawikibooks +tokiponawikiquote +tokiponawiktionary +transitionteamwiki +vewikimedia +webshop +wikiconfig +wikimania +wikimaniateamwiki +zerowiki +zh_cnwiki +zh_twwiki
The check_private_data script suggest removing 79 databases and the diff between the ones I reported above and the ones that the script suggests removing is:
root@db1095:/home/marostegui# diff -u script_suggestions.txt normalized_diff_db1069_and_db1095.txt | grep "^-" --- script_suggestions.txt 2016-11-30 09:37:57.035624465 +0000 -arbcom_enwiki -ombudsmenwiki -test -wg_enwiki
So looks like the script is catching all the correct data plus the new databases that for some reason are in s3 and shouldn't be in the new sanitarium (reminder: db1095 was cloned from an existing s3 slave in eqiad)
Change 324390 had a related patch set uploaded (by Jcrespo):
mariadb: fix bugs with check_private_data regarding DROP and NULL
Change 324390 merged by Jcrespo:
mariadb: fix bugs with check_private_data regarding DROP and NULL
Change 324402 had a related patch set uploaded (by Jcrespo):
mariadb: More bugfixes for check_private_data.py
Mentioned in SAL (#wikimedia-operations) [2016-11-30T11:16:20Z] <marostegui> Stop replication s3 - db1095 - maintenance - T147052
I have deleted all the databases that the script suggest on db1095.
The only one I have left is test as it has some writes and it also exists on db1069.
I have started replication again. There is a bkup of the dbs deleted at: db1095:/srv/tmp/check_private_data/bkup just in case repl breaks.
A second run of the script now shows:
root@db1095:/srv/tmp/check_private_data/bkup# check_private_data.py | grep DATABASE DROP DATABASE IF EXISTS `test`
Change 324412 had a related patch set uploaded (by Jcrespo):
mariadb: Add semicolon after each SQL query output (private data check)
Change 324412 merged by Jcrespo:
mariadb: Add semicolon after each SQL query output (private data check)
I have been taking a look at the tables the script suggest to drop from each DB. It suggest a bunch of tables, but they are pretty much the same:
root@db1095:/home/marostegui# cat check_private_data_output.txt | grep TABLE | awk -F "." '{print $2}' | tr -d "\`" | sort | uniq -c
6 accountaudit_login
1 arbcom1_vote
725 blob_orphans
725 blob_tracking
4 bot_passwords
731 bv2009_edits
807 click_tracking
831 cu_changes
831 cu_log
568 cur
520 echo_email_batch
520 echo_event
520 echo_notification
2 echo_target_page
778 edit_page_tracking
1 exarchive
1 exrevision
831 filejournal
822 hidden
1 image_old
831 job
568 linkscc
1 logging_old
831 log_search
1 long_run_profiling
779 moodbar_feedback
782 moodbar_feedback_response
826 msg_resource
6 oathauth_users
831 objectcache
1 old_growth
1 oldimage_old
789 prefstats
746 prefswitch_survey
798 pr_index
3 profiling
831 querycache
831 querycache_info
831 querycachetwo
831 securepoll_cookie_match
831 securepoll_elections
831 securepoll_entity
831 securepoll_lists
831 securepoll_msgs
831 securepoll_options
831 securepoll_properties
831 securepoll_questions
831 securepoll_strike
831 securepoll_voters
831 securepoll_votes
831 spoofuser
831 text
822 titlekey
831 transcache
831 uploadstash
831 user_newtalk
1 vote_log
831 watchlistSo getting the list of unique tables, we can see that all of them appear on the replication filters:
root@db1095:/home/marostegui# for i in `cat check_private_data_output.txt | grep TABLE | awk -F "." '{print $2}' | tr -d '\`' | sort | uniq`; do echo $i; grep -iwE "%.$i" /etc/my.cnf | grep replicate;done
accountaudit_login
replicate-wild-ignore-table = %.accountaudit_login
arbcom1_vote
replicate-wild-ignore-table = %.arbcom1_vote
blob_orphans
replicate-wild-ignore-table = %.blob_orphans
blob_tracking
replicate-wild-ignore-table = %.blob_tracking
bot_passwords
replicate-wild-ignore-table = %.bot_passwords
bv2009_edits
replicate-wild-ignore-table = %.bv2009_edits
click_tracking
replicate-wild-ignore-table = %.click_tracking
cu_changes
replicate-wild-ignore-table = %.cu_changes
cu_log
replicate-wild-ignore-table = %.cu_log
cur
replicate-wild-ignore-table = %.cur
echo_email_batch
replicate-wild-ignore-table = %.echo_email_batch
echo_event
replicate-wild-ignore-table = %.echo_event
echo_notification
replicate-wild-ignore-table = %.echo_notification
echo_target_page
replicate-wild-ignore-table = %.echo_target_page
edit_page_tracking
replicate-wild-ignore-table = %.edit_page_tracking
exarchive
replicate-wild-ignore-table = %.exarchive
exrevision
replicate-wild-ignore-table = %.exrevision
filejournal
replicate-wild-ignore-table = %.filejournal
hidden
replicate-wild-ignore-table = %.hidden
image_old
replicate-wild-ignore-table = %.image_old
job
replicate-wild-ignore-table = %.job
linkscc
replicate-wild-ignore-table = %.linkscc
logging_old
replicate-wild-ignore-table = %.logging_old
log_search
replicate-wild-ignore-table = %.log_search
long_run_profiling
replicate-wild-ignore-table = %.long_run_profiling
moodbar_feedback
replicate-wild-ignore-table = %.moodbar_feedback
moodbar_feedback_response
replicate-wild-ignore-table = %.moodbar_feedback_response
msg_resource
replicate-wild-ignore-table = %.msg_resource
oathauth_users
replicate-wild-ignore-table = %.oathauth_users
objectcache
replicate-wild-ignore-table = %.objectcache
old_growth
replicate-wild-ignore-table = %.old_growth
oldimage_old
replicate-wild-ignore-table = %.oldimage_old
prefstats
replicate-wild-ignore-table = %.prefstats
prefswitch_survey
replicate-wild-ignore-table = %.prefswitch_survey
pr_index
replicate-wild-ignore-table = %.pr_index
profiling
replicate-wild-ignore-table = %.profiling
querycache
replicate-wild-ignore-table = %.querycache
querycache_info
replicate-wild-ignore-table = %.querycache_info
querycachetwo
replicate-wild-ignore-table = %.querycachetwo
securepoll_cookie_match
replicate-wild-ignore-table = %.securepoll_cookie_match
securepoll_elections
replicate-wild-ignore-table = %.securepoll_elections
securepoll_entity
replicate-wild-ignore-table = %.securepoll_entity
securepoll_lists
replicate-wild-ignore-table = %.securepoll_lists
securepoll_msgs
replicate-wild-ignore-table = %.securepoll_msgs
securepoll_options
replicate-wild-ignore-table = %.securepoll_options
securepoll_properties
replicate-wild-ignore-table = %.securepoll_properties
securepoll_questions
replicate-wild-ignore-table = %.securepoll_questions
securepoll_strike
replicate-wild-ignore-table = %.securepoll_strike
securepoll_voters
replicate-wild-ignore-table = %.securepoll_voters
securepoll_votes
replicate-wild-ignore-table = %.securepoll_votes
spoofuser
replicate-wild-ignore-table = %.spoofuser
text
replicate-wild-ignore-table = %.text
titlekey
replicate-wild-ignore-table = %.titlekey
transcache
replicate-wild-ignore-table = %.transcache
uploadstash
replicate-wild-ignore-table = %.uploadstash
user_newtalk
replicate-wild-ignore-table = %.user_newtalk
vote_log
replicate-wild-ignore-table = %.vote_log
watchlist
replicate-wild-ignore-table = %.watchlistSo it looks like the script is doing the right thing, or at least isn't suggesting to delete anything that doesn't appear on the replication filters.
I have dropped all the tables suggested by the script and started s3 replication again.
Change 328168 had a related patch set uploaded (by Jcrespo):
labsdb: Enable view creation on labsdb1009/10/11