- find proved far too slow on a fresh rsync of the repos data. We used chown -R phd:phd instead, accepting that everything is now phd:phd rather than a mix of phd:phd and phd:www-data
[x] check on phab1004 if any files under /srv/repos owned by GID 498 (aphlict). if so, give them to group phd
- find proved far too slow on a fresh rsync of the repos data. We used chown -R phd:phd instead, accepting that everything is now phd:phd rather than a mix of phd:phd and phd:www-data
[x] check on phab1004 if any files under /srv/repos are owned by a user that is NOT phd
[phab1004:/] find /srv/repos ! -user phd
[x] expect this to show the PHEX repo but nothing else. decide what to do with PHEX (root-owned)
- Decision here: only some files under here were root-owned, which was most likely an artifact of a manual operation on phab1001
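- Sketch for the ownership checks above: a single pass that counts every owner:group pair under a directory, so non-phd owners and a leftover numeric GID (such as 498) show up in one listing. The function name ownership_summary is hypothetical and the sketch assumes GNU find (for -printf). Since find was too slow on the full repos data, this is probably only practical on a subtree.

```shell
# Hypothetical helper: count owner:group pairs under a directory in one pass.
# Assumes GNU find; %u and %g fall back to the numeric ID when no name exists.
ownership_summary() {
    find "$1" -printf '%u:%g\n' | sort | uniq -c | sort -rn
}

# Example call (path taken from the checklist):
# ownership_summary /srv/repos
```

Each output line is a count followed by user:group, most common pair first.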
[x] output the full tree of /srv/repos and compare number of directories / files between both servers
[phab1001:/] tree -upfg /srv/repos > /root/repos-tree (this file will be just under 500MB of text)
[phab1001:/] tail /root/repos-tree
[phab1004:/] tree -upfg /srv/repos > /root/repos-tree
[phab1004:/] tail /root/repos-tree
[] optional: if still not satisfied, copy the result file from the old server to the new server (scp -3 ...) and run an actual diff between them
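- Sketch for the comparison above: tree ends its output with a summary line of the form "N directories, M files", so comparing the last line of both result files is a cheap first check before a full diff. The function name compare_tree_summaries and the two file arguments are hypothetical; it assumes both result files are available on the same host.

```shell
# Hypothetical helper: compare the final summary line of two tree output files.
# Returns 0 when the directory/file counts match, 1 otherwise.
compare_tree_summaries() {
    old_summary=$(tail -n 1 "$1")
    new_summary=$(tail -n 1 "$2")
    if [ "$old_summary" = "$new_summary" ]; then
        echo "match: $old_summary"
    else
        echo "mismatch: old='$old_summary' new='$new_summary'"
        return 1
    fi
}

# Example call (file names are placeholders):
# compare_tree_summaries repos-tree.old repos-tree.new
```

Matching counts do not prove the trees are identical; a real diff of the two files is still the stronger check.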
[x] set mysql ports for master and slave, specifically for eqiad (currently this happens in codfw but not in common hiera)
merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/859145, then run-puppet-agent and check what happens on phab1004
[x] merge re-revert of the phabricator server name in common Hiera, run puppet, watch the changes on phab1004 and phab2002
merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/859628 and run-puppet-agent
[x] wait a couple minutes and check phd is still running (how long?)
(if killed by puppet for any reason, it'll be every puppet run...)
[x] merge re-revert of the DNS/SPF change
https://gerrit.wikimedia.org/r/c/operations/dns/+/860032 and run "authdns-update" on ns0.wikimedia.org, which syncs to the other DNS servers
[x] wait about a minute and optionally use "dig phabricator.discovery.wmnet @ns0.wikimedia.org" to see it change from an alias for phab1001 to an alias for phab1004
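- Sketch for the DNS check above: instead of re-running dig by hand, poll until the answer mentions the new host. The function name wait_for_alias, the retry count, and the delay are hypothetical. The lookup command is passed in as arguments, and matching on the substring "phab1004" is an assumption about what the answer contains.

```shell
# Hypothetical helper: re-run a lookup command until its output mentions
# the expected host name, or give up after a number of attempts.
# Usage: wait_for_alias EXPECTED TRIES DELAY_SECONDS CMD [ARGS...]
wait_for_alias() {
    expected="$1"; tries="$2"; delay="$3"
    shift 3
    attempt=1
    answer=""
    while [ "$attempt" -le "$tries" ]; do
        # A failing lookup is treated as "no answer yet", not as a fatal error.
        answer=$("$@" 2>/dev/null) || true
        case "$answer" in
            *"$expected"*)
                echo "ok after $attempt attempt(s): $answer"
                return 0
                ;;
        esac
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "still not pointing at $expected: $answer" >&2
    return 1
}

# Example call (12 tries, 10 seconds apart, are arbitrary choices):
# wait_for_alias phab1004 12 10 dig +short phabricator.discovery.wmnet @ns0.wikimedia.org
```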
[x] informational: dumps don't need to switch, they are already on phab1004, this has happened before
[x] informational: stats emails don't need to switch, they are already on phab1004, this has happened before
testing
[x] check https://phabricator.wikimedia.org works, watch out for yellow exclamation marks / warnings for admins
[x] test aphlict works by moving something on a workboard while someone else watches
[x] test if a ticket update shows up on IRC
[x] test if email from a ticket update arrives (by a user who has email notifications)
[x] check phabricator logs for exceptions (that aren't usual noise)
(insert command / paths)
[x] test if CI works / "recheck" on a change in Gerrit
finalizing
[] merge patch to disable phd (and apache and php-fpm) on phab1001?
[x] verify proper monitoring downtime on phab1001
[x] reply to list emails and Slack that the migration completed successfully, link to the ticket in case they see any issues
[x] publish fingerprints on wikitech page
after migration is done and grace period (how long?):
[x] double check which settings can move to common Hiera, remove setting from hosts files in Hiera
[] merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/824412 and check puppet run
[] remove phab1001 from mysql grants, coordinate with DBA on merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/858419
[x] create decom ticket for phab1001 - https://phabricator.wikimedia.org/T323418
[x] remove production puppet role from phab1001, merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/824804
[x] run decom cookbook from a cumin host on phab1001