Task for the actual migration; needs coordination with Analytics/Ezachte.
Description
Details
Status | Assigned | Task
---|---|---
Resolved | bd808 | T166402 Program 7 Outcome 3: data services
Resolved | ArielGlenn | T182540 get datset1001, ms1001 ready for decommission
Resolved | madhuvishy | T168486 Migrate customer-facing Dumps endpoints to Cloud Services
Resolved | madhuvishy | T188644 Migrate the stat* mount from dataset1001 to labstore1006/7
Event Timeline
Change 420083 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] [WIP] statistics: Migrate dumps mount to labstore1006|7
Change 422892 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] statistics: Absent existing dumps mount at /mnt/data
Change 422896 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] statistics: Symlink /mnt/data to nfs share from active server
Change 420083 merged by Madhuvishy:
[operations/puppet@production] statistics: Mount dumps share from labstore1006|7 on stat1005|6
Change 423515 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] statistics: Add missed README for dumps nfs mount
Change 423515 merged by Madhuvishy:
[operations/puppet@production] statistics: Add missed README for dumps nfs mount
Change 422892 merged by Madhuvishy:
[operations/puppet@production] statistics: Absent existing dumps mount at /mnt/data
Change 422896 merged by Madhuvishy:
[operations/puppet@production] statistics: Symlink /mnt/data to nfs share from active server
Notes from migration plan:
- Announcements
  - [DONE] Last minute analytics mailing list update
- Silence monitoring -- need to check if there's anything set up on stat*
  - [DONE] Downtimed puppet run alerts for stat1005|6
- [DONE] Mount NFS shares from labstore1006 & 7 on stat1005|6 at /mnt/nfs/labstore1006-dumps & /mnt/nfs/labstore1007-dumps
  - Disable puppet on stat1005|6
  - Merge puppet patch - https://gerrit.wikimedia.org/r/#/c/420083/
  - Apply patch on stat* and test
- [NONE RUNNING] Kill any processes actively accessing /mnt/data
  - find with lsof +f -- /mnt/data
- [DONE] Absent NFS mount at /mnt/data (served from dataset1001)
  - Disable puppet on stat1005|6
  - Merge puppet patch - https://gerrit.wikimedia.org/r/#/c/422892/
  - Apply patch on stat* and test
  - Puppet refused to remove directory /mnt/data without --force. Manually ran -- madhuvishy@stat1005:/mnt$ sudo rm -r data
- [DONE] Set up symlink to /mnt/data from active NFS mount for labstore1007
  - Disable puppet on stat1005|6
  - Merge puppet patch - https://gerrit.wikimedia.org/r/#/c/422896/
  - Apply patch on stat* and test
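The "and test" part of the symlink step could be checked with something like the sketch below. It is only an illustration: `check_symlink` is a hypothetical helper, and the assumption that /mnt/data resolves directly to /mnt/nfs/labstore1007-dumps (rather than to a subdirectory of it) is not confirmed by the notes above.

```shell
# Hypothetical helper, not part of the puppet patches:
# succeed only if LINK is a symlink that resolves to the same place as TARGET.
check_symlink() {
    link=$1
    target=$2
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link"
        return 1
    fi
    # Resolve both sides so relative links and trailing slashes compare equal.
    resolved=$(readlink -f "$link") || return 1
    expected=$(readlink -f "$target") || return 1
    if [ "$resolved" = "$expected" ]; then
        echo "ok: $link -> $resolved"
    else
        echo "mismatch: $link -> $resolved (expected $expected)"
        return 1
    fi
}

# Assumed paths; adjust if the symlink points below the mount point.
# check_symlink /mnt/data /mnt/nfs/labstore1007-dumps
```

Run on each of stat1005|6 after the puppet run, a non-zero exit would mean /mnt/data is still a plain directory or points at the wrong share.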
Success Criteria:
- [SUCCESS] stat1005 & 6 can successfully read from /mnt/data
  - head -n 1 /mnt/data/xmldatadumps/public/liwiki/latest/liwiki-latest-pages-articles.xml.bz2-rss.xml
- Labstore1007 - load on the server is normal (monitor for at least a couple of hours) -- All good right now, though not much is going on
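The read check above can be wrapped so that it yields a pass/fail exit status instead of relying on someone eyeballing the output. This is a minimal sketch; `can_read_first_line` is a hypothetical name, and treating an empty first line as a failure is an assumption rather than something the plan specifies.

```shell
# Hypothetical helper: succeed if the first line of FILE can be read and is non-empty.
can_read_first_line() {
    file=$1
    # head fails (non-zero) if the file is missing or unreadable, e.g. a stale NFS handle.
    first=$(head -n 1 -- "$file" 2>/dev/null) || return 1
    [ -n "$first" ]
}

# Example with the file used in the success criteria:
# can_read_first_line /mnt/data/xmldatadumps/public/liwiki/latest/liwiki-latest-pages-articles.xml.bz2-rss.xml \
#     && echo "read OK" || echo "read FAILED"
```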
Post (if success):
- Announce all clear to mailing lists
- Clean up dataset1001 dumps mount code in statistics::dataset_mount -- Done with https://gerrit.wikimedia.org/r/#/c/422896/
- Remove the dumps NFS export from dataset1001 --
Rollback
- Kill any processes actively accessing /mnt/data
  - find with lsof +f -- /mnt/data
- [UNDO] Set up symlink to /mnt/data from active NFS mount for labstore1007
  - Disable puppet on stat1005|6
  - Revert puppet patch - https://gerrit.wikimedia.org/r/#/c/422896/
  - Apply patch on stat* and test
- [POSSIBLY UNDO] Mount NFS shares from labstore1006 & 7 on stat1005|6 at /mnt/nfs/labstore1006-dumps & /mnt/nfs/labstore1007-dumps
  - Disable puppet on stat1005|6
  - Revert puppet patch - https://gerrit.wikimedia.org/r/#/c/420083/
  - Apply patch on stat* and test
- [UNDO] Absent NFS mount at /mnt/data (served from dataset1001)
  - Disable puppet on stat1005|6
  - Revert puppet patch - https://gerrit.wikimedia.org/r/#/c/422892/
  - Apply patch on stat* and test
Change 423733 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] dumps: Remove stat1005|6 from nfs clients for dataset1001
Change 423733 merged by Madhuvishy:
[operations/puppet@production] dumps: Remove stat1005|6 from nfs clients for dataset1001