T133566 Reinstall and data reload of WDQS servers

Subject	Repo	Branch	Lines +/-
Revert "Depooled wdqs1001 during reinstall"	operations/puppet	production	+2 -1
WDQS - Smaller /var/lib/wdqs partition	operations/puppet	production	+1 -1
Modify partitions to reflect new disk added in WDQS nodes	operations/puppet	production	+1 -1
Depooled wdqs1001 during reinstall	operations/puppet	production	+1 -2

Status	Assigned	Task
Resolved	Smalyshev	T123565 [EPIC] Support geo-coordinate search for WDQS
Resolved	Gehel	T133566 Reinstall and data reload of WDQS servers
Resolved	Smalyshev	T133986 Failure of wdqs-updater after data import
Resolved	Cmjohnson	T120712 install two Intel 320 Series SSDSA2CW300G3 2.5" 300GB each in wdqs1001/wdqs1002
Resolved	Smalyshev	T120714 implement wdqs1001/1002 disk upgrades (extend lvm)

Gehel created this task. Apr 25 2016, 5:20 PM

Planned sequence:

(day before) Send email to the wikidata list
Take wdqs1001 out of varnish config
Shut down and reimage wdqs1001. Verify disk partitioning is correct.
Deploy new code from wdq-deploy repo. Do NOT restart wdqs1002 yet!
Reload data to wdqs1001 from the ttl-gz version of the https://dumps.wikimedia.org/wikidatawiki/entities/20160425/ dump (should be ready by then)
Start updater on wdqs1001 and wait for it to catch up
Re-add wdqs1001 to varnish, verify it's ready to serve requests
Disable updater on wdqs1002
Put wdqs1002 into maintenance mode (no need to take it out of varnish as we are only reloading data, not reimaging)
Reload wdqs1002 data from the same dump as above.
Re-enable updater on wdqs1002 and wait until it catches up
Remove maintenance mode from wdqs1002
Verify everything works fine and queries run on both servers
Send the victory email to wikidata
PROFIT!
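
As a sanity check on the ordering above, here is a minimal sketch that models the sequence and confirms at least one server keeps serving queries after every step. It runs no real commands. The action labels (`depool`, `maintenance_on`, and so on) are made up for illustration, the hostnames are the wdqs1001/wdqs1002 pair from the task list, and the code deploy and email steps are left out because they do not change which server is in service.

```python
# Hypothetical model of the planned sequence; nothing here touches a real host.
SERVERS = ("wdqs1001", "wdqs1002")

# (server, action) pairs in plan order. Action names are illustrative labels only.
PLAN = [
    ("wdqs1001", "depool"),           # take out of varnish config
    ("wdqs1001", "reimage"),
    ("wdqs1001", "reload_data"),
    ("wdqs1001", "start_updater"),
    ("wdqs1001", "pool"),             # re-add to varnish
    ("wdqs1002", "stop_updater"),
    ("wdqs1002", "maintenance_on"),
    ("wdqs1002", "reload_data"),
    ("wdqs1002", "start_updater"),
    ("wdqs1002", "maintenance_off"),
]

# Actions assumed to take a server out of service, and to bring it back.
OUT_OF_SERVICE = {"depool", "maintenance_on"}
BACK_IN_SERVICE = {"pool", "maintenance_off"}


def serving_after_each_step(plan, servers=SERVERS):
    """Return the set of serving hosts after each step of the plan."""
    serving = set(servers)
    history = []
    for server, action in plan:
        if action in OUT_OF_SERVICE:
            serving.discard(server)
        elif action in BACK_IN_SERVICE:
            serving.add(server)
        history.append(frozenset(serving))
    return history


def plan_keeps_service_up(plan, servers=SERVERS):
    """True if at least one host is serving after every step."""
    return all(serving_after_each_step(plan, servers))


if __name__ == "__main__":
    print(plan_keeps_service_up(PLAN))
```

Reordering the list so that wdqs1002 goes into maintenance before wdqs1001 is back in varnish makes `plan_keeps_service_up` return `False`, which is the situation the plan is meant to avoid.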

Smalyshev moved this task from Needs triage to Ops on the Discovery-ARCHIVED board. Apr 26 2016, 5:27 AM

Change 285345 had a related patch set uploaded (by Gehel):
Depooled wdqs1001 during reinstall

https://gerrit.wikimedia.org/r/285345

gerritbot added a project: Patch-For-Review. Apr 26 2016, 8:48 AM

Change 285345 merged by Gehel:
Depooled wdqs1001 during reinstall

https://gerrit.wikimedia.org/r/285345

Change 285353 had a related patch set uploaded (by Gehel):
Modify partitions to reflect new disk added in WDQS nodes

https://gerrit.wikimedia.org/r/285353

Change 285353 merged by Gehel:
Modify partitions to reflect new disk added in WDQS nodes

https://gerrit.wikimedia.org/r/285353

Mentioned in SAL [2016-04-26T09:50:10Z] <gehel> starting reinstall of wdqs1001 (T133566)

While rebuilding the RAID to add the new disks, I realized wdqs1001 has 2x 300GB + 2x 150GB disks. I'm reinstalling anyway to ensure we don't run on a single node, but this does not look like what was planned in T119579 / T120712. I'll check with @RobH and/or @Cmjohnson when they arrive.