restbase201[3-8] are up and running RESTBase. We should add them to the appropriate places (conftool, LVS).
Description
Details
Status | Assigned | Task
---|---|---
Unknown Object (Task) | |
Unknown Object (Task) | |
Resolved | fgiunchedi | T209615 rack/setup/install restbase201[3-8].codfw.wmnet
Resolved | fgiunchedi | T211416 Put restbase201[3-8] into conftool and LVS
Event Timeline
Indeed. FWIW I tend to treat restbase and cassandra separately, so this will be done as soon as the cassandra reshape (T210843) is done.
They are independent, though. Cassandra doesn't go into LVS at all, and RESTBase is fully functional on these nodes. As long as the nodes are missing from LVS/conftool we cannot deploy fresh code to them, which is a problem.
I didn't take into account missing deploys/updates -- I'll add the hosts to lvs/etc even if depooled so deploys work as expected
Change 478672 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] conftool: add restbase20[3-8]
Change 478673 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: add restbase20[3-8] to restbase
Change 478672 merged by Filippo Giunchedi:
[operations/puppet@production] conftool: add restbase20[3-8]
Change 478869 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] restbase: remove production_ng remnants
Change 478870 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: replace restbase seeds in codfw
Change 478673 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: add restbase20[3-8] to restbase
Change 478869 merged by Filippo Giunchedi:
[operations/puppet@production] restbase: remove production_ng remnants
Change 478870 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: replace restbase seeds in codfw
Mentioned in SAL (#wikimedia-operations) [2018-12-11T14:08:38Z] <mobrovac@deploy1001> Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date - T211416
Mentioned in SAL (#wikimedia-operations) [2018-12-11T14:10:31Z] <mobrovac@deploy1001> Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date - T211416 (duration: 01m 53s)
Turns out depool-restbase isn't successful:
root@restbase2014:~# depool-restbase
root@restbase2014:~# echo $?
2
root@restbase2014:~# cat `which depool-restbase`
#!/bin/sh
/usr/local/bin/pooler-loop --depool --lvs-ips 10.192.1.3,10.192.17.6 --pool-name restbase_7231 service=restbase
pooler-loop has no verbose mode, but adding a debugging puts shows that the command being called is /usr/local/bin/depool service=restbase, which in turn fails silently:
root@restbase2014:~# /usr/local/bin/depool service=restbase
Depooling restbase on restbase2014.codfw.wmnet
root@restbase2014:~# echo $?
3
After a set -x this time:
root@restbase2014:~# /usr/local/bin/depool service=restbase
++ basename /usr/local/bin/depool
+ action=depool
+ _service=service=restbase
+ service=restbase
+ [[ restbase =~ = ]]
++ hostname -f
+ host=restbase2014.codfw.wmnet
+ old_action=
+ msg=
+ case $action in
+ old_action=set/pooled=no
+ msg='Depooling restbase on restbase2014.codfw.wmnet'
+ echo 'Depooling restbase on restbase2014.codfw.wmnet'
Depooling restbase on restbase2014.codfw.wmnet
+ do_action restbase
+ '[' restbase == 'all services' ']'
+ confctl --quiet depool --service restbase
+ retval=3
+ '[' 3 -eq 2 ']'
+ exit 3
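Each layer here handed a non-zero exit status upwards without a visible error message, which is why it took three manual steps to find the failing command. As a rough illustration only, here is a minimal Python sketch of a wrapper that reports the exit status and captured output of whatever it runs. `run_and_report` is a hypothetical helper, not part of pooler-loop, depool, or conftool.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments) and return (exit_status, combined_output).

    On a non-zero exit status, the status and the captured output are
    written to stderr so the failure is not silent.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so nothing is lost
        universal_newlines=True,   # decode output as text
    )
    if result.returncode != 0:
        print(
            "command {!r} exited with status {}".format(cmd, result.returncode),
            file=sys.stderr,
        )
        if result.stdout:
            print(result.stdout, file=sys.stderr, end="")
    return result.returncode, result.stdout


# Hypothetical usage, mirroring the command from the session above:
#   status, output = run_and_report(["/usr/local/bin/depool", "service=restbase"])
```

Folding stderr into stdout keeps the sketch short. A real wrapper might prefer to keep the two streams apart.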
And finally confctl throws an exception:
root@restbase2014:~# confctl --quiet depool --service restbase
CRITICAL:conftool:Could not load driver etcd: No module named 'dns'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/conftool/backend.py", line 15, in __init__
    exec(compile(open(driver_file).read(), driver_file, 'exec'), ctx)
  File "/usr/lib/python3/dist-packages/conftool/drivers/etcd.py", line 4, in <module>
    import etcd
  File "/usr/lib/python3/dist-packages/etcd/__init__.py", line 2, in <module>
    from .client import Client
  File "/usr/lib/python3/dist-packages/etcd/client.py", line 21, in <module>
    import dns.resolver
ImportError: No module named 'dns'
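The traceback comes down to a missing top-level Python module (`dns`) that the `etcd` client imports. A quick way to check for this kind of gap before running the tool is to ask the interpreter whether it can locate the modules. The sketch below assumes it is run with the same Python 3 interpreter that confctl uses. `missing_modules` is a hypothetical helper, and the names to check are taken from the traceback above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names this interpreter cannot locate.

    Only top-level names are supported: looking up a dotted name such as
    "dns.resolver" would try to import the parent package and raise if
    that parent is absent.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical usage on an affected host:
#   missing_modules(["etcd", "dns"])
# would be expected to list "dns" until the package providing it is installed.
```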
"fixed" for now by manually installing python-dnspython, following up on T209136 for a proper fix
Mentioned in SAL (#wikimedia-operations) [2018-12-12T10:40:17Z] <mobrovac@deploy1001> Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2 - T211416
Mentioned in SAL (#wikimedia-operations) [2018-12-12T10:40:32Z] <mobrovac@deploy1001> Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2 - T211416 (duration: 00m 15s)
Mentioned in SAL (#wikimedia-operations) [2018-12-12T10:41:22Z] <mobrovac@deploy1001> Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2b - T211416
Mentioned in SAL (#wikimedia-operations) [2018-12-12T10:51:33Z] <mobrovac@deploy1001> Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2b - T211416 (duration: 10m 11s)
Change 479182 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/restbase/deploy@master] Revert "Revert "Scap: Add restbase201[3-8] to targets""
Change 479182 merged by Mobrovac:
[mediawiki/services/restbase/deploy@master] Revert "Revert "Scap: Add restbase201[3-8] to targets""
Thank you, @fgiunchedi, for setting this up and debugging the problems encountered with etcd.
Mentioned in SAL (#wikimedia-operations) [2018-12-12T14:07:52Z] <mobrovac@deploy1001> Started restart [restbase/deploy@5946231]: Restart RB to pick up the new seeds in codfw - T211416