Put restbase201[3-8] into conftool and LVS
Closed, Resolved (Public)

Description

restbase201[3-8] are up and running RESTBase. We should add them to the appropriate places (conftool, LVS).

Event Timeline

mobrovac created this task.

Indeed, FWIW I tend to treat restbase and cassandra separately, so this will be done as soon as the cassandra reshape (T210843) is done.

They are independent, though. Cassandra doesn't go into LVS at all, and RESTBase is fully functional on these nodes. While they are missing from LVS/conftool we cannot deploy fresh code to them, which is a problem.

I didn't take into account missing deploys/updates -- I'll add the hosts to lvs/etc even if depooled so deploys work as expected

Change 478672 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] conftool: add restbase201[3-8]

https://gerrit.wikimedia.org/r/478672

Change 478673 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: add restbase201[3-8] to restbase

https://gerrit.wikimedia.org/r/478673

Change 478672 merged by Filippo Giunchedi:
[operations/puppet@production] conftool: add restbase201[3-8]

https://gerrit.wikimedia.org/r/478672

Change 478869 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] restbase: remove production_ng remnants

https://gerrit.wikimedia.org/r/478869

Change 478870 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: replace restbase seeds in codfw

https://gerrit.wikimedia.org/r/478870

Change 478673 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: add restbase201[3-8] to restbase

https://gerrit.wikimedia.org/r/478673

Change 478869 merged by Filippo Giunchedi:
[operations/puppet@production] restbase: remove production_ng remnants

https://gerrit.wikimedia.org/r/478869

Change 478870 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: replace restbase seeds in codfw

https://gerrit.wikimedia.org/r/478870

Mentioned in SAL (#wikimedia-operations) [2018-12-11T14:08:38Z] <mobrovac@deploy1001> Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date - T211416

Mentioned in SAL (#wikimedia-operations) [2018-12-11T14:10:31Z] <mobrovac@deploy1001> Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date - T211416 (duration: 01m 53s)

Turns out depool-restbase isn't successful:

root@restbase2014:~# depool-restbase 
root@restbase2014:~# echo $?
2
root@restbase2014:~# cat `which depool-restbase`
#!/bin/sh
/usr/local/bin/pooler-loop --depool --lvs-ips 10.192.1.3,10.192.17.6 --pool-name restbase_7231 service=restbase

pooler-loop has no verbose mode, but adding a debugging puts shows that the command it calls is /usr/local/bin/depool service=restbase, which in turn fails silently:

root@restbase2014:~# /usr/local/bin/depool service=restbase
Depooling restbase on restbase2014.codfw.wmnet
root@restbase2014:~# echo $?
3

After adding set -x, this time:

root@restbase2014:~# /usr/local/bin/depool service=restbase
++ basename /usr/local/bin/depool
+ action=depool
+ _service=service=restbase
+ service=restbase
+ [[ restbase =~ = ]]
++ hostname -f
+ host=restbase2014.codfw.wmnet
+ old_action=
+ msg=
+ case $action in
+ old_action=set/pooled=no
+ msg='Depooling restbase on restbase2014.codfw.wmnet'
+ echo 'Depooling restbase on restbase2014.codfw.wmnet'
Depooling restbase on restbase2014.codfw.wmnet
+ do_action restbase
+ '[' restbase == 'all services' ']'
+ confctl --quiet depool --service restbase
+ retval=3
+ '[' 3 -eq 2 ']'
+ exit 3
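
The trace suggests the wrapper strips an optional service= prefix, prints a status line, calls confctl and exits with confctl's status without saying why. A minimal sketch of that flow, reconstructed only from the trace above and not taken from the real /usr/local/bin/depool. The function name depool_service and the CONFCTL override variable are hypothetical, and the extra error message is a suggestion so that the failure is no longer silent:

```sh
#!/bin/sh
# Sketch only: inferred from the set -x trace, not the real depool script.
# CONFCTL is a hypothetical override so the flow can be exercised
# on a machine without conftool installed.
CONFCTL="${CONFCTL:-confctl}"

depool_service() {
    # Accept both "restbase" and "service=restbase".
    svc="${1#service=}"
    host="$(hostname -f 2>/dev/null || hostname 2>/dev/null || echo unknown-host)"
    echo "Depooling ${svc} on ${host}"

    # Same confctl invocation as seen in the trace; keep its exit code.
    retval=0
    "$CONFCTL" --quiet depool --service "$svc" || retval=$?

    if [ "$retval" -ne 0 ]; then
        # Report the failure instead of only returning the code.
        echo "depool of ${svc} failed: ${CONFCTL} exited with ${retval}" >&2
    fi
    return "$retval"
}
```

Calling depool_service service=restbase on a host with a broken confctl would then print the exit code on stderr rather than leaving it to be found via echo $?.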

And finally confctl throws an exception:

root@restbase2014:~# confctl --quiet depool --service restbase
CRITICAL:conftool:Could not load driver etcd: No module named 'dns'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/conftool/backend.py", line 15, in __init__
    exec(compile(open(driver_file).read(), driver_file, 'exec'), ctx)
  File "/usr/lib/python3/dist-packages/conftool/drivers/etcd.py", line 4, in <module>
    import etcd
  File "/usr/lib/python3/dist-packages/etcd/__init__.py", line 2, in <module>
    from .client import Client
  File "/usr/lib/python3/dist-packages/etcd/client.py", line 21, in <module>
    import dns.resolver
ImportError: No module named 'dns'
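
Since the wrappers hide this ImportError behind a bare exit code, a quick pre-flight check on a new host could surface it before a deploy. A small sketch, assuming python3 is the interpreter confctl runs under (the traceback paths suggest so); check_py_module is a hypothetical helper name:

```sh
#!/bin/sh
# check_py_module MODULE [PYTHON]
# Succeeds only if MODULE can be imported by the given interpreter
# (python3 by default). The helper name is illustrative.
check_py_module() {
    "${2:-python3}" -c "import $1" >/dev/null 2>&1
}

# The two modules named in the traceback above.
for mod in dns.resolver etcd; do
    if ! check_py_module "$mod"; then
        echo "missing python module: $mod" >&2
    fi
done
```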

Known: T209136

"fixed" for now by manually installing python-dnspython, following up on T209136 for a proper fix

Mentioned in SAL (#wikimedia-operations) [2018-12-12T10:40:17Z] <mobrovac@deploy1001> Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2 - T211416

Mentioned in SAL (#wikimedia-operations) [2018-12-12T10:40:32Z] <mobrovac@deploy1001> Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2 - T211416 (duration: 00m 15s)

Mentioned in SAL (#wikimedia-operations) [2018-12-12T10:41:22Z] <mobrovac@deploy1001> Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2b - T211416

Mentioned in SAL (#wikimedia-operations) [2018-12-12T10:51:33Z] <mobrovac@deploy1001> Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2b - T211416 (duration: 10m 11s)

Change 479182 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/restbase/deploy@master] Revert "Revert "Scap: Add restbase201[3-8] to targets""

https://gerrit.wikimedia.org/r/479182

Change 479182 merged by Mobrovac:
[mediawiki/services/restbase/deploy@master] Revert "Revert "Scap: Add restbase201[3-8] to targets""

https://gerrit.wikimedia.org/r/479182

mobrovac assigned this task to fgiunchedi.

Thank you, @fgiunchedi, for setting this up and debugging the problems encountered with etcd.

Mentioned in SAL (#wikimedia-operations) [2018-12-12T14:07:52Z] <mobrovac@deploy1001> Started restart [restbase/deploy@5946231]: Restart RB to pick up the new seeds in codfw - T211416