
Update Spicerack cookbooks to follow the new class API conventions
Closed, Resolved, Public

Description

Very recently a new Class API was added to Spicerack: https://doc.wikimedia.org/spicerack/master/introduction.html#class-interface

This means that all our cookbooks need to be updated to follow the new style. I think this could be a good opportunity for @razzi to write some code and get familiar with spicerack/cookbooks!
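For orientation, here is a rough sketch of the shape a class-style cookbook takes: one class that defines the arguments and builds a runner, and a runner class that does the work. It is a minimal sketch and not code from any of the cookbooks in this task. The two base classes below are local stand-ins so that the snippet runs without Spicerack installed (in a real cookbook they would come from spicerack.cookbook), and the RollRestart / RollRestartRunner names and the "cluster" argument are hypothetical.

```python
import argparse


# Stand-ins so the sketch runs without Spicerack installed. In a real cookbook
# CookbookBase and CookbookRunnerBase would be imported from spicerack.cookbook.
class CookbookBase:
    def __init__(self, spicerack):
        self.spicerack = spicerack


class CookbookRunnerBase:
    pass


class RollRestart(CookbookBase):
    """Roll restart the daemons of a cluster (hypothetical example)."""

    def argument_parser(self):
        # Argument definitions live on the cookbook class.
        parser = argparse.ArgumentParser(description=self.__doc__)
        parser.add_argument("cluster", help="Name of the cluster to restart.")
        return parser

    def get_runner(self, args):
        # The cookbook class only builds the runner; the work happens there.
        return RollRestartRunner(args, self.spicerack)


class RollRestartRunner(CookbookRunnerBase):
    def __init__(self, args, spicerack):
        self.cluster = args.cluster
        self.spicerack = spicerack

    @property
    def runtime_description(self):
        # Short text describing this specific run.
        return f"for cluster {self.cluster}"

    def run(self):
        # A real cookbook would query hosts and restart services here.
        return 0


# Drive the sketch by hand; normally the cookbook entry point does this.
cookbook = RollRestart(spicerack=None)
args = cookbook.argument_parser().parse_args(["aqs"])
runner = cookbook.get_runner(args)
```

Splitting argument parsing from the run logic means the arguments can be validated in the runner's constructor before anything touches a host.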

Cookbooks to update:

  • sre.aqs.roll-restart
  • sre.cassandra.roll-restart
  • sre.druid.roll-restart-workers
  • sre.hadoop.change-distro-from-cdh
  • sre.hadoop.init-hadoop-workers
  • sre.hadoop.reboot-workers
  • sre.hadoop.roll-restart-masters
  • sre.hadoop.roll-restart-workers
  • sre.hadoop.stop-cluster
  • sre.kafka.roll-restart-brokers
  • sre.kafka.roll-restart-mirror-maker
  • sre.presto.roll-restart-workers
  • sre.zookeeper.roll-restart-zookeeper

This work is not urgent but it could be a good alternative activity in between more ops-focused tasks.

Some examples:

Event Timeline

elukey updated the task description.

Change 648172 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/cookbooks@master] sre.hadoop.stop-cluster.py: move to class API

https://gerrit.wikimedia.org/r/648172

Change 648172 merged by Elukey:
[operations/cookbooks@master] sre.hadoop.stop-cluster.py: move to class API

https://gerrit.wikimedia.org/r/648172

Change 649653 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/cookbooks@master] sre.hadoop.change-distro-from-cdh: move to class API

https://gerrit.wikimedia.org/r/649653

Change 649653 merged by Elukey:
[operations/cookbooks@master] sre.hadoop.change-distro-from-cdh: move to class API

https://gerrit.wikimedia.org/r/649653

Change 656212 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/cookbooks@master] sre.hadoop.reboot-workers: move to the new class API

https://gerrit.wikimedia.org/r/656212

Change 656212 merged by Elukey:
[operations/cookbooks@master] sre.hadoop.reboot-workers: move to the new class API

https://gerrit.wikimedia.org/r/656212

Change 656952 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/cookbooks@master] sre.hadoop.init-hadoop-workers: move to Class API

https://gerrit.wikimedia.org/r/656952

Change 656952 merged by Elukey:
[operations/cookbooks@master] sre.hadoop.init-hadoop-workers: move to Class API

https://gerrit.wikimedia.org/r/656952

Change 657049 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/cookbooks@master] sre.druid.roll-restart-workers: move to class API

https://gerrit.wikimedia.org/r/657049

Change 657049 merged by jenkins-bot:
[operations/cookbooks@master] sre.druid.roll-restart-workers: move to class API

https://gerrit.wikimedia.org/r/657049

Change 663033 had a related patch set uploaded (by Razzi; owner: Razzi):
[operations/cookbooks@master] sre.druid.roll-restart-workers: properly pass commands list

https://gerrit.wikimedia.org/r/663033

Change 663033 merged by Razzi:
[operations/cookbooks@master] sre.druid.roll-restart-workers: properly pass commands list

https://gerrit.wikimedia.org/r/663033

Change 663863 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/cookbooks@master] sre.presto.roll-restart-workers: move to class api

https://gerrit.wikimedia.org/r/663863

Change 663863 merged by jenkins-bot:
[operations/cookbooks@master] sre.presto.roll-restart-workers: move to class api

https://gerrit.wikimedia.org/r/663863

Change 704294 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/cookbooks@master] Update AQS Roll Restart cookbook to use new style

https://gerrit.wikimedia.org/r/704294

@elukey Can I ask, what would you consider to be the best way to test this change locally, before pushing to gerrit?

Change 704294 merged by jenkins-bot:

[operations/cookbooks@master] Update AQS Roll Restart cookbook to use new style

https://gerrit.wikimedia.org/r/704294

My dry-run fails. The error seems to be here:

File "/srv/deployment/spicerack/cookbooks/sre/aqs/roll-restart.py", line 73, in run
  name=r'(?!' + self.aqs_canary.hosts[0] + ').*')

I also noticed that the help text and the command actually required are different.

The help text says:

Usage example:
        cookbook.sre.aqs.roll_restart --cluster aqs

But when it comes to running it, the cluster argument is positional, so the required command is:

cookbook -d sre.aqs.roll-restart aqs
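The mismatch can be reproduced with plain argparse. This is a minimal sketch that assumes the cookbook declares "cluster" as a positional argument, as the behaviour above suggests; the parser is a stand-in and not the cookbook's real one.

```python
import argparse

# Hypothetical parser with "cluster" declared as a positional argument.
parser = argparse.ArgumentParser(prog="sre.aqs.roll-restart")
parser.add_argument("cluster", help="Name of the AQS cluster to restart.")


def accepts(argv):
    """Return True if the parser accepts argv, False if it exits with an error."""
    try:
        parser.parse_args(argv)
        return True
    except SystemExit:
        # argparse prints a usage error to stderr and exits on bad arguments.
        return False


positional_ok = accepts(["aqs"])           # the form that works in practice
option_ok = accepts(["--cluster", "aqs"])  # the form shown in the help text
```

With a positional declaration only the first form parses, so either the usage text or the argument definition has to change for the two to agree.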

Here is the full traceback.

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/spicerack/_menu.py", line 234, in run
    raw_ret = runner.run()
  File "/srv/deployment/spicerack/cookbooks/sre/aqs/roll-restart.py", line 73, in run
    name=r'(?!' + self.aqs_canary.hosts[0] + ').*')
  File "/usr/lib/python3/dist-packages/spicerack/remote.py", line 318, in query_confctl
    hosts_conftool = [obj.name for obj in conftool.get(**tags)]
  File "/usr/lib/python3/dist-packages/spicerack/remote.py", line 318, in <listcomp>
    hosts_conftool = [obj.name for obj in conftool.get(**tags)]
  File "/usr/lib/python3/dist-packages/spicerack/confctl.py", line 118, in get
    for obj in self._select(tags):
  File "/usr/lib/python3/dist-packages/spicerack/confctl.py", line 82, in _select
    selectors[tag] = re.compile("^{}$".format(expr))
  File "/usr/lib/python3.7/re.py", line 234, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python3.7/re.py", line 286, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.7/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.7/sre_parse.py", line 930, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib/python3.7/sre_parse.py", line 426, in _parse_sub
    not nested and not items))
  File "/usr/lib/python3.7/sre_parse.py", line 580, in _parse
    raise source.error(msg, len(this) + 1 + len(that))
re.error: bad character range 4-1 at position 8
DRY-RUN: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs[1004-1015].eqiad.wmnet cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
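The final error can be reproduced outside the cookbook. The summary line above shows a folded host expression, aqs[1004-1015].eqiad.wmnet, where a cluster name was expected, and wrapping that string in "^...$" the way confctl.py does gives a message and position that match the traceback. This is only a sketch of one input consistent with the error, not a confirmed diagnosis. The canary hostname in the second half is hypothetical.

```python
import re

# A folded host range is not a valid regex: "[1004-1015]" is parsed as a
# character class, and "4-1" inside it is a reversed range.
folded = "aqs[1004-1015].eqiad.wmnet"
try:
    re.compile("^{}$".format(folded))
    error_message = None
except re.error as exc:
    error_message = str(exc)

# When a literal hostname is embedded in a pattern, re.escape() keeps the dots
# (and any other special characters) from being treated as regex syntax.
canary = "aqs1004.eqiad.wmnet"  # hypothetical canary host
expr = r"(?!" + re.escape(canary) + r").*"
not_canary = re.compile("^{}$".format(expr))

matches_other = not_canary.match("aqs1005.eqiad.wmnet") is not None
matches_canary = not_canary.match("aqs1004.eqiad.wmnet") is not None
```

The negative lookahead selects every name except the canary, which appears to be the intent of the name= expression on line 73.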

@BTullis good find! Can you send a follow-up patch to update the argparse settings?

Change 704347 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/cookbooks@master] Fix the sre.aqs.roll-restart cookbook

https://gerrit.wikimedia.org/r/704347

Change 704347 merged by jenkins-bot:

[operations/cookbooks@master] Fix the sre.aqs.roll-restart cookbook

https://gerrit.wikimedia.org/r/704347

Change 704500 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/cookbooks@master] Update sre.hadoop.roll-restart-masters cookbook

https://gerrit.wikimedia.org/r/704500

Change 704501 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/cookbooks@master] Update sre.hadoop.roll-restart-workers cookbook

https://gerrit.wikimedia.org/r/704501

Dry run of sre.aqs.roll-restart was successful this time.

DRY-RUN: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001

Marking that cookbook as complete.

BTullis triaged this task as Medium priority. Jul 14 2021, 9:03 AM
BTullis updated the task description.

Change 704500 merged by jenkins-bot:

[operations/cookbooks@master] Update sre.hadoop.roll-restart-masters cookbook

https://gerrit.wikimedia.org/r/704500

Change 704501 merged by jenkins-bot:

[operations/cookbooks@master] Update sre.hadoop.roll-restart-workers cookbook

https://gerrit.wikimedia.org/r/704501

Dry run succeeded for hadoop masters and workers cookbooks.

Change 704932 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/cookbooks@master] Update sre.kafka.roll-restart cookbooks to new API

https://gerrit.wikimedia.org/r/704932

Change 704937 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/cookbooks@master] Update sre.zookeeper.roll-restart to use new spicerack API

https://gerrit.wikimedia.org/r/704937

Change 704937 merged by jenkins-bot:

[operations/cookbooks@master] Update sre.zookeeper.roll-restart to use new spicerack API

https://gerrit.wikimedia.org/r/704937

Dry-run for sre.zookeeper.roll-restart-zookeeper succeeded.

Change 705869 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/cookbooks@master] Update sre.cassandra.roll-restart cookbook to use new spicerack API

https://gerrit.wikimedia.org/r/705869

Change 705869 merged by jenkins-bot:

[operations/cookbooks@master] Update sre.cassandra.roll-restart cookbook to use new spicerack API

https://gerrit.wikimedia.org/r/705869

Change 704932 merged by Btullis:

[operations/cookbooks@master] Update sre.kafka.roll-restart cookbooks to new API

https://gerrit.wikimedia.org/r/704932

Dry run succeeded on the kafka cookbooks, which were the only outstanding changes to be tested.