Page MenuHomePhabricator

spicerack.decorators.retry: dynamic_params_callbacks=(set_tries,) dosn't seem to work as expected
Closed, ResolvedPublic

Description

The spicerack.service.Service.check_dns_state is decorated wit the retry method as follows:

@retry(backoff_mode="constant", exceptions=(DiscoveryStateError,), dynamic_params_callbacks=(set_tries,))
def check_dns_state(self, ip_per_dc_map: dict[str, Union[IPv4Address, IPv6Address]], record_name: str = "", tries: int = 15) -> None:
    """
    Arguments:
             tries: the number of retries to attempt before failing.
   """

This suggests to me that the function should, in the default instance, retry three times . however when using this function in the SREDiscoveryNoLVSBatchRunnerBase cookbooks where we call the function via self.service.check_dns_state(self.ip_per_dc_map) we see that it only retries three times

 Setting pooled=False for tags: {'dnsdisc': '(pki)', 'name': 'codfw'}
Sleeping for 15 seconds
[1/3, retrying in 3.00s] Attempt to run 'spicerack.service.Service.check_dns_state' raised: Error checking auth dns for pki.discovery.wmnet in codfw: resolved to 10.192.16.172, a different IP.
[2/3, retrying in 3.00s] Attempt to run 'spicerack.service.Service.check_dns_state' raised: Error checking auth dns for pki.discovery.wmnet in codfw: resolved to 10.192.16.172, a different IP.
Exception raised while executing cookbook sre.pki.restart-reboot:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/spicerack/_menu.py", line 212, in run
    raw_ret = runner.run()
  File "/srv/deployment/spicerack/cookbooks/sre/__init__.py", line 666, in run
    report = super().run()
  File "/srv/deployment/spicerack/cookbooks/sre/__init__.py", line 451, in run
    return self.batch_action()
  File "/srv/deployment/spicerack/cookbooks/sre/__init__.py", line 424, in batch_action
    self.group_action(host_group_idx, number_of_batches)
  File "/srv/deployment/spicerack/cookbooks/sre/__init__.py", line 657, in group_action
    self.service.check_dns_state(self.ip_per_dc_map)
  File "/usr/lib/python3/dist-packages/wmflib/decorators.py", line 210, in wrapper
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/spicerack/service.py", line 512, in check_dns_state
    self.discovery.get(record_name).check_service_ips(self.ip, ip_per_dc_map)
  File "/usr/lib/python3/dist-packages/spicerack/service.py", line 115, in check_service_ips
    raise DiscoveryStateError(
spicerack.service.DiscoveryStateError: Error checking auth dns for pki.discovery.wmnet in codfw: resolved to 10.192.16.172, a different IP was expected.

Event Timeline

jbond triaged this task as Medium priority.Sep 12 2023, 9:54 AM
jbond created this task.
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Volans renamed this task from spicerrack.decorators.retry: dynamic_params_callbacks=(set_tries,) dfosn;t seem to work as epected to spicerack.decorators.retry: dynamic_params_callbacks=(set_tries,) dosn't seem to work as expected.Sep 12 2023, 9:55 AM
Volans subscribed.

Yes the issue is that the set_tries defined in spicerack doesn't check the function signature for default values and the @retry is not setting the tries parameter.
I'm sending a patch to fix it in spicerack so that it will use the default value if set and no explicit argument is passed.

Change 956801 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/cookbooks@master] SREDiscoveryNoLVSBatchRunnerBase: Bump waiting interval to 30s

https://gerrit.wikimedia.org/r/956801

Change 956801 merged by Muehlenhoff:

[operations/cookbooks@master] SREDiscoveryNoLVSBatchRunnerBase: Bump waiting interval to 30s

https://gerrit.wikimedia.org/r/956801

Change 956972 had a related patch set uploaded (by Volans; author: Volans):

[operations/software/spicerack@master] decorators: fix set_tries

https://gerrit.wikimedia.org/r/956972

Change 956972 merged by jenkins-bot:

[operations/software/spicerack@master] decorators: fix set_tries

https://gerrit.wikimedia.org/r/956972

The change has been merged and released with Spicerack v7.3.0 on Oct. 4th. Resolving.