Cookbook sre.gitlab.upgrade fails when unpausing runners
Closed, Resolved (Public)

Description

Runs of sre.gitlab.upgrade consistently fail when unpausing the runners.

Reduce tries from 20 to 1 in DRY-RUN mode
Exception raised while executing cookbook sre.gitlab.upgrade:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/gitlab/exceptions.py", line 279, in wrapped_f
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/gitlab/mixins.py", line 296, in update
    return http_method(path, post_data=new_data, files=files, **kwargs)
  File "/usr/lib/python3/dist-packages/gitlab/__init__.py", line 713, in http_put
    result = self.http_request(
  File "/usr/lib/python3/dist-packages/gitlab/__init__.py", line 565, in http_request
    raise GitlabHttpError(
gitlab.exceptions.GitlabHttpError: 502: GitLab is not responding

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/spicerack/_menu.py", line 212, in run
    raw_ret = runner.run()
  File "/srv/deployment/spicerack/cookbooks/sre/gitlab/upgrade.py", line 122, in run
    unpause_runners(paused_runners, dry_run=self.spicerack.dry_run)
  File "/usr/lib/python3/dist-packages/wmflib/decorators.py", line 210, in wrapper
    return func(*args, **kwargs)
  File "/srv/deployment/spicerack/cookbooks/sre/gitlab/__init__.py", line 64, in unpause_runners
    runner.save()
  File "/usr/lib/python3/dist-packages/gitlab/mixins.py", line 385, in save
    server_data = self.manager.update(obj_id, updated_data, **kwargs)
  File "/usr/lib/python3/dist-packages/gitlab/exceptions.py", line 281, in wrapped_f
    raise error(e.error_message, e.response_code, e.response_body) from e
gitlab.exceptions.GitlabUpdateError: 502: GitLab is not responding
END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab2002.wikimedia.org with reason: Install software version upgrade

For some reason, the additional logic for handling the dry-run flag (implemented as part of https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/909257) reduces the number of retries to 1 instead of 20. The cookbook then fails, because the GitLab API needs some time, and therefore several retries, until it is functional again.
The functions pause_runners() and unpause_runners() really do try to pause and unpause the runners, despite the log message about DRY-RUN mode.
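
To illustrate the kind of behaviour described above, here is a minimal sketch of a retry decorator that lowers the number of tries only when dry_run is explicitly true. It is not the spicerack or wmflib implementation: retry(), FlakyApi and update_runner() are hypothetical names made up for this example, and the idea that the original check confused "dry_run was passed" with "dry_run is true" is only an assumption based on the patch titles.

```python
import functools


def retry(tries=20):
    """Hypothetical retry decorator, not the real spicerack/wmflib code."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Reduce the tries only when dry_run is explicitly true.
            # A check such as `"dry_run" in kwargs` would also match
            # dry_run=False and wrongly drop a real run to a single try.
            effective_tries = 1 if kwargs.get("dry_run", False) else tries
            last_error = None
            for _ in range(effective_tries):
                try:
                    return func(*args, **kwargs)
                except RuntimeError as error:
                    last_error = error
            raise last_error

        return wrapper

    return decorator


class FlakyApi:
    """Stand-in for an API that fails a fixed number of times, then recovers."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def save(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise RuntimeError("502: GitLab is not responding")
        return "ok"


@retry(tries=20)
def update_runner(api, dry_run=True):
    """Hypothetical helper that performs one API update."""
    return api.save()
```

With this sketch, update_runner(api, dry_run=False) keeps retrying until the simulated API recovers, while update_runner(api, dry_run=True) gives up after the first failure. A real implementation would also wait between attempts; that is left out here to keep the example short.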

Event Timeline

Change 914923 had a related patch set uploaded (by EoghanGaffney; author: EoghanGaffney):

[operations/software/spicerack@master] [spicerack/decorators] Don't miss dry_run if it's disabled in kwargs

https://gerrit.wikimedia.org/r/914923

Change 915434 had a related patch set uploaded (by Volans; author: Volans):

[operations/software/spicerack@master] decorators: fix dry_run detection

https://gerrit.wikimedia.org/r/915434

Change 914923 abandoned by EoghanGaffney:

[operations/software/spicerack@master] [spicerack/decorators] Don't miss dry_run if it's disabled in kwargs

Reason:

https://gerrit.wikimedia.org/r/914923

eoghan changed the task status from Open to In Progress. May 5 2023, 3:42 PM
eoghan claimed this task.
eoghan moved this task from Incoming to Work in Progress on the collaboration-services board.

Change 915434 merged by jenkins-bot:

[operations/software/spicerack@master] decorators: fix dry_run detection

https://gerrit.wikimedia.org/r/915434

Jelto added a subscriber: Volans.

This is solved; the cookbook now uses the correct number of retries. Thanks for the fix, @Volans and @eoghan!