Scap error when deploying kartotherian
Closed, ResolvedPublic

Description

During our last deployment of kartotherian to the maps cluster, scap threw this error:

jgiannelos@deploy1002:/srv/deployment/kartotherian/deploy$ scap deploy-log
-- Opening log file: '/srv/deployment/kartotherian/deploy/scap/log/scap-sync-2021-09-28-0009.log'
18:18:02 [deploy1002] Started deploy [kartotherian/deploy@0a38bc5]
18:18:02 [deploy1002] Deploying Rev: scap/sync/2021-09-28/0009^{} = 04d2df495685f09f8b931840ff438b37c4ab3257
18:18:02 [deploy1002] Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
18:18:02 [deploy1002] 
== DEFAULT ==
:* maps1010.eqiad.wmnet
18:18:03 [maps1010.eqiad.wmnet] Unhandled error:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/scap/cli.py", line 352, in run
    exit_status = app.main(app.extra_arguments)
  File "/usr/lib/python3/dist-packages/scap/deploy.py", line 157, in main
    getattr(self, stage)()
  File "/usr/lib/python3/dist-packages/scap/deploy.py", line 252, in config_diff
    overrides=overrides,
  File "/usr/lib/python3/dist-packages/scap/template.py", line 87, in __init__
    env_args = self._make_env_args(loader, erb_syntax, output_format)
  File "/usr/lib/python3/dist-packages/scap/template.py", line 95, in _make_env_args
    loader = {n: f.decode("utf-8") for n, f in loader.items()}
  File "/usr/lib/python3/dist-packages/scap/template.py", line 95, in <dictcomp>
    loader = {n: f.decode("utf-8") for n, f in loader.items()}
AttributeError: 'str' object has no attribute 'decode'
18:18:03 [maps1010.eqiad.wmnet] deploy-local failed: <AttributeError> {}
18:18:03 [deploy1002] [u'/usr/bin/scap', u'deploy-local', u'-v', u'--repo', u'kartotherian/deploy', u'-g', u'default', u'config_diff', u'--refresh-config'] on maps1010.eqiad.wmnet returned [70]: Unhandled error:
deploy-local failed: <AttributeError> {}

18:18:03 [deploy1002] 1 targets had deploy errors
18:18:03 [deploy1002] 1 targets failed
18:18:03 [deploy1002] 1 of 1 default targets failed, exceeding limit

At first we thought it was caused by one of the variables we had changed, but the deployment also failed for a version that had previously worked.
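
The traceback points at `f.decode("utf-8")` being called on values that are already `str`, which fails under Python 3 because only `bytes` has a `.decode()` method. The sketch below reproduces that failure and shows one possible way to accept both types. It is an illustration only, not the actual scap patch: the helper name `normalize_loader` and the template names are hypothetical.

```python
def normalize_loader(loader):
    """Return a copy of `loader` whose values are all text (str).

    Hypothetical helper: bytes values are decoded as UTF-8,
    str values are passed through unchanged.
    """
    return {
        name: body.decode("utf-8") if isinstance(body, bytes) else body
        for name, body in loader.items()
    }


# Reproduce the failure from the traceback: in Python 3, str has no .decode().
templates = {"example.yaml.j2": "port: {{ port }}"}  # hypothetical template
try:
    {n: f.decode("utf-8") for n, f in templates.items()}
except AttributeError as exc:
    error_message = str(exc)

# The tolerant version handles a mix of bytes and str values.
normalized = normalize_loader({"a": b"caf\xc3\xa9", "b": "plain"})
```

Checking the type before decoding keeps the code working whether the caller passes raw file contents as bytes or text that has already been decoded.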

Event Timeline

Hi @Jgiannelos. Scap has been rolled back to its prior production version (3.17.1-1) so you should be able to deploy now. I'll work on fixing this bug in the meantime.

Change 724527 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[mediawiki/tools/scap@master] fix: template.py Python 3 fallout

https://gerrit.wikimedia.org/r/724527

I tried to run another scap deployment but I am getting the same error. From what I understand, the problem must be on the maps node side (the scap target), where the installed package is scap 4.0.0.
For example on maps1010:

jgiannelos@maps1010:~$ dpkg -l | grep scap
ii  scap                                 4.0.0-1                      all          Deployment toolchain for Wikimedia projects

Should we also roll back scap on the target nodes to the previous version?
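
For checking several targets, the version column of the `dpkg -l` output above can be extracted with a few lines of Python. This is a minimal sketch that assumes the standard whitespace-separated `dpkg -l` layout; the function name `installed_version` is hypothetical.

```python
def installed_version(dpkg_output, package):
    """Return the version of `package` from `dpkg -l` output, or None.

    Only lines whose status column is "ii" (installed) are considered.
    """
    for line in dpkg_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ii" and fields[1] == package:
            return fields[2]
    return None


# Line copied from the maps1010 output quoted above.
sample = (
    "ii  scap                                 4.0.0-1"
    "                      all          Deployment toolchain for Wikimedia projects"
)
version = installed_version(sample, "scap")
```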

Just a heads up: this is currently blocking us from pushing a couple of changes to kartotherian to test our prod environments in k8s, which is our main task right now.

Jgiannelos raised the priority of this task from High to Needs Triage.

If we don't have a new scap package by tomorrow, I will downgrade scap on maps* hosts to unblock you. I will ping you on IRC.

jijiki triaged this task as High priority. Sep 29 2021, 11:42 AM
jijiki added a project: serviceops.

Mentioned in SAL (#wikimedia-operations) [2021-09-30T11:44:36Z] <effie> downgrading scap to 3.17.1-1 on maps* hosts - T291990

I just ran a deployment on one of the maps nodes and it looks like it works with 3.17.1. Thanks @jijiki.

Change 724527 merged by jenkins-bot:

[mediawiki/tools/scap@master] fix: template.py Python 3 fallout

https://gerrit.wikimedia.org/r/724527

dancy claimed this task.

@Jgiannelos Scap 4.0.2 has been deployed everywhere. It has fixes for the problem you reported in this ticket. Let me know if you have any further issues.