
Webservice stuck in failed restart loop because of corrupt service.manifest
Closed, Declined · Public

Description

The webarchivebot webservice got stuck in a bad restart loop because of a corrupt/partial service.manifest.

service.manifest
# This file is used by toollabs infrastructure.
# Please do not edit manually at this time.
backend: gridengine
version: 2
web: generic

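This is missing the expected web::extra_args: ... line. With that key absent, self.extra_args is presumably None rather than a string, which os.execv() rejects. error.log was full of traces like: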

Traceback (most recent call last):
  File "/usr/bin/webservice-runner", line 27, in <module>
    webservice.run(port)
  File "/usr/lib/python2.7/dist-packages/toollabs/webservice/services/genericwebservice.py", line 18, in run
    os.execv('/bin/sh', ['/bin/sh', '-c', self.extra_args])
TypeError: execv() arg 2 must contain only strings
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 145, in apport_excepthook
    os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640), 'wb') as f:
OSError: [Errno 2] No such file or directory: '/var/crash/_usr_bin_webservice-runner.52813.crash'

Original exception was:
Traceback (most recent call last):
  File "/usr/bin/webservice-runner", line 27, in <module>
    webservice.run(port)
  File "/usr/lib/python2.7/dist-packages/toollabs/webservice/services/genericwebservice.py", line 18, in run
    os.execv('/bin/sh', ['/bin/sh', '-c', self.extra_args])
TypeError: execv() arg 2 must contain only strings
Traceback (most recent call last):
  File "/usr/bin/webservice-runner", line 27, in <module>
    webservice.run(port)
  File "/usr/lib/python2.7/dist-packages/toollabs/webservice/services/genericwebservice.py", line 18, in run
    os.execv('/bin/sh', ['/bin/sh', '-c', self.extra_args])
TypeError: execv() arg 2 must contain only strings
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 145, in apport_excepthook
    os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640), 'wb') as f:
OSError: [Errno 2] No such file or directory: '/var/crash/_usr_bin_webservice-runner.52813.crash'
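
One way to avoid the restart loop would be to validate the manifest before calling os.execv() and fail with a readable message instead. The following is only a minimal sketch in Python 3, not the toollabs code: ManifestError and build_shell_argv are hypothetical names, and the manifest is assumed to have already been parsed into a plain dict.

```python
class ManifestError(Exception):
    """Raised when service.manifest lacks a field the webservice type needs."""


def build_shell_argv(manifest):
    """Return the argv list for os.execv('/bin/sh', argv), or raise ManifestError.

    `manifest` is assumed to be the parsed service.manifest as a plain dict.
    """
    extra_args = manifest.get('web::extra_args')
    # os.execv() only accepts strings, so reject None, lists, and empty values
    # here rather than letting the exec call raise a bare TypeError.
    if not isinstance(extra_args, str) or not extra_args.strip():
        raise ManifestError(
            "service.manifest has web: %r but no usable web::extra_args"
            % manifest.get('web'))
    return ['/bin/sh', '-c', extra_args]


# The manifest from this report, written out as a dict (parsing step omitted).
corrupt = {'backend': 'gridengine', 'version': 2, 'web': 'generic'}
try:
    build_shell_argv(corrupt)
except ManifestError as exc:
    message = str(exc)
```

With a check like this the runner could exit once with a clear error, rather than being restarted repeatedly into the same TypeError.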

Event Timeline


I'm not sure what happened before @Amitie_10g came on IRC to ask for help. This may be a case of webservice stop not cleaning up the service.manifest properly. If so, it might be at least partially fixed by the pending patch for T163355: webservice stop says service not running but service.manifest not cleared.

I have not seen a reproduction of this bug for several months. Closing for now, but please do reopen if you are working on a tool that runs into this problem again.