OpenStack API refuses to launch new instances || Nodepool is out of instance / CI stalled
Closed, ResolvedPublic

Description

OpenStack refuses to spawn instances:

<class 'nova.exception.ImageNotAuthorized'> (HTTP 500) (Request-ID: req-1c941387-2c17-4c90-90a3-733bd4d78887)

Server side reports:

ImageNotAuthorized: Not authorized for image 84f2fcfb-7ac5-4c3b-9505-ada37cbcaebf.

First occurence was at 2016-12-01 13:30:11,381

Trace from Nodepool:

2016-12-01 13:47:21,031 ERROR nodepool.NodeLauncher: ClientException launching node id: 436121 in provider: wmflabs-eqiad error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nodepool/nodepool.py", line 337, in _run
    dt = self.launchNode(session)
  File "/usr/lib/python2.7/dist-packages/nodepool/nodepool.py", line 392, in launchNode
    config_drive=self.image.config_drive)
  File "/usr/lib/python2.7/dist-packages/nodepool/provider_manager.py", line 398, in createServer
    return self.submitTask(CreateServerTask(**create_args))
  File "/usr/lib/python2.7/dist-packages/nodepool/task_manager.py", line 119, in submitTask
    return task.wait()
  File "/usr/lib/python2.7/dist-packages/nodepool/task_manager.py", line 57, in run
    self.done(self.main(client))
  File "/usr/lib/python2.7/dist-packages/nodepool/provider_manager.py", line 116, in main
    server = client.servers.create(**self.args)
  File "/usr/lib/python2.7/dist-packages/novaclient/v2/servers.py", line 900, in create
    **boot_kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/v2/servers.py", line 523, in _boot
    return_raw=return_raw, **kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/base.py", line 161, in _create
    _resp, body = self.api.client.post(url, body=body)
  File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 453, in post
    return self._cs_request(url, 'POST', **kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 428, in _cs_request
    resp, body = self._time_request(url, method, **kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 397, in _time_request
    resp, body = self.request(url, method, **kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 391, in request
    raise exceptions.from_response(resp, body, url, method)
ClientException: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.ImageNotAuthorized'> (HTTP 500) (Request-ID: req-1c941387-2c17-4c90-90a3-733bd4d78887)

Server side (labnet1001):

2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions [req-768103e2-82e0-4694-973a-9589cf23ae27 nodepoolmanager contintcloud - - -] Unexpected exception in API method
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions Traceback (most recent call last):
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/api/openstack/extensions.py", line 478, in wrapped
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     return f(*args, **kwargs)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/api/validation/__init__.py", line 73, in wrapper
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     return func(*args, **kwargs)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/api/validation/__init__.py", line 73, in wrapper
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     return func(*args, **kwargs)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py", line 611, in create
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     **create_kwargs)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/hooks.py", line 149, in inner
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     rv = f(*args, **kwargs)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 1587, in create
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     check_server_group_quota=check_server_group_quota)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 1164, in _create_instance
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     image_id, boot_meta = self._get_image(context, image_href)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 868, in _get_image
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     image = self.image_api.get(context, image_href)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/image/api.py", line 93, in get
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     show_deleted=show_deleted)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/image/glance.py", line 317, in show
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     _reraise_translated_image_exception(image_id)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/image/glance.py", line 613, in _reraise_translated_image_exception
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     six.reraise(new_exc, None, exc_trace)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/image/glance.py", line 315, in show
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     image = self._client.call(context, version, 'get', image_id)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/nova/image/glance.py", line 231, in call
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     result = getattr(client.images, method)(*args, **kwargs)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/glanceclient/v1/images.py", line 132, in get
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     % urlparse.quote(str(image_id)))
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line 277, in head
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     return self._request('HEAD', url, **kwargs)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line 272, in _request
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     resp, body_iter = self._handle_response(resp)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line 93, in _handle_response
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions     raise exc.from_response(resp, resp.content)
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions ImageNotAuthorized: Not authorized for image 84f2fcfb-7ac5-4c3b-9505-ada37cbcaebf.
2016-12-01 13:54:21.038 5748 ERROR nova.api.openstack.extensions 
2016-12-01 13:54:21.041 5748 INFO nova.api.openstack.wsgi [req-768103e2-82e0-4694-973a-9589cf23ae27 nodepoolmanager contintcloud - - -] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.ImageNotAuthorized'>
2016-12-01 13:54:21.042 5748 INFO nova.osapi_compute.wsgi.server [req-768103e2-82e0-4694-973a-9589cf23ae27 nodepoolmanager contintcloud - - -] 10.64.20.18 "POST /v2/contintcloud/servers HTTP/1.1" status: 500 len: 438 time: 0.0401890
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptDec 1 2016, 1:50 PM
hashar updated the task description. (Show Details)Dec 1 2016, 1:56 PM

The ImageNotAuthorized reference the images:

84f2fcfb-7ac5-4c3b-9505-ada37cbcaebf
a407dcf3-315b-4b9c-b186-f19048637a83

The contint cloud tenant has:

IDName
84f2fcfb-7ac5-4c3b-9505-ada37cbcaebfci-jessie-wikimedia-1480515240
a407dcf3-315b-4b9c-b186-f19048637a83ci-trusty-wikimedia-1480515242
88bb2538-c2dc-4066-8273-11d0f57d4d7bci-jessie-wikimedia-1480432368
3dd2fd9d-7052-4f2f-8d10-e8d42e3944caci-trusty-wikimedia-1480429423
484fdfbe-2523-4602-b524-4be46db9a396ci-trusty-wikimedia
b3727f73-39c5-4e15-a7a9-d1d027aa7f1eci-jessie-wikimedia
b8d56ff5-c51e-4c5a-9a8c-64e862f60feaci-trusty-wikimedia_old_20161128

Eg it is not allowed to boot an instance from a snapshot image :(

Spawning an instance from Horizon as novaadmin works fine (tested by Andrew).

Tried with the nodepool credentials and it works as well:

nodepool@labnodepool1001:~$ openstack server create --image ci-jessie-wikimedia-1480515240 --flavor m1.medium hashar-spawn

I have forced the refresh of the jessie snapshot and it spawned all fine.

So looks like something is off in Nodepool itself :(

Mentioned in SAL (#wikimedia-releng) [2016-12-01T14:16:24Z] <hashar> Image ci-jessie-wikimedia-1480601060 in wmflabs-eqiad is ready | T152096

The new Jessie snapshot has ID 0d29c97d-390b-439b-9778-6c2171a7020b and fails as well.

$ /usr/sbin/service nodepool status
● nodepool.service - Nodepool daemon
   Loaded: loaded (/lib/systemd/system/nodepool.service; static)
   Active: active (running) since Mon 2016-11-07 13:30:51 UTC; 3 weeks 3 days ago
 Main PID: 1865 (nodepoold)
   CGroup: /system.slice/nodepool.service
           └─1865 /usr/bin/python /usr/bin/nodepoold -d -l /etc/nodepool/logging.conf
chasemp closed this task as Resolved.Dec 1 2016, 2:57 PM
chasemp claimed this task.
chasemp added a subscriber: chasemp.

Somehow nodepool was using an invalid token. We are digging in to see why glance is the place this surfaced but that is the deal.

relevant error from glance logs

> 2016-12-01 14:40:02.334 2223 WARNING keystonemiddleware.auth_token [-] Authorization failed for token

This is the last instance of this error seen. I stopped and then started nodepool and it seems to have refreshed tokens and is currently building its stable of instances.