Nodepool no longer refreshes snapshot images automatically
Closed, Resolved, Public

Description

Nodepool no longer refreshes the snapshot images from the base image 'ci-jessie-wikimedia'. The last update was on 2015-11-23 14:16:10.

2015-12-01 14:14:00,030 INFO nodepool.SnapshotImageUpdater: Creating image id: 379 with hostname ci-jessie-wikimedia-1448979240 for ci-jessie-wikimedia in wmflabs-eqiad
2015-12-01 14:15:20,680 ERROR nodepool.SnapshotImageUpdater: Exception updating image ci-jessie-wikimedia in wmflabs-eqiad:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nodepool/nodepool.py", line 900, in _run
    self.updateImage(session)
  File "/usr/lib/python2.7/dist-packages/nodepool/nodepool.py", line 1003, in updateImage
    image_id=image_id, config_drive=self.image.config_drive)
  File "/usr/lib/python2.7/dist-packages/nodepool/provider_manager.py", line 398, in createServer
    return self.submitTask(CreateServerTask(**create_args))
  File "/usr/lib/python2.7/dist-packages/nodepool/task_manager.py", line 119, in submitTask
    return task.wait()
  File "/usr/lib/python2.7/dist-packages/nodepool/task_manager.py", line 57, in run
    self.done(self.main(client))
  File "/usr/lib/python2.7/dist-packages/nodepool/provider_manager.py", line 116, in main
    server = client.servers.create(**self.args)
  File "/usr/lib/python2.7/dist-packages/novaclient/v2/servers.py", line 900, in create
    **boot_kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/v2/servers.py", line 523, in _boot
    return_raw=return_raw, **kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/base.py", line 161, in _create
    _resp, body = self.api.client.post(url, body=body)
  File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 453, in post
    return self._cs_request(url, 'POST', **kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 428, in _cs_request
    resp, body = self._time_request(url, method, **kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 397, in _time_request
    resp, body = self.request(url, method, **kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 391, in request
    raise exceptions.from_response(resp, body, url, method)
BadRequest: Can not find requested image (HTTP 400)
2015-12-01 14:15:20,696 INFO nodepool.NodePool: Deleted image id: 379

The Nodepool provider wmflabs-eqiad has:

providers:
  - name: wmflabs-eqiad
    service-type: 'compute'
    service-name: 'nova'
    project-id: 'contintcloud'
    region-name: 'eqiad'
    username: 'nodepoolmanager'
    ...
    images:
      - name: ci-jessie-wikimedia
        # RelEng manually builds and uploads the image to Glance
        base-image: ci-jessie-wikimedia

The image does show up:

hashar@labnodepool1001:~$ become-nodepool 
nodepool@labnodepool1001:~$ openstack image list --private
+--------------------------------------+--------------------------------+
| ID                                   | Name                           |
+--------------------------------------+--------------------------------+
| 535da6fd-3d87-49b7-8987-044002770dba | ci-jessie-wikimedia-1448296278 |
| 931a1851-5773-4be4-aa5e-c8d01cdb8b52 | ci-jessie-wikimedia            |
| 02e5bace-3da2-4d98-8e4b-f82bd0c1873e | ci-jessie-wikimedia-1448294320 |
+--------------------------------------+--------------------------------+
nodepool@labnodepool1001:~$
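
The numeric suffix on the snapshot names looks like a Unix timestamp: the failed attempt logged at 2015-12-01 14:14:00 used the hostname ci-jessie-wikimedia-1448979240, which decodes to that same instant if the log is in UTC. A minimal sketch for decoding the suffixes, assuming they are epoch seconds (the helper name `snapshot_created_at` is made up for illustration and is not part of Nodepool):

```python
from datetime import datetime, timezone


def snapshot_created_at(name):
    """Return the UTC datetime encoded in a snapshot name's numeric suffix.

    Assumption: the suffix after the last '-' is a Unix timestamp in seconds.
    """
    base, _, suffix = name.rpartition("-")
    if not base or not suffix.isdigit():
        raise ValueError("no timestamp suffix in %r" % name)
    return datetime.fromtimestamp(int(suffix), tz=timezone.utc)


# The two snapshots from the listing above.
for name in ("ci-jessie-wikimedia-1448296278",
             "ci-jessie-wikimedia-1448294320"):
    print(name, snapshot_created_at(name).isoformat())
# ci-jessie-wikimedia-1448296278 2015-11-23T16:31:18+00:00
# ci-jessie-wikimedia-1448294320 2015-11-23T15:58:40+00:00
```

The base image name ci-jessie-wikimedia has no numeric suffix, so the helper raises ValueError for it rather than guessing.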

Confirming there is no trailing space in the image name:

$ openstack image list --private -f yaml
- {ID: !!python/unicode '535da6fd-3d87-49b7-8987-044002770dba', Name: !!python/unicode 'ci-jessie-wikimedia-1448296278'}
- {ID: !!python/unicode '931a1851-5773-4be4-aa5e-c8d01cdb8b52', Name: !!python/unicode 'ci-jessie-wikimedia'}
- {ID: !!python/unicode '02e5bace-3da2-4d98-8e4b-f82bd0c1873e', Name: !!python/unicode 'ci-jessie-wikimedia-1448294320'}
$ nova image-show  ci-jessie-wikimedia
+----------------------+--------------------------------------+
| Property             | Value                                |
+----------------------+--------------------------------------+
| OS-EXT-IMG-SIZE:size | 1126485504                           |
| created              | 2015-11-23T16:30:43Z                 |
| id                   | 931a1851-5773-4be4-aa5e-c8d01cdb8b52 |
| metadata show        | true                                 |
| minDisk              | 0                                    |
| minRam               | 0                                    |
| name                 | ci-jessie-wikimedia                  |
| progress             | 100                                  |
| status               | ACTIVE                               |
| updated              | 2015-11-23T16:30:53Z                 |
+----------------------+--------------------------------------+
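
The trailing-space check above was done by eye on the YAML output, since the table format pads every cell and would hide stray whitespace. A rough sketch of the same check in Python, run against the pasted listing rather than a live API. The regular expression and both function names are assumptions made for this example:

```python
import re

# Copied from the `openstack image list --private -f yaml` output above.
LISTING = """\
- {ID: !!python/unicode '535da6fd-3d87-49b7-8987-044002770dba', Name: !!python/unicode 'ci-jessie-wikimedia-1448296278'}
- {ID: !!python/unicode '931a1851-5773-4be4-aa5e-c8d01cdb8b52', Name: !!python/unicode 'ci-jessie-wikimedia'}
- {ID: !!python/unicode '02e5bace-3da2-4d98-8e4b-f82bd0c1873e', Name: !!python/unicode 'ci-jessie-wikimedia-1448294320'}
"""

# Capture whatever sits between the quotes after "Name:", whitespace included.
NAME_RE = re.compile(r"Name: !!python/unicode '([^']*)'")


def names_with_stray_whitespace(listing):
    """Return image names that have leading or trailing whitespace."""
    return [n for n in NAME_RE.findall(listing) if n != n.strip()]


def count_exact_matches(listing, wanted):
    """Count images whose name is exactly `wanted`."""
    return NAME_RE.findall(listing).count(wanted)


print(names_with_stray_whitespace(LISTING))                 # []
print(count_exact_matches(LISTING, "ci-jessie-wikimedia"))  # 1
```

Both results agree with the manual check: no name carries stray whitespace, and exactly one image is named ci-jessie-wikimedia.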

Seems I screwed up something last time I created the image? :(

Event Timeline

hashar created this task. (Dec 2 2015, 10:52 AM)
hashar raised the priority of this task to Needs Triage.
hashar updated the task description.
hashar added a subscriber: hashar.
Restricted Application added subscribers: StudiesWorld, Aklapper. (Dec 2 2015, 10:52 AM)
hashar updated the task description. (Dec 2 2015, 10:54 AM)
hashar set Security to None.

We can still manually update the snapshot though:

nodepool@labnodepool1001:~$ nodepool image-update wmflabs-eqiad ci-jessie-wikimedia
2015-12-02 10:55:01,622 INFO nodepool.SnapshotImageUpdater: Creating image id: 380 with hostname ci-jessie-wikimedia-1449053701 for ci-jessie-wikimedia in wmflabs-eqiad
...
2015-12-02 10:58:25,786 INFO nodepool.SnapshotImageUpdater: Image ci-jessie-wikimedia-1449053701 in wmflabs-eqiad is ready
$ openstack image list --private
+--------------------------------------+--------------------------------+
| ID                                   | Name                           |
+--------------------------------------+--------------------------------+
| 32860eec-860c-4b01-b6a3-49c5034f527b | ci-jessie-wikimedia-1449053701 |  <-- new snapshot
| 535da6fd-3d87-49b7-8987-044002770dba | ci-jessie-wikimedia-1448296278 |
| 931a1851-5773-4be4-aa5e-c8d01cdb8b52 | ci-jessie-wikimedia            |
+--------------------------------------+--------------------------------+

Restarted nodepool process on labnodepool1001.eqiad.wmnet

hashar renamed this task from Nodepool to Nodepool no longer refreshes snapshot images automatically. (Dec 10 2015, 9:46 AM)
hashar closed this task as Resolved. (Dec 10 2015, 9:50 AM)
hashar claimed this task.

It created a snapshot properly on Dec 9th at 14:14 UTC and deleted the old one (48 hours old):

2015-12-09 14:14:00,027 INFO nodepool.SnapshotImageUpdater: Creating image id: 388 with hostname ci-jessie-wikimedia-1449670440 for ci-jessie-wikimedia in wmflabs-eqiad

2015-12-09 14:16:51,962 INFO nodepool.SnapshotImageUpdater: Image ci-jessie-wikimedia-1449670440 in wmflabs-eqiad is ready
2015-12-09 14:17:00,041 INFO nodepool.NodePool: Deleting image id: 386 which is 47.9952888511 hours old
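
Working backwards from the logged age gives a rough idea of when image 386 was recorded. This is a sketch only: it assumes the age in the "Deleting image" line is measured up to the timestamp of that log line, and the function name is made up for this example.

```python
from datetime import datetime, timedelta


def recorded_at(logged_at, age_hours):
    """Subtract the logged age from the log timestamp."""
    return logged_at - timedelta(hours=age_hours)


# Timestamp and age copied from the "Deleting image id: 386" log line.
deleted_at = datetime.strptime("2015-12-09 14:17:00,041",
                               "%Y-%m-%d %H:%M:%S,%f")
print(recorded_at(deleted_at, 47.9952888511).strftime("%Y-%m-%d %H:%M:%S"))
# 2015-12-07 14:17:17
```

Under that assumption image 386 dates from 2015-12-07 at about 14:17, which would be consistent with an automatic refresh having also completed two days before this run.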

Not sure what happened.