
eqiad1 VMs can no longer contact the nova metadata service
Closed, Resolved · Public

Description

For example, here's a VM asking the metadata service about its hostname:

$ curl http://169.254.169.254/latest/meta-data/hostname
<html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>

I'm guessing this has to do with recent firewall changes, for example:

https://gerrit.wikimedia.org/r/c/operations/puppet/+/868070

or

https://gerrit.wikimedia.org/r/c/operations/puppet/+/883571

This means we can't create new working VMs at all. This is likely also breaking Magnum and Heat. Trove seems to still work, I think because it uses config-drive rather than the metadata service.
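For anyone who wants to re-check this from inside a VM, here is a minimal sketch of a probe, assuming only that curl is installed. The function names `classify_status` and `check_metadata`, the `METADATA_URL` override, the 10-second timeout, and the verdict strings are all made up for illustration; none of them are part of nova or any existing tooling.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status code returned by the metadata
# service to a short verdict string. The verdict names are illustrative.
classify_status() {
    case "$1" in
        200) echo "ok" ;;
        504) echo "gateway-timeout" ;;
        000) echo "unreachable" ;;   # curl reports 000 when it got no HTTP response
        *)   echo "unexpected:$1" ;;
    esac
}

# Hypothetical helper: probe the metadata service and print a verdict.
# METADATA_URL is an assumed override; it defaults to the endpoint
# queried in this task.
check_metadata() {
    url="${METADATA_URL:-http://169.254.169.254/latest/meta-data/hostname}"
    # -s: silent, -o /dev/null: discard the body, -m 10: give up after 10s,
    # -w '%{http_code}': print only the HTTP status code.
    code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$url") || true
    classify_status "$code"
}
```

Calling `check_metadata` on an affected VM should print `gateway-timeout` for the 504 shown above, and `ok` once the service answers with a 200.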

Event Timeline

Restricted Application added a subscriber: Aklapper.
Andrew triaged this task as Unbreak Now! priority. Jan 26 2023, 12:58 AM
Andrew updated the task description.

Change 883696 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] fix nova-metadata firewall rules

https://gerrit.wikimedia.org/r/883696

Change 883696 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] fix nova-metadata firewall rules

https://gerrit.wikimedia.org/r/883696

aborrero closed this task as Resolved. (Edited) Jan 26 2023, 9:11 AM

Works now, thanks @taavi for the quick patch.

aborrero@tools-sgebastion-11:~$ curl http://169.254.169.254/latest/meta-data/hostname
tools-sgebastion-11.novalocal

Mentioned in SAL (#wikimedia-cloud) [2023-01-26T09:26:23Z] <taavi> deleting instances leaked by T327980