cloudvirt2003-dev: neutron-linuxbridge-agent and nova-compute issues
Closed, Resolved (Public)

Description

This server lost its linux bridges, and both neutron-linuxbridge-agent and nova-compute fail to operate.
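A quick way to confirm which bridges the kernel currently knows about is to look for interfaces that expose a `bridge` directory under sysfs. The snippet below is only a minimal sketch: the function name `list_linux_bridges` is made up for illustration, and it assumes the standard `/sys/class/net` layout on a Linux host.

```python
from pathlib import Path


def list_linux_bridges(sysfs_net="/sys/class/net"):
    """Return the sorted names of interfaces that the kernel reports as bridges.

    Hypothetical helper: a bridge interface has a `bridge` subdirectory
    under its sysfs entry, so we simply test for that directory.
    """
    root = Path(sysfs_net)
    if not root.is_dir():
        # Not a Linux host, or sysfs is not mounted at the given path.
        return []
    return sorted(p.name for p in root.iterdir() if (p / "bridge").is_dir())


if __name__ == "__main__":
    bridges = list_linux_bridges()
    print(bridges if bridges else "no linux bridges found")
```

An empty result on a hypervisor that should carry neutron bridges would match the symptom described here.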

Creating the bridges is something we do via puppet, and puppet is not failing in any way on the server.

I think I will reimage the server just to make sure our puppet code is sane.

Event Timeline

Phamhi moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.

Script wmf-auto-reimage was launched by aborrero on cumin2001.codfw.wmnet for hosts:

cloudvirt2003-dev.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202103100907_aborrero_10408_cloudvirt2003-dev_codfw_wmnet.log.

Mentioned in SAL (#wikimedia-cloud) [2021-03-10T09:07:54Z] <arturo> [codfw1dev] reimaging cloudvirt2003-dev (T276964)

Completed auto-reimage of hosts:

['cloudvirt2003-dev.codfw.wmnet']

Of which those FAILED:

['cloudvirt2003-dev.codfw.wmnet']

The reimage was fine, but both nova-compute and neutron-linuxbridge-agent present issues when contacting the API services.

Mentioned in SAL (#wikimedia-cloud) [2021-03-10T11:53:55Z] <arturo> [codfw1dev] restart nova-conductor in all 3 cloudcontrol servers for T276964

Mentioned in SAL (#wikimedia-cloud) [2021-03-10T11:56:51Z] <arturo> [codfw1dev] restart rabbitmq-server in all 3 cloudcontrol servers for T276964

aborrero renamed this task from cloudvirt2003-dev: missing neutron bridges to cloudvirt2003-dev: neutron-linuxbridge-agent and nova-compute issues. Mar 10 2021, 11:57 AM

This looks to me like purely a rabbitmq issue on codfw1dev.

aborrero added a subscriber: dcaro.

The server works fine now. I just double-checked.

This was probably fixed by @dcaro when he upgraded the python kombu & eventlet packages on codfw1dev. Thanks!
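For anyone wanting to check which kombu and eventlet versions a given Python environment sees, something along these lines could work. It is a rough sketch, not what was actually run: `installed_version` is a hypothetical helper, and it assumes the packages ship metadata visible to `importlib.metadata` in the interpreter being used.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent.

    Hypothetical helper built on the standard library's importlib.metadata.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("kombu", "eventlet"):
        print(name, installed_version(name) or "not installed")
```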

aborrero claimed this task.