cloudcontrol200[13]-dev linux bridge agent errors
Closed, Resolved · Public

Description

The cloudcontrol hosts in codfw are tripping the Icinga "check systemd state" check and showing up on the main alerts status page.

cloudcontrol2003-dev:~$ sudo systemctl --failed
  UNIT                              LOAD   ACTIVE SUB    DESCRIPTION
● neutron-linuxbridge-agent.service loaded failed failed Openstack Neutron LinuxBridge Plugin Agent
Jan 02 14:15:18 cloudcontrol2001-dev systemd[1]: [/lib/systemd/system/neutron-linuxbridge-agent.service:12] Runtime directory is not valid, ignoring assignment: neutron lock/neutron
Jan 02 14:15:18 cloudcontrol2001-dev systemd[1]: [/lib/systemd/system/neutron-linuxbridge-agent.service:13] Unknown lvalue 'CacheDirectory' in section 'Service'

The unit configuration for that service looks a little strange:

cloudcontrol2001-dev:~$ sudo systemctl cat neutron-linuxbridge-agent
# /lib/systemd/system/neutron-linuxbridge-agent.service
[Unit]
Description=Openstack Neutron LinuxBridge Plugin Agent
After=mysql.service postgresql.service rabbitmq-server.service keystone.service

[Service]
User=neutron
Group=neutron
Type=simple
WorkingDirectory=~
RuntimeDirectory=neutron lock/neutron
CacheDirectory=neutron
ExecStart=/etc/init.d/neutron-linuxbridge-agent systemd-start
Restart=on-failure
LimitNOFILE=65535
TimeoutStopSec=15

[Install]
WantedBy=multi-user.target
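
One possible reading of the two log lines above, offered as a sketch rather than a confirmed diagnosis: CacheDirectory= was introduced in systemd 235, so an older systemd would report it as an unknown lvalue, and the same older parser evidently rejects the "lock/neutron" entry in RuntimeDirectory=. The helper name `systemd_supports_cache_directory` below is hypothetical, and the systemd version actually running on these hosts is not stated in this task.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given systemd version number is
# new enough to know the CacheDirectory= setting (introduced in 235).
systemd_supports_cache_directory() {
    local version="$1"
    [ "$version" -ge 235 ] 2>/dev/null
}

# Only query the local systemd if systemctl is actually installed.
if command -v systemctl >/dev/null 2>&1; then
    # The first line of `systemctl --version` looks like "systemd <number> ...".
    installed=$(systemctl --version | awk 'NR==1 {print $2}')
    if systemd_supports_cache_directory "$installed"; then
        echo "systemd $installed understands CacheDirectory="
    else
        echo "systemd $installed predates CacheDirectory="
    fi
fi
```

If the second message is printed, the unit file was most likely built for a newer release than the one installed, which would fit a package that was not meant to be on these hosts.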

I'm not sure why that service is on these systems; it might be related to the upgrade work in T241348.

Event Timeline

JHedden created this task. · Fri, Jan 10, 8:43 PM
Restricted Application added a subscriber: Aklapper. · Fri, Jan 10, 8:43 PM

Mentioned in SAL (#wikimedia-operations) [2020-01-10T20:45:18Z] <jeh> cloudcontrol200[13]-dev schedule downtime until Feb 28 2020 on systemd service check T242462

JHedden triaged this task as Medium priority. · Fri, Jan 10, 8:47 PM
JHedden updated the task description.
Andrew closed this task as Resolved. · Tue, Jan 14, 3:49 AM
Andrew claimed this task.
Andrew added a subscriber: Andrew.

I removed that package, and Puppet didn't reinstall it, so it was probably pulled in by an overly broad apt-get line. I also rebooted one of the cloudcontrol hosts for good measure, and everything looks fine now.
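
For anyone repeating this cleanup, a minimal sketch of how the owning package could be identified before removing it. The exact commands used here were not recorded in this task, so the steps below are assumptions: the helper name `owning_package` is hypothetical, and the removal and `systemctl reset-failed` lines are left commented out for review.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the package name from one line of
# `dpkg -S` output, which has the form "package: /path" or
# "package:arch: /path".
owning_package() {
    local line="$1"
    # Keep only the text before the first ": ".
    local pkg="${line%%: *}"
    # Drop an optional ":arch" qualifier.
    printf '%s\n' "${pkg%%:*}"
}

unit_file=/lib/systemd/system/neutron-linuxbridge-agent.service
if command -v dpkg >/dev/null 2>&1 && [ -e "$unit_file" ]; then
    pkg=$(owning_package "$(dpkg -S "$unit_file" | head -n 1)")
    echo "unit file is shipped by: $pkg"
    # Review before running; these steps are intentionally not executed.
    # sudo apt-get remove "$pkg"
    # sudo systemctl reset-failed neutron-linuxbridge-agent.service
fi
```

Running Puppet again afterwards and re-checking `systemctl --failed` would show whether the package comes back, which is the check described in the comment above.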