Description
The neutron database for the eqiad1 deployment is running on cloudcontrol1003.wikimedia.org (local mysql daemon). This database needs to be moved to m5-master before eqiad1 goes into full production.
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
cloudvps: eqiad1: move neutron db to m5-master | operations/puppet | production | +3 -1
Status | Assigned | Task
---|---|---
Open | None | T53494 Use Beta cluster as a true canary for code deployments (epic)
Open | None | T87220 Minimize infrastructure differences between Beta Cluster and production
Open | None | T196662 Set up LVS in beta like prod
Resolved | bd808 | T166396 Program 1 Outcome 4: VPS hosting
Resolved | None | T167293 Nova-network to Neutron migration
Resolved | aborrero | T202261 cloudvps: eqiad1: move neutron db to m5-master
Event Timeline
Change 453987 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] cloudvps: eqiad1: move neutron db to m5-master
Change 453987 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] cloudvps: eqiad1: move neutron db to m5-master
Mentioned in SAL (#wikimedia-operations) [2018-08-20T11:01:52Z] <arturo> T202261 disabled puppet in cloudcontrol1003.wikimedia.org, cloudcontrol1004.wikimedia.org, clounet1003.eqiad.wmnet, cloudnet1004.eqiad.wmnet
Created DB grants on m5-master.eqiad.wmnet:
GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'%' IDENTIFIED BY 'xxxxxxxxxx';
And tested connectivity from all 4 affected hosts:
aborrero@cloudcontrol1004:~ $ mysql -h m5-master.eqiad.wmnet neutron -u neutron -e 'SHOW TABLES' -p
Enter password:
+-----------------------------------------+
| Tables_in_neutron                       |
+-----------------------------------------+
[...]
aborrero@cloudcontrol1003:~ 2s 1 $ mysql -h m5-master.eqiad.wmnet neutron -u neutron -e 'SHOW TABLES' -p
Enter password:
+-----------------------------------------+
| Tables_in_neutron                       |
+-----------------------------------------+
[...]
aborrero@cloudnet1003:~ $ mysql -h m5-master.eqiad.wmnet neutron -u neutron -e 'SHOW TABLES' -p
Enter password:
+-----------------------------------------+
| Tables_in_neutron                       |
+-----------------------------------------+
[...]
aborrero@cloudnet1004:~ $ mysql -h m5-master.eqiad.wmnet neutron -u neutron -e 'SHOW TABLES' -p
Enter password:
+-----------------------------------------+
| Tables_in_neutron                       |
+-----------------------------------------+
[...]
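The per-host connectivity test above could be scripted roughly as follows. This is only a sketch under assumptions not stated in this task: `check_neutron_db` and the `REMOTE_SHELL` variable are hypothetical names, SSH access to each host is assumed, and the `neutron` password is assumed to be available on each host through a MySQL option file (for example `~/.my.cnf`) so that `-p` and the interactive prompt are not needed.

```shell
# Sketch: run the same SHOW TABLES test from several hosts and report the result.
# Assumptions (not from the task): ssh access to every host, and the neutron
# DB password stored in a MySQL option file on each host.
# Usage: check_neutron_db DB_HOST HOST [HOST...]
check_neutron_db() {
    db_host=$1
    shift
    failed=0
    for host in "$@"; do
        # REMOTE_SHELL is a hypothetical override (defaults to ssh), handy for dry runs.
        if "${REMOTE_SHELL:-ssh}" "$host" \
            "mysql -h $db_host -u neutron neutron -e 'SHOW TABLES'" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Non-zero return if at least one host could not query the database.
    return "$failed"
}
```

It might then be called as `check_neutron_db m5-master.eqiad.wmnet cloudcontrol1003.wikimedia.org cloudcontrol1004.wikimedia.org cloudnet1003.eqiad.wmnet cloudnet1004.eqiad.wmnet`.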
Mentioned in SAL (#wikimedia-operations) [2018-08-20T11:25:12Z] <arturo> T202261 icinga downtime 1h for cloudcontrol1003.wikimedia.org, cloudcontrol1004.wikimedia.org, clounet1003.eqiad.wmnet, cloudnet1004.eqiad.wmnet previous to patch merge
The neutron-server <--> neutron-XXX-agent connection is sometimes unreliable during the initial synchronization.
I had to restart the agents and the server a couple of times before they could see each other.
The last thing I did was restart the server without restarting the agents, and the state was good shortly after that:
root@cloudcontrol1003:~# neutron agent-list
+--------------------------------------+--------------------+---------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host          | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+---------------+-------------------+-------+----------------+---------------------------+
| 468aef2a-8eb6-4382-abba-bc284efd9fa5 | DHCP agent         | cloudnet1004  | nova              | :-)   | True           | neutron-dhcp-agent        |
| 601bef99-b53c-4e6a-b384-65d1feebedff | Metadata agent     | cloudnet1003  |                   | :-)   | True           | neutron-metadata-agent    |
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | L3 agent           | cloudnet1003  | nova              | :-)   | True           | neutron-l3-agent          |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | L3 agent           | cloudnet1004  | nova              | :-)   | True           | neutron-l3-agent          |
| 9f8833de-11a4-4395-8da5-f57fe8326659 | Linux bridge agent | cloudnet1003  |                   | :-)   | True           | neutron-linuxbridge-agent |
| ad3461d7-b79e-4279-921d-5a476e296767 | Linux bridge agent | cloudnet1004  |                   | :-)   | True           | neutron-linuxbridge-agent |
| b2f9da63-2f16-4aa5-9400-ae708a733f91 | Linux bridge agent | cloudvirt1021 |                   | :-)   | True           | neutron-linuxbridge-agent |
| d475e07d-52b3-476e-9a4f-e63b21e1075e | Metadata agent     | cloudnet1004  |                   | :-)   | True           | neutron-metadata-agent    |
| e382a233-e6a0-422e-9d2e-5651082783fc | Linux bridge agent | cloudvirt1022 |                   | :-)   | True           | neutron-linuxbridge-agent |
| ff2a8228-3748-4588-927b-4b6563da9ca0 | DHCP agent         | cloudnet1003  | nova              | :-)   | True           | neutron-dhcp-agent        |
+--------------------------------------+--------------------+---------------+-------------------+-------+----------------+---------------------------+
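Checking the `alive` column by eye after each restart gets tedious, so a small filter along these lines may help. It is a sketch, not something that was used in this task: `dead_agents` is a hypothetical helper name, it assumes the table layout shown above (seven pipe-separated columns with `alive` as the fifth), and it treats any value other than `:-)` as "not alive".

```shell
# Sketch: list agents whose "alive" column is not ":-)".
# Reads the table printed by `neutron agent-list` on stdin.
# Prints "<host> <binary>" per unhealthy agent; exits 1 if any was found.
dead_agents() {
    awk -F'|' '
        # Data rows start with "|" and have a UUID-like id in the first column;
        # this skips the +---+ borders and the header row.
        /^\|/ && $2 ~ /[0-9a-f]+-[0-9a-f]+/ {
            host = $4;   gsub(/ /, "", host)
            alive = $6;  gsub(/ /, "", alive)
            binary = $8; gsub(/ /, "", binary)
            if (alive != ":-)") {
                print host, binary
                bad = 1
            }
        }
        END { exit bad + 0 }
    '
}
```

For example, `neutron agent-list | dead_agents` would print nothing and return 0 for the output above, since every agent reports `:-)`.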
Mentioned in SAL (#wikimedia-operations) [2018-08-20T12:01:31Z] <arturo> T202261 extend icinga downtime 1D for cloudcontrol1003.wikimedia.org, cloudcontrol1004.wikimedia.org, clounet1003.eqiad.wmnet, cloudnet1004.eqiad.wmnet neutron not properly syncing with agents