Per parent task, our cloud network resilience could be improved by adding automated tests & more monitoring.
Idea here: P15659
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | nskaggs | T294853 2021-11-02 Cloud VPS network outage
Resolved | | aborrero | T294955 cloud network: improve automated testing & monitoring
Resolved | | aborrero | T294956 Avoid unnecessary keepalived flap after rebooting servers
Resolved | | ayounsi | T295288 cr-codfw: set up static route for 185.15.57.8/30
Change 736819 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] cloud: introduce network tests
Mentioned in SAL (#wikimedia-cloud) [2021-11-08T10:34:35Z] <arturo> create service account srv-networktests following https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Service_accounts for T294955
Mentioned in SAL (#wikimedia-cloud) [2021-11-08T10:54:38Z] <arturo> [codfw1dev] create service account srv-networktests following https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Service_accounts for T294955
Change 737346 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[labs/private@master] secret: add openstack networktests sshkeys placeholders
Change 737346 merged by Arturo Borrero Gonzalez:
[labs/private@master] secret: add openstack networktests sshkeys placeholders
Change 736819 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] cloud: introduce network tests
Change 737392 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: networktests: correct some problems
Change 737392 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: networktests: correct some problems
Change 737418 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] cloud: networktests: add missing ROUTING_SOURCE_IP envvar
Change 737418 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] cloud: networktests: add missing ROUTING_SOURCE_IP envvar
Change 737613 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] cloud: networktests: runner: use expanded_cmd
Change 737614 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] cloud: networktests: ssh: use -q
Change 737613 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] cloud: networktests: runner: use expanded_cmd
Change 737614 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] cloud: networktests: ssh: use -q
Change 737615 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] cloud: networktests: use -q
Change 737615 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] cloud: networktests: use -q
Change 737620 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] cloud: networktests: fix some testcases
Change 737620 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] cloud: networktests: fix some testcases
Change 737642 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] cloud: networktests: rework some of the raw icmp checks
Change 737642 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] cloud: networktests: rework some of the raw icmp checks
Change 737648 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/dns@master] wikimediacloud.org: add A records for cloudgw2001-dev/2002-dev
Change 737648 merged by Arturo Borrero Gonzalez:
[operations/dns@master] wikimediacloud.org: add A records for cloudgw2001-dev/2002-dev
Change 737667 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/cookbooks@wmcs] wmcs: add openstack network tests cookbook
Change 737741 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: monitor: cmd-checklist-runner: exit with a different return code
Change 737741 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: monitor: cmd-checklist-runner: exit with a different return code
Change 737667 merged by jenkins-bot:
[operations/cookbooks@wmcs] wmcs: add openstack network tests cookbook
Change 738068 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: networktests: add systemd timer job to run the test suite
Change 738070 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: networktests: run as dedicated user
Change 738070 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: networktests: run as dedicated user
Change 738068 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: networktests: add systemd timer job to run the test suite
Change 738180 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: networktests: fix timer job interval specification
Change 738180 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: networktests: fix timer job interval specification
Change 738191 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: networktests: introduce in eqiad1
Change 738191 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: networktests: introduce in eqiad1
Mentioned in SAL (#wikimedia-cloud) [2021-11-11T10:47:09Z] <arturo> add user srv-networktests as project user (T294955)
Mentioned in SAL (#wikimedia-cloud) [2021-11-11T10:50:28Z] <arturo> add user srv-networktests as project user (T294955)
Mentioned in SAL (#wikimedia-cloud) [2021-11-11T10:50:53Z] <arturo> add user srv-networktests as project user (T294955)
Change 738211 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: networktests: update eqiad1 bastion
Change 738211 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: networktests: update eqiad1 bastion
Change 738212 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: networktests: replace curl silent argument
Change 738212 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: networktests: replace curl silent argument
Change 738214 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: networktests: fix toolforge.org IP address
Change 738214 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: networktests: fix toolforge.org IP address
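For readers unfamiliar with the approach, the patches above revolve around a runner that executes a list of network check commands and reports failures through its exit code. The following is only a rough sketch of that general idea, not the actual cmd-checklist-runner from operations/puppet: the `Check` class, the function names, and the exit code values are all assumptions made for illustration.

```python
import subprocess
import sys
from dataclasses import dataclass

# Hypothetical exit codes; the values used by the real runner are not stated in this task.
EXIT_OK = 0
EXIT_CHECKS_FAILED = 1


@dataclass
class Check:
    """One entry of the checklist (hypothetical structure)."""
    name: str
    cmd: str
    expected_rc: int = 0


def run_checks(checks, timeout=10):
    """Run every check through the shell and return the names of the failed ones."""
    failed = []
    for check in checks:
        try:
            proc = subprocess.run(
                check.cmd,
                shell=True,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            ok = proc.returncode == check.expected_rc
        except subprocess.TimeoutExpired:
            # A check that hangs is treated as a failure.
            ok = False
        if not ok:
            failed.append(check.name)
    return failed


def exit_code_for(failed):
    """Map the list of failed checks to the process exit code."""
    return EXIT_CHECKS_FAILED if failed else EXIT_OK


if __name__ == "__main__":
    # Placeholder checks; real ones would be ping/curl/ssh commands.
    checklist = [
        Check("shell available", "exit 0"),
        Check("command expected to fail", "exit 3", expected_rc=3),
    ]
    failed_checks = run_checks(checklist)
    for name in failed_checks:
        print(f"FAILED: {name}", file=sys.stderr)
    sys.exit(exit_code_for(failed_checks))
```

A non-zero exit code lets whatever launches the runner (for example a timer job or a cookbook) detect a failed run without parsing its output.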
Got to a nice stopping point.
TODO: