To avoid a monitoring outage during the maintenance in the parent task, we need to fail services on alert1001 over to alert2001 before the maintenance begins.
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T253824 planned upstream deprecation of the ssh-rsa signing algorithm (RSA with SHA-1)
Resolved | | ayounsi | T254013 all network devices must run OpenSSH >= 7.2p1 but != 7.4p1
Resolved | | ayounsi | T317175 Junos: resolve DNS through mgmt_junos
Resolved | | ayounsi | T327862 Use mgmt_junos on all network devices
Restricted Task | | |
Open | | None | T316539 Upgrade network devices to Junos 20+
Resolved | | ayounsi | T327248 eqiad/codfw virtual-chassis upgrades
Resolved | | Clement_Goubert | T327920 March 2023 Datacenter Switchover
Resolved | | ayounsi | T331882 eqiad row C switches upgrade
Resolved | | herron | T333478 failover alert1001 to alert2001
Resolved | | herron | T333837 failover alert2001 to alert1001
Declined | | herron | T333838 alerting_host: Reduced availability for job icinga-am after failover event
Open | | herron | T333855 vopsbot needed manual restart after alerting hosts failover
Event Timeline
Change 899629 had a related patch set uploaded (by Herron; author: Herron):
[operations/puppet@production] alerting_host: failover icinga and alertmanger from eqiad to codfw
Change 904614 had a related patch set uploaded (by Herron; author: Herron):
[operations/dns@master] dns: repoint alert host services to alert2001
Change 899629 merged by Herron:
[operations/puppet@production] alerting_host: failover icinga and alertmanger from eqiad to codfw
Change 904614 merged by Herron:
[operations/dns@master] dns: repoint alert host services to alert2001