- Add TLS support to the deployment chart
- Enable TLS on k8s in production
- Add additional LVS endpoint configuration
- Switch services to use the TLS LVS
- Remove non-TLS LVS endpoint configuration
- Remove the non-TLS k8s service
- Remove proton VMs
- Remove all proton puppet configuration related to the old, non-k8s infra
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | JMeybohm | T235411 Add TLS termination to services running on kubernetes
Resolved | | JMeybohm | T255877 Move proton to use TLS only
Event Timeline
Change 607536 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] proton: Switch restbase production to TLS
Change 607536 abandoned by Alexandros Kosiaris:
[operations/puppet@production] proton: Switch restbase production to TLS
Reason:
Done differently in https://gerrit.wikimedia.org/r/c/operations/puppet/+/610720
Change 610789 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] proton: Set LVS level OpenAPI checks on TLS
Change 610789 merged by Alexandros Kosiaris:
[operations/puppet@production] proton: Set LVS level OpenAPI checks on TLS
Change 610855 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] proton: Amend prometheus-statsd config
Change 610855 merged by jenkins-bot:
[operations/deployment-charts@master] proton: Amend prometheus-statsd config
Change 627541 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/puppet@production] lvs: Remove proton non-TLS endpoint from LVS 1/2
Change 627542 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/puppet@production] lvs: Remove proton non-TLS endpoint from LVS 2/2
Change 627857 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] proton-http: stop monitoring the endpoint
Change 627858 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] proton: remove non-https endpoint
Change 627859 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] proton: remove conftool-data
Change 627860 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] proton: remove the ganeti VMs from puppet
Change 627861 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] proton: remove all puppet code, other references to the non-k8s service
Change 627857 merged by Giuseppe Lavagetto:
[operations/puppet@production] proton-http: stop monitoring the endpoint
Change 627541 abandoned by JMeybohm:
[operations/puppet@production] lvs: Remove proton non-TLS endpoint from LVS 1/2
Reason:
Done with https://gerrit.wikimedia.org/r/c/operations/puppet/+/627857
Change 627858 merged by JMeybohm:
[operations/puppet@production] proton: remove non-https endpoint
Mentioned in SAL (#wikimedia-operations) [2020-09-22T14:05:57Z] <jayme> running puppet on lvs servers - T255868 T255877
Mentioned in SAL (#wikimedia-operations) [2020-09-22T14:09:15Z] <jayme> restarting pybal on lvs1016.eqiad.wmnet,lvs2010.codfw.wmnet - T255868 T255877
Mentioned in SAL (#wikimedia-operations) [2020-09-22T14:11:21Z] <jayme> restarting pybal on lvs1015.eqiad.wmnet,lvs2009.codfw.wmnet - T255868 T255877
Mentioned in SAL (#wikimedia-operations) [2020-09-22T14:12:01Z] <jayme> running ipvsadm -D -t 10.2.2.19:1970; ipvsadm -D -t 10.2.2.21:24766 on lvs1016.eqiad.wmnet,lvs1015.eqiad.wmnet - T255868 T255877
Mentioned in SAL (#wikimedia-operations) [2020-09-22T14:12:40Z] <jayme> running ipvsadm -D -t 10.2.1.19:1970; ipvsadm -D -t 10.2.1.21:24766 on lvs2010.codfw.wmnet,lvs2009.codfw.wmnet - T255868 T255877
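The two SAL entries above each run a pair of `ipvsadm -D -t <address>:<port>` commands, which delete a TCP virtual service from the kernel IPVS table. As a rough illustration only, the Python sketch below rebuilds those command strings from the addresses and ports logged above. The function name `ipvsadm_delete_commands` and the `EQIAD`/`CODFW` lists are hypothetical helpers introduced here, not part of any Wikimedia tooling, and the sketch only formats strings; it does not execute anything on a host.

```python
from typing import Iterable, Tuple


def ipvsadm_delete_commands(services: Iterable[Tuple[str, int]]) -> str:
    """Build a ';'-separated string of `ipvsadm -D -t` commands.

    Each (ip, port) pair names one TCP virtual service to delete.
    This only formats the commands; nothing is run.
    """
    return "; ".join(f"ipvsadm -D -t {ip}:{port}" for ip, port in services)


# Virtual services copied from the SAL entries above (hypothetical variable names).
EQIAD = [("10.2.2.19", 1970), ("10.2.2.21", 24766)]
CODFW = [("10.2.1.19", 1970), ("10.2.1.21", 24766)]

if __name__ == "__main__":
    print(ipvsadm_delete_commands(EQIAD))
    print(ipvsadm_delete_commands(CODFW))
```

Generating the string first makes it easy to review the exact commands before pasting them into whatever remote-execution tool is in use.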
Change 627542 abandoned by JMeybohm:
[operations/puppet@production] lvs: Remove proton non-TLS endpoint from LVS 2/2
Reason:
done with https://gerrit.wikimedia.org/r/c/operations/puppet/+/627858
Change 627859 merged by Alexandros Kosiaris:
[operations/puppet@production] proton: remove conftool-data
Change 627860 merged by Alexandros Kosiaris:
[operations/puppet@production] proton: remove the ganeti VMs from puppet
cookbooks.sre.hosts.decommission executed by akosiaris@cumin1001 for hosts: proton1001.eqiad.wmnet
- proton1001.eqiad.wmnet (PASS)
- Downtimed host on Icinga
- Found Ganeti VM
- VM shutdown
- Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
- Removed from DebMonitor
- Removed from Puppet master and PuppetDB
- VM removed
- Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
- COMMON_STEPS (FAIL)
- Failed to run the sre.dns.netbox cookbook: Cumin execution failed (exit_code=2)
- Not all affected DC(s) have been migrated to automatic DNS, a manual patch to the operations/dns repository is required
ERROR: some step on some host failed, check the bolded items above
Change 627861 merged by Alexandros Kosiaris:
[operations/puppet@production] proton: remove all puppet code, other references to the non-k8s service
cookbooks.sre.hosts.decommission executed by akosiaris@cumin1001 for hosts: proton1002.eqiad.wmnet
- proton1002.eqiad.wmnet (PASS)
- Downtimed host on Icinga
- Found Ganeti VM
- VM shutdown
- Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
- Removed from DebMonitor
- Removed from Puppet master and PuppetDB
- VM removed
- Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
- COMMON_STEPS (FAIL)
- Failed to run the sre.dns.netbox cookbook: Cumin execution failed (exit_code=2)
- Not all affected DC(s) have been migrated to automatic DNS, a manual patch to the operations/dns repository is required
ERROR: some step on some host failed, check the bolded items above
Change 631398 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/dns@master] Remove proton{1,2}00{1,2}
cookbooks.sre.hosts.decommission executed by akosiaris@cumin1001 for hosts: proton2001.codfw.wmnet
- proton2001.codfw.wmnet (PASS)
- Downtimed host on Icinga
- Found Ganeti VM
- VM shutdown
- Started forced sync of VMs in Ganeti cluster ganeti01.svc.codfw.wmnet to Netbox
- Removed from DebMonitor
- Removed from Puppet master and PuppetDB
- VM removed
- Started forced sync of VMs in Ganeti cluster ganeti01.svc.codfw.wmnet to Netbox
- COMMON_STEPS (WARN)
- Not all affected DC(s) have been migrated to automatic DNS, a manual patch to the operations/dns repository is required
cookbooks.sre.hosts.decommission executed by akosiaris@cumin1001 for hosts: proton2002.codfw.wmnet
- proton2002.codfw.wmnet (PASS)
- Downtimed host on Icinga
- Found Ganeti VM
- VM shutdown
- Started forced sync of VMs in Ganeti cluster ganeti01.svc.codfw.wmnet to Netbox
- Removed from DebMonitor
- Removed from Puppet master and PuppetDB
- VM removed
- Started forced sync of VMs in Ganeti cluster ganeti01.svc.codfw.wmnet to Netbox
- COMMON_STEPS (WARN)
- Not all affected DC(s) have been migrated to automatic DNS, a manual patch to the operations/dns repository is required
Change 631398 merged by Alexandros Kosiaris:
[operations/dns@master] Remove proton{1,2}00{1,2}
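The DNS change title `proton{1,2}00{1,2}` is shell-style brace shorthand for the four decommissioned VMs listed in the cookbook runs above. A minimal Python sketch of that expansion follows, assuming only flat, non-nested comma lists; `expand_braces` is a hypothetical helper written for this note, not an existing tool.

```python
import itertools
import re
from typing import List


def expand_braces(pattern: str) -> List[str]:
    """Expand flat {a,b} groups in pattern, in the same order a shell would.

    Nested braces and ranges such as {1..3} are not handled.
    """
    # re.split with a capturing group alternates literal text (even indices)
    # with the contents of each brace group (odd indices).
    parts = re.split(r"\{([^{}]*)\}", pattern)
    choices = [
        part.split(",") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    for host in expand_braces("proton{1,2}00{1,2}"):
        print(host)
```

The result covers proton1001 and proton1002 (eqiad) and proton2001 and proton2002 (codfw), matching the hosts named in the decommission logs.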