Replace pybal with liberica/ipvs on the PoPs.
Current status:
- esams
- ulsfo
- eqsin
- drmrs
- magru
| Status | Assigned | Task |
|---|---|---|
| Open | Vgutierrez | T332027 Replace current L4LB with Katran-based alternative |
| Resolved | Vgutierrez | T384477 Replace pybal with liberica on the PoPs |
| Resolved | Vgutierrez | T385001 missing ipip-multiqueue-optimizer prometheus metrics for liberica cluster |
Mentioned in SAL (#wikimedia-operations) [2025-03-04T14:16:48Z] <vgutierrez> depooling lvs5004 before reimaging - T384477
Icinga downtime and Alertmanager silence (ID=3c84753d-9da7-4512-8291-9b672fc8b298) set by vgutierrez@cumin1002 for 0:30:00 on 1 host(s) and their services with reason: depooled before reimage
lvs5004.eqsin.wmnet
Change #1124407 merged by Vgutierrez:
[operations/puppet@production] hiera,site: Reimage lvs5004 as liberica
Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs5004.eqsin.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs5004.eqsin.wmnet with OS bookworm completed:
Change #1124459 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] hiera: Restore lvs5004 BGP priority
Change #1124459 merged by Vgutierrez:
[operations/puppet@production] hiera: Restore lvs5004 BGP priority
Mentioned in SAL (#wikimedia-operations) [2025-03-04T15:41:55Z] <vgutierrez> repooling lvs5004 running liberica - T384477
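For reference, the time lvs5004 spent depooled can be derived from the two SAL timestamps above. This is a minimal Python sketch, assuming the depool and repool log entries bound the window; the helper name `sal_window` is hypothetical and not part of any Wikimedia tooling.

```python
from datetime import datetime, timedelta

# Timestamp format used by the SAL entries quoted in this task.
SAL_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def sal_window(start: str, end: str) -> timedelta:
    """Return the elapsed time between two SAL timestamps (hypothetical helper)."""
    return datetime.strptime(end, SAL_FORMAT) - datetime.strptime(start, SAL_FORMAT)


# Depool and repool timestamps for lvs5004, copied from the log entries above.
depooled_for = sal_window("2025-03-04T14:16:48Z", "2025-03-04T15:41:55Z")
print(depooled_for)  # 1:25:07
```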
Change #1125162 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] cumin: Remove lvs-eqsin alias
Change #1125472 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] site,hiera: Reimage lvs6003 as liberica
Change #1125162 merged by Vgutierrez:
[operations/puppet@production] cumin: Remove lvs-eqsin alias
Mentioned in SAL (#wikimedia-operations) [2025-03-12T11:18:27Z] <vgutierrez> reimage lvs6003 as a liberica instance - T384477
Change #1125472 merged by Vgutierrez:
[operations/puppet@production] site,hiera: Reimage lvs6003 as liberica
Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs6003.drmrs.wmnet with OS bookworm
Change #1126972 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] hiera: Fix NIC names for liberica@drmrs
Change #1126972 merged by Vgutierrez:
[operations/puppet@production] hiera: Fix NIC names for liberica@drmrs
Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs6003.drmrs.wmnet with OS bookworm completed:
Change #1126974 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] site,hiera: Reimage lvs6002 as liberica
Mentioned in SAL (#wikimedia-operations) [2025-03-12T14:26:08Z] <vgutierrez> depooling lvs6002 before getting reimaged - T384477
Icinga downtime and Alertmanager silence (ID=6160b7b2-7281-4c01-a4ad-0c0ebed8103d) set by vgutierrez@cumin1002 for 0:30:00 on 1 host(s) and their services with reason: depooled before reimage
lvs6002.drmrs.wmnet
Change #1126974 merged by Vgutierrez:
[operations/puppet@production] site,hiera: Reimage lvs6002 as liberica
Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs6002.drmrs.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs6002.drmrs.wmnet with OS bookworm completed:
Change #1127053 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] hiera: Restore lvs6002 BGP priority
Change #1127053 merged by Vgutierrez:
[operations/puppet@production] hiera: Restore lvs6002 BGP priority
Change #1127062 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] site,hiera: Reimage lvs6001 as liberica
Mentioned in SAL (#wikimedia-operations) [2025-03-12T16:00:32Z] <vgutierrez@cumin1002> START - Cookbook sre.loadbalancer.admin config_reloading P{lvs6002.drmrs.wmnet} and A:liberica (T384477)
Mentioned in SAL (#wikimedia-operations) [2025-03-12T16:00:50Z] <vgutierrez@cumin1002> END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs6002.drmrs.wmnet} and A:liberica (T384477)
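The START/END pair above can be matched up mechanically to see how long the config reload took. The following Python sketch assumes SAL entries keep the shape shown in this task; the regular expression and the `parse_entry` helper are illustrative assumptions, not an existing parser.

```python
import re
from datetime import datetime

# Matches "[timestamp] <user> START|END ... Cookbook <name>" as seen in this task.
SAL_ENTRY = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+<(?P<who>[^>]+)>\s+(?P<phase>START|END)\b.*?"
    r"Cookbook\s+(?P<cookbook>\S+)"
)


def parse_entry(line: str):
    """Extract timestamp, phase and cookbook name from one SAL line (hypothetical helper)."""
    match = SAL_ENTRY.search(line)
    if match is None:
        return None
    return {
        "ts": datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%SZ"),
        "phase": match["phase"],
        "cookbook": match["cookbook"],
    }


start = parse_entry(
    "Mentioned in SAL (#wikimedia-operations) [2025-03-12T16:00:32Z] "
    "<vgutierrez@cumin1002> START - Cookbook sre.loadbalancer.admin "
    "config_reloading P{lvs6002.drmrs.wmnet} and A:liberica (T384477)"
)
end = parse_entry(
    "Mentioned in SAL (#wikimedia-operations) [2025-03-12T16:00:50Z] "
    "<vgutierrez@cumin1002> END (PASS) - Cookbook sre.loadbalancer.admin "
    "(exit_code=0) config_reloading P{lvs6002.drmrs.wmnet} and A:liberica (T384477)"
)
duration_seconds = (end["ts"] - start["ts"]).total_seconds()
print(start["cookbook"], duration_seconds)  # sre.loadbalancer.admin 18.0
```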
Mentioned in SAL (#wikimedia-operations) [2025-03-13T07:41:51Z] <vgutierrez> depool lvs6001 before being reimaged - T384477
Icinga downtime and Alertmanager silence (ID=2d81a5cc-8423-4910-ad45-d18cdfacb12e) set by vgutierrez@cumin1002 for 0:30:00 on 1 host(s) and their services with reason: depooled before reimage
lvs6001.drmrs.wmnet
Change #1127062 merged by Vgutierrez:
[operations/puppet@production] site,hiera: Reimage lvs6001 as liberica
Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs6001.drmrs.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs6001.drmrs.wmnet with OS bookworm executed with errors:
Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs6001.drmrs.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs6001.drmrs.wmnet with OS bookworm completed:
Change #1127464 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] hiera: Restore lvs6001 BGP priority
Change #1127464 merged by Vgutierrez:
[operations/puppet@production] hiera: Restore lvs6001 BGP priority
Mentioned in SAL (#wikimedia-operations) [2025-03-13T09:37:20Z] <vgutierrez@cumin1002> START - Cookbook sre.loadbalancer.admin config_reloading P{lvs6001.drmrs.wmnet} and A:liberica (T384477)
Mentioned in SAL (#wikimedia-operations) [2025-03-13T09:37:38Z] <vgutierrez@cumin1002> END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs6001.drmrs.wmnet} and A:liberica (T384477)
Change #1127471 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] cumin: Update (liberica|lvs)-drmrs aliases
Change #1127471 merged by Vgutierrez:
[operations/puppet@production] cumin: Update (liberica|lvs)-drmrs aliases
Change #1127853 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] site,hiera: Reimage lvs3010 as liberica
Change #1127853 merged by Vgutierrez:
[operations/puppet@production] site,hiera: Reimage lvs3010 as liberica
Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs3010.esams.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs3010.esams.wmnet with OS bookworm executed with errors:
Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs3010.esams.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs3010.esams.wmnet with OS bookworm completed:
Change #1128382 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] site,hiera: Reimage lvs3009 as liberica
Icinga downtime and Alertmanager silence (ID=00120896-d6ec-4aac-9b71-59479cad308d) set by vgutierrez@cumin1002 for 0:30:00 on 1 host(s) and their services with reason: depooled before reimage
lvs3009.esams.wmnet
Mentioned in SAL (#wikimedia-operations) [2025-03-17T13:08:06Z] <vgutierrez> depooling lvs3009 before being reimaged - T384477
Change #1128382 merged by Vgutierrez:
[operations/puppet@production] site,hiera: Reimage lvs3009 as liberica
Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs3009.esams.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs3009.esams.wmnet with OS bookworm completed:
Change #1128418 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] hiera: Restore lvs3009 BGP priority
Change #1128418 merged by Vgutierrez:
[operations/puppet@production] hiera: Restore lvs3009 BGP priority
Mentioned in SAL (#wikimedia-operations) [2025-03-17T14:09:51Z] <vgutierrez> repool lvs3009 running liberica - T384477
Mentioned in SAL (#wikimedia-operations) [2025-03-17T14:09:58Z] <vgutierrez@cumin1002> START - Cookbook sre.loadbalancer.admin config_reloading P{lvs3009.esams.wmnet} and A:liberica (T384477)
Mentioned in SAL (#wikimedia-operations) [2025-03-17T14:10:16Z] <vgutierrez@cumin1002> END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs3009.esams.wmnet} and A:liberica (T384477)
Change #1128421 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] site,hiera: Reimage lvs3008 as liberica
Change #1128421 merged by Vgutierrez:
[operations/puppet@production] site,hiera: Reimage lvs3008 as liberica
Mentioned in SAL (#wikimedia-operations) [2025-03-17T14:22:42Z] <vgutierrez> depooling lvs3008 before being reimaged - T384477
Icinga downtime and Alertmanager silence (ID=84eaa5ca-ad49-419d-9f2f-eb1dda5bf75d) set by vgutierrez@cumin1002 for 0:30:00 on 1 host(s) and their services with reason: depooled before reimage
lvs3008.esams.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs3008.esams.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs3008.esams.wmnet with OS bookworm completed:
Change #1128446 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] hiera: Restore BGP priority for lvs3008
Change #1128446 merged by Vgutierrez:
[operations/puppet@production] hiera: Restore BGP priority for lvs3008
Change #1128448 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] cumin: Update (liberica|lvs)-esams aliases
Change #1128449 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] hiera: Clean-up lvs::balancer keys for non-core DCs
Change #1128448 merged by Vgutierrez:
[operations/puppet@production] cumin: Update (liberica|lvs)-esams aliases
Mentioned in SAL (#wikimedia-operations) [2025-03-17T15:31:01Z] <vgutierrez> repool lvs3008 running liberica - T384477
Mentioned in SAL (#wikimedia-operations) [2025-03-17T15:31:10Z] <vgutierrez@cumin1002> START - Cookbook sre.loadbalancer.admin config_reloading P{lvs3008.esams.wmnet} and A:liberica (T384477)
Mentioned in SAL (#wikimedia-operations) [2025-03-17T15:31:28Z] <vgutierrez@cumin1002> END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs3008.esams.wmnet} and A:liberica (T384477)
Change #1128452 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):
[operations/puppet@production] hieradata: Use codfw etcd cluster in liberica@(ulsfo|eqsin)
Change #1128449 merged by Vgutierrez:
[operations/puppet@production] hiera: Clean-up lvs::balancer keys for non-core DCs
Change #1128452 merged by Vgutierrez:
[operations/puppet@production] hieradata: Use codfw etcd cluster in liberica@(ulsfo|eqsin)
@Vgutierrez I hit a small discrepancy in Netbox. I think we just need to clean it up, but I wanted to check.
In Netbox, the cable on port et-0/0/17 of asw1-b13-drmrs has been removed. On the actual switch, however, the port is still enabled (the port config is also still enabled in Netbox) and LLDP shows it is still connected to lvs6003.
cmooney@asw1-b13-drmrs> show interfaces terse | match et-0/0/17
et-0/0/17               up    up
et-0/0/17.0             up    up   eth-switch
{master:0}
cmooney@asw1-b13-drmrs> show lldp neighbors interface et-0/0/17 | match "Address :"
Address : e4:3d:1a:71:b5:71
cmooney@lvs6003:~$ ip -br link show | grep "e4:3d:1a:71:b5:71"
ens3f1np1        DOWN           e4:3d:1a:71:b5:71 <BROADCAST,MULTICAST>
Have you been in touch with dc-ops about removing this cable on site? If not, I can re-add the cable in Netbox to keep the records up to date and also disable this switch port, since the lvs side has it disabled. Otherwise we need to work with dc-ops and remote hands to get it removed on site to match Netbox, after which we can disable the port too.
> Have you been in touch with dc-ops about removing this cable on site?
Nope, I haven't performed any action that would lead to physical changes in any POP related to this task.
No stress. I've tidied it up now in Netbox, adding a cable to reflect the fact it's still connected on site, but removing the switch port configuration. Let's deal with removing the cable in T367731.
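The host-side check used in the discussion above (looking up the MAC address reported by LLDP in the output of `ip -br link show`) can be sketched in Python as follows. This is an illustration only: the function name `find_interface_by_mac` is hypothetical, and the sample output is the single line quoted earlier in this task.

```python
def find_interface_by_mac(ip_br_link_output: str, mac: str):
    """Return (interface, state) for the given MAC, or None if it is not present.

    Expects the brief format printed by `ip -br link show`:
    name, operational state, MAC address, flags.
    """
    wanted = mac.lower()
    for line in ip_br_link_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2].lower() == wanted:
            return fields[0], fields[1]
    return None


# Output line copied from lvs6003 as quoted in the comment above.
sample = "ens3f1np1        DOWN           e4:3d:1a:71:b5:71 <BROADCAST,MULTICAST>"

# MAC address reported by LLDP on asw1-b13-drmrs port et-0/0/17.
result = find_interface_by_mac(sample, "e4:3d:1a:71:b5:71")
print(result)  # ('ens3f1np1', 'DOWN')
```

A `DOWN` state on the host combined with an `up` port on the switch matches the mismatch described in the thread: the link is physically present but administratively disabled on the lvs side.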