Migrate prometheus LB VIPs to IPIP encapsulation
Closed, Resolved, Public

Description

As part of the ongoing effort to replace pybal with liberica, we need to switch the prometheus load balancer service to IPIP encapsulation and the mh-port scheduler.

The migration to IPIP can be performed one DC at a time, while the scheduler update will happen on both DCs simultaneously.

You can use the swift/swift-https commits as an example. The whole process can be orchestrated using the sre.loadbalancer.migrate-service-ipip cookbook:

$ sudo -i cookbook sre.loadbalancer.migrate-service-ipip --help
usage: cookbook [GLOBAL_ARGS] sre.loadbalancer.migrate-service-ipip [-h] --dc {eqiad,codfw} --role ROLE services [services ...]

Migrate existing LVS services to IPIP

    Performed steps:
    1. Asks the user to perform the required hiera changes
    2. Runs puppet on LVS and realservers
    3. Validates that realservers are able to handle incoming IPIP traffic
    4. Restarts pybal on affected loadbalancers

    Usage:
        cookbook sre.loadbalancer.migrate-service-ipip --dc codfw --role swift::proxy swift swift-https
    

positional arguments:
  services            Service(s) to be migrated

optional arguments:
  -h, --help          show this help message and exit
  --dc {eqiad,codfw}  Target datacenter. One of eqiad, codfw. (default: None)
  --role ROLE         Puppet role used by the realservers (default: None)

Cookbook owner team: Traffic
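
For this task, the per-DC invocations could be assembled along the following lines. This is only a sketch: the service names prometheus and prometheus-https are inferred from the patch titles, the helper build_command is hypothetical, and ROLE is a placeholder because the task does not name the puppet role used by the prometheus realservers. Only the flags shown in the help output above are used.

```python
import shlex

COOKBOOK = "sre.loadbalancer.migrate-service-ipip"
# The two values accepted by --dc according to the help output.
DATACENTERS = ("codfw", "eqiad")
# Assumption: service names inferred from the "prometheus(-https)?" patch titles.
SERVICES = ("prometheus", "prometheus-https")


def build_command(dc, role, services=SERVICES):
    """Return the shell command line that migrates `services` in one DC.

    Hypothetical helper for illustration; it only formats the command
    and does not run anything.
    """
    if dc not in DATACENTERS:
        raise ValueError(f"unknown datacenter: {dc}")
    if not services:
        raise ValueError("at least one service is required")
    argv = ["sudo", "-i", "cookbook", COOKBOOK,
            "--dc", dc, "--role", role, *services]
    # shlex.join quotes each argument where the shell would need it.
    return shlex.join(argv)


if __name__ == "__main__":
    # "ROLE" is a placeholder: substitute the real puppet role before use.
    for dc in DATACENTERS:
        print(build_command(dc, "ROLE"))
```

Generating one command per DC mirrors the rollout described above, where each datacenter is migrated separately and its hiera change is merged before the cookbook runs.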

Event Timeline

Change #1123379 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[operations/puppet@production] hiera,prometheus: Enable IPIP on prometheus(-https)?@codfw

https://gerrit.wikimedia.org/r/1123379

Change #1123380 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[operations/puppet@production] hiera,prometheus: Enable IPIP on prometheus(-https)?@eqiad

https://gerrit.wikimedia.org/r/1123380

Change #1123379 merged by Vgutierrez:

[operations/puppet@production] hiera,prometheus: Enable IPIP on prometheus(-https)?@codfw

https://gerrit.wikimedia.org/r/1123379

Mentioned in SAL (#wikimedia-operations) [2025-02-27T15:13:36Z] <vgutierrez@cumin1002> START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs2013.*,lvs2014.*} and A:lvs (T387302)

Mentioned in SAL (#wikimedia-operations) [2025-02-27T15:14:59Z] <vgutierrez@cumin1002> END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs2013.*,lvs2014.*} and A:lvs (T387302)

Change #1123380 merged by Vgutierrez:

[operations/puppet@production] hiera,prometheus: Enable IPIP on prometheus(-https)?@eqiad

https://gerrit.wikimedia.org/r/1123380

Vgutierrez claimed this task.