
Enable IPv6 for the Cloud VPS web proxy
Closed, Resolved · Public

Description

Enable IPv6 support for the Cloud VPS web proxy. Since this requires re-creating the VMs anyway, also upgrade them to Debian bookworm.

TODO:

  • Prepare Puppet code, test in codfw1dev
  • Provision pair of new proxies in dualstack network
  • Provision IPv4/IPv6 VIPs for Keepalived usage in the new network
  • Add new proxies to cache_hosts in Hiera
  • Merge Puppet patches, start provisioning AAAA records for new proxies
  • Add AAAA records to initial testers (WMCS-managed infrastructure + other relatively high-traffic proxies with active maintainers)
  • Contact projects that get XFF data to update filters on their side (I think this mostly includes ACC and UTRS)
  • Backfill security group rules to existing projects (P75871)
  • Flip floating IPv4 to new cluster
  • Backfill AAAA records
    • wmflabs.org
    • wmcloud.org
    • Everything else
    • Special cases
  • Cleanup old proxy instances
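
The XFF item in the list above implies that downstream projects need filters that cope with both new proxy addresses and IPv6 client addresses. A minimal sketch of such a filter follows, assuming the filter trusts `X-Forwarded-For` only when the connecting peer is a known proxy. The networks in `TRUSTED_PROXIES` are documentation placeholder ranges, not the real proxy addresses, and the function names are hypothetical.

```python
import ipaddress
from typing import Optional

# Placeholder networks (documentation ranges), NOT the real proxy addresses.
TRUSTED_PROXIES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8:1::/64"),
]


def is_trusted_proxy(peer: str) -> bool:
    """Return True if the connecting peer is inside a trusted proxy network."""
    addr = ipaddress.ip_address(peer)
    return any(
        addr.version == net.version and addr in net for net in TRUSTED_PROXIES
    )


def client_ip(peer: str, xff_header: Optional[str]) -> str:
    """Return the client address, honouring XFF only for trusted peers."""
    if not xff_header or not is_trusted_proxy(peer):
        return str(ipaddress.ip_address(peer))
    # The right-most entry is the one appended by the proxy that connected to us.
    last = xff_header.split(",")[-1].strip()
    return str(ipaddress.ip_address(last))
```

Parsing with `ipaddress` rather than matching strings means IPv4 and IPv6 entries are handled the same way, and differently spelled IPv6 addresses normalise to one form.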

Related Objects

Status Subtype Assigned Task
Open None
Open None
Resolved taavi
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved cmooney
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved cmooney
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved aborrero
Resolved fnegri
Resolved cmooney
Resolved taavi
Resolved cmooney
Resolved aborrero
Resolved JAllemandou
Resolved taavi
Resolved taavi
Resolved taavi
Resolved Marostegui
Resolved BUG REPORT taavi

Event Timeline

Restricted Application added a subscriber: Aklapper.

Change #1088338 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] dynamicproxy: Provision AAAA records

https://gerrit.wikimedia.org/r/1088338

Change #1088339 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] dynamicproxy: Canocalize IP addresses before comparing

https://gerrit.wikimedia.org/r/1088339

fnegri triaged this task as Medium priority. Nov 11 2024, 11:45 AM

Change #1091733 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] keepalived::failover: Support IPv6

https://gerrit.wikimedia.org/r/1091733

Change #1091796 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] dynamicproxy: Fix Lua path on bookworm

https://gerrit.wikimedia.org/r/1091796

Change #1091796 merged by Majavah:

[operations/puppet@production] dynamicproxy: Fix Lua path on bookworm

https://gerrit.wikimedia.org/r/1091796

Change #1091798 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] hieradata: Update codfw1dev cache hosts

https://gerrit.wikimedia.org/r/1091798

Change #1091802 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] dynamicproxy: Listen on IPv6

https://gerrit.wikimedia.org/r/1091802

Change #1091798 merged by Majavah:

[operations/puppet@production] hieradata: Update codfw1dev cache hosts

https://gerrit.wikimedia.org/r/1091798

Change #1091848 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] dynamicproxy: Run Redis update in app context

https://gerrit.wikimedia.org/r/1091848

Change #1091849 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] dynamicproxy: Bind Redis on IPv6

https://gerrit.wikimedia.org/r/1091849

Change #1091849 merged by Majavah:

[operations/puppet@production] dynamicproxy: Bind Redis on IPv6

https://gerrit.wikimedia.org/r/1091849

Change #1091802 merged by Majavah:

[operations/puppet@production] dynamicproxy: Listen on IPv6

https://gerrit.wikimedia.org/r/1091802

Change #1091848 merged by Majavah:

[operations/puppet@production] dynamicproxy: Run Redis update in app context

https://gerrit.wikimedia.org/r/1091848

Change #1088339 merged by Majavah:

[operations/puppet@production] dynamicproxy: Canocalize IP addresses before comparing

https://gerrit.wikimedia.org/r/1088339
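
The patch title above refers to canonicalizing IP addresses before comparing them. This matters for IPv6 because one address has many valid textual spellings, so a plain string comparison can report a mismatch for identical addresses. The following is a minimal sketch of the general idea using Python's standard `ipaddress` module. It is not the code from the patch, and the helper name `same_address` is hypothetical.

```python
import ipaddress


def same_address(a: str, b: str) -> bool:
    """Return True if two textual IP addresses denote the same address."""
    # Parsing normalises case, leading zeros and "::" compression.
    return ipaddress.ip_address(a) == ipaddress.ip_address(b)


# Two spellings of one IPv6 address differ as strings but not once parsed.
long_form = "2001:0DB8:0000:0000:0000:0000:0000:0001"
short_form = "2001:db8::1"
print(long_form == short_form)
print(same_address(long_form, short_form))
```

The first `print` compares raw strings and the second compares parsed addresses, which is the comparison that stays correct regardless of how each side formats the address.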

One thing to keep in mind here is that many security groups reference 172.16.0.0/21 directly. I've updated the docs now but there are still old projects with old rules.
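
A rough sketch of how such old rules could be spotted, assuming the rules have already been exported as plain dictionaries. The `project` and `remote_ip_prefix` keys and the sample data are assumptions for illustration, and the sketch only reports matches; it does not modify anything.

```python
import ipaddress

# The legacy subnet named in the comment above.
LEGACY_SUBNET = ipaddress.ip_network("172.16.0.0/21")


def references_legacy_subnet(remote_prefix: str) -> bool:
    """Return True if the prefix is the legacy subnet or lies inside it."""
    net = ipaddress.ip_network(remote_prefix, strict=False)
    if net.version != LEGACY_SUBNET.version:
        return False  # IPv6 rules cannot reference the IPv4 subnet
    return net.subnet_of(LEGACY_SUBNET)


# Hypothetical exported rules; real data would come from the cloud API.
rules = [
    {"project": "example-a", "remote_ip_prefix": "172.16.0.0/21"},
    {"project": "example-b", "remote_ip_prefix": "0.0.0.0/0"},
    {"project": "example-c", "remote_ip_prefix": "172.16.4.0/24"},
    {"project": "example-d", "remote_ip_prefix": "2001:db8::/64"},
]

stale = [r["project"] for r in rules if references_legacy_subnet(r["remote_ip_prefix"])]
print(stale)
```

Using `subnet_of` instead of an overlap test keeps catch-all rules such as `0.0.0.0/0` out of the result, since they do not depend on the old addressing.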

Mentioned in SAL (#wikimedia-cloud-feed) [2025-04-29T10:54:33Z] <taavi@cloudcumin1001> START - Cookbook wmcs.vps.create_instance_with_prefix with prefix 'proxy' (T379175)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-04-29T11:26:08Z] <taavi@cloudcumin1001> START - Cookbook wmcs.vps.create_instance_with_prefix with prefix 'proxy' (T379175)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-04-29T11:26:27Z] <taavi@cloudcumin1001> END (FAIL) - Cookbook wmcs.vps.create_instance_with_prefix (exit_code=99) with prefix 'proxy' (T379175)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-04-29T11:28:36Z] <taavi@cloudcumin1001> START - Cookbook wmcs.vps.create_instance_with_prefix with prefix 'proxy' (T379175)

Change #1139818 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] hieradata: Add new eqiad1 proxies

https://gerrit.wikimedia.org/r/1139818

Change #1139821 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:wmcs: novaproxy: Add separate keepalived_peers variable

https://gerrit.wikimedia.org/r/1139821

Change #1139818 merged by Majavah:

[operations/puppet@production] hieradata: Add new eqiad1 proxies

https://gerrit.wikimedia.org/r/1139818

Change #1139821 merged by Majavah:

[operations/puppet@production] P:wmcs: novaproxy: Add separate keepalived_peers variable

https://gerrit.wikimedia.org/r/1139821

Change #1139857 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] keepalived: Fix IPv6 support

https://gerrit.wikimedia.org/r/1139857

Change #1139857 merged by Majavah:

[operations/puppet@production] keepalived: Fix IPv6 support

https://gerrit.wikimedia.org/r/1139857

Change #1139877 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] keepalived: failover: Select unicast source v6 more reliably

https://gerrit.wikimedia.org/r/1139877

Change #1139877 merged by Majavah:

[operations/puppet@production] keepalived: failover: Select unicast source v6 more reliably

https://gerrit.wikimedia.org/r/1139877

Mentioned in SAL (#wikimedia-cloud) [2025-04-29T16:43:14Z] <taavi> add AAAA record to codesearch.wmcloud.org. as a first test for T379175

Change #1140125 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] dynamicproxy: Use lua-resty-redis from Debian package

https://gerrit.wikimedia.org/r/1140125

Change #1140125 merged by Majavah:

[operations/puppet@production] dynamicproxy: Use lua-resty-redis from Debian package

https://gerrit.wikimedia.org/r/1140125

Mentioned in SAL (#wikimedia-cloud) [2025-04-30T08:44:00Z] <taavi> add AAAA record to *.wmflabs.org (for redirects for proxies that have migrated to wmcloud.org) T379175

Change #1088338 merged by Majavah:

[operations/puppet@production] dynamicproxy: Provision AAAA records

https://gerrit.wikimedia.org/r/1088338

Mentioned in SAL (#wikimedia-cloud) [2025-05-07T12:48:37Z] <taavi> updating all security group rules referencing old 172.16.0.0/21 subnet to reference new ip space instead (T379175)

Noting for the record that the above-mentioned security group update resulted in an incident: https://wikitech.wikimedia.org/wiki/Incidents/2025-05-07_cloud-vps_security_groups_deleted

I'm hoping to flip the floating IP (+ the Redis primary) to the new cluster tomorrow, and then look at backfilling AAAA records.

Mentioned in SAL (#wikimedia-cloud) [2025-05-08T08:22:57Z] <taavi> migrating 185.15.56.49 floating IP to the new proxy cluster T379175

Mentioned in SAL (#wikimedia-cloud) [2025-05-08T08:36:15Z] <taavi> flip redis/API primary status to proxy-5 T379175

Mentioned in SAL (#wikimedia-cloud) [2025-05-08T10:53:43Z] <taavi> backfilling AAAA records for web proxies using wmflabs.org T379175

taavi updated the task description.

Mentioned in SAL (#wikimedia-cloud) [2025-05-08T11:31:59Z] <taavi> backfilling AAAA records for web proxies using wmcloud.org T379175
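
The backfill steps logged above amount to finding proxy names that have an A record for the proxy but no AAAA record yet. A minimal sketch of that selection follows, assuming the DNS records are available as `(name, type, value)` tuples. The host names and the IPv6 address are placeholders, the function name is hypothetical, and this is not the tooling that was actually used.

```python
from collections import defaultdict

# Floating IPv4 address mentioned in the log above.
PROXY_IPV4 = "185.15.56.49"


def names_missing_aaaa(records, proxy_ipv4):
    """Return sorted names that point at the proxy over IPv4 but lack AAAA."""
    types_by_name = defaultdict(set)
    points_at_proxy = set()
    for name, rtype, value in records:
        types_by_name[name].add(rtype)
        if rtype == "A" and value == proxy_ipv4:
            points_at_proxy.add(name)
    return sorted(n for n in points_at_proxy if "AAAA" not in types_by_name[n])


# Hypothetical record data; 2001:db8::49 is a documentation placeholder.
records = [
    ("tool-a.wmcloud.org.", "A", PROXY_IPV4),
    ("tool-a.wmcloud.org.", "AAAA", "2001:db8::49"),
    ("tool-b.wmcloud.org.", "A", PROXY_IPV4),
    ("other.wmcloud.org.", "A", "192.0.2.20"),
]

print(names_missing_aaaa(records, PROXY_IPV4))
```

Names whose A record points elsewhere are skipped, which corresponds to leaving non-proxy records alone during the backfill.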

Mentioned in SAL (#wikimedia-cloud) [2025-06-10T09:11:51Z] <taavi> delete old proxy-03,04 instances T379175

Change #1155149 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] hieradata: Remove old Cloud VPS proxies

https://gerrit.wikimedia.org/r/1155149

Change #1155149 merged by Majavah:

[operations/puppet@production] hieradata: Remove old Cloud VPS proxies

https://gerrit.wikimedia.org/r/1155149

Closing, as T379283 meant I could fix the NFS edge cases and the remaining one (UTRS) is tracked elsewhere.