Merge Toolforge Nginx front proxy into the existing K8s HAProxy setup
Closed, Resolved (Public)

Description

We want to eliminate the manual failover for the Toolforge dynamicproxy servers (SSL termination + grid engine routing). This is currently the only single point of failure in the path of Toolforge k8s web requests: the haproxy, ingress, and worker pods all fail over automatically on node failure. I essentially see two options here:

  • Completely remove that layer, and terminate TLS either on haproxy or on the ingress, as is done on the PAWS haproxy
  • Do automatic failover for Nginx (easy) and its Redis backend (complex for a "temporary" solution)

The first one is ideal in the long term, but would require a workaround for grid engine tools (T282975) until the grid is deprecated and removed, which will take a long time.

Details

Related Changes in Gerrit:
Repo               Branch      Lines +/-
operations/puppet  production  +0 -14
operations/puppet  production  +0 -381
operations/puppet  production  +0 -5
operations/puppet  production  +1 -6
operations/puppet  production  +19 -3
operations/puppet  production  +23 -18
operations/puppet  production  +20 -30
operations/puppet  production  +42 -11
operations/puppet  production  +0 -21
operations/puppet  production  +8 -8
operations/puppet  production  +14 -20
operations/puppet  production  +3 -5
operations/puppet  production  +6 -6
operations/puppet  production  +2 -1
operations/puppet  production  +30 -27
operations/puppet  production  +29 -0
operations/puppet  production  +17 -206
operations/puppet  production  +0 -23
operations/puppet  production  +8 -5
operations/puppet  production  +72 -3

Event Timeline

Change #1189801 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Add a HTTPS listener

https://gerrit.wikimedia.org/r/1189801

Change #1189802 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::proxy: Set backend to new HAProxy HTTPS service

https://gerrit.wikimedia.org/r/1189802

Change #1189801 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Add a HTTPS listener

https://gerrit.wikimedia.org/r/1189801

Change #1189802 merged by Majavah:

[operations/puppet@production] P:toolforge::proxy: Set backend to new HAProxy HTTPS service

https://gerrit.wikimedia.org/r/1189802

Change #1189840 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Drop old TCP listener

https://gerrit.wikimedia.org/r/1189840

Change #1189841 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Handle API gateway external access

https://gerrit.wikimedia.org/r/1189841

Change #1189840 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Drop old TCP listener

https://gerrit.wikimedia.org/r/1189840

Change #1189841 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Handle API gateway external access

https://gerrit.wikimedia.org/r/1189841

taavi renamed this task from "Eliminate single point of failure from Toolforge front proxy" to "Merge Toolforge Nginx front proxy into the existing K8s HAProxy setup". Sep 25 2025, 9:17 AM
taavi updated the task description.

Change #1191582 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Use custom error pages

https://gerrit.wikimedia.org/r/1191582

Change #1191583 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Move per-tool rate limiting to HAProxy

https://gerrit.wikimedia.org/r/1191583

Change #1191584 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Fix tool address in CSP header

https://gerrit.wikimedia.org/r/1191584

Change #1191582 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Use custom error pages

https://gerrit.wikimedia.org/r/1191582

Change #1191583 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Move per-tool rate limiting to HAProxy

https://gerrit.wikimedia.org/r/1191583

Change #1191584 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Fix tool address in CSP header

https://gerrit.wikimedia.org/r/1191584

Change #1193448 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge: Move ru_monuments backwards compat redirect to HAProxy

https://gerrit.wikimedia.org/r/1193448

Change #1193449 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge: Move U-A/Referer blocks to HAProxy

https://gerrit.wikimedia.org/r/1193449

Change #1193450 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge: Move http redirect rewrite to HAProxy

https://gerrit.wikimedia.org/r/1193450

Change #1193451 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Use http-after-response for headers

https://gerrit.wikimedia.org/r/1193451

Change #1193454 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::proxy: Remove config moved to HAProxy

https://gerrit.wikimedia.org/r/1193454

Change #1193451 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Use http-after-response for headers

https://gerrit.wikimedia.org/r/1193451

Change #1193448 merged by Majavah:

[operations/puppet@production] P:toolforge: Move ru_monuments backwards compat redirect to HAProxy

https://gerrit.wikimedia.org/r/1193448

Change #1193449 merged by Majavah:

[operations/puppet@production] P:toolforge: Move U-A/Referer blocks to HAProxy

https://gerrit.wikimedia.org/r/1193449

Change #1193450 merged by Majavah:

[operations/puppet@production] P:toolforge: Move http redirect rewrite to HAProxy

https://gerrit.wikimedia.org/r/1193450

Change #1193454 merged by Majavah:

[operations/puppet@production] P:toolforge::proxy: Remove config moved to HAProxy

https://gerrit.wikimedia.org/r/1193454

Change #1194532 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge: Migrate error page static assets to tools-static

https://gerrit.wikimedia.org/r/1194532

Change #1194533 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge: Migrate default robots and favicon handlers to HAProxy

https://gerrit.wikimedia.org/r/1194533

Change #1194532 merged by Majavah:

[operations/puppet@production] P:toolforge: Migrate error page static assets to tools-static

https://gerrit.wikimedia.org/r/1194532

Change #1194533 merged by Majavah:

[operations/puppet@production] P:toolforge: Migrate default robots and favicon handlers to HAProxy

https://gerrit.wikimedia.org/r/1194533

Change #1194881 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Add banned IP list

https://gerrit.wikimedia.org/r/1194881

Change #1194882 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Add per-IP rate limiting

https://gerrit.wikimedia.org/r/1194882

Change #1194881 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Add banned IP list

https://gerrit.wikimedia.org/r/1194881

Change #1194882 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Add per-IP rate limiting

https://gerrit.wikimedia.org/r/1194882

Mentioned in SAL (#wikimedia-cloud) [2025-10-22T12:35:40Z] <taavi> moving toolforge traffic to haproxy directly T283948

Change #1198049 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::k8s::haproxy: Drop proxy IP rate limit exemption

https://gerrit.wikimedia.org/r/1198049

Change #1198050 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge: Remove separate proxy role

https://gerrit.wikimedia.org/r/1198050

Change #1198049 merged by Majavah:

[operations/puppet@production] P:toolforge::k8s::haproxy: Drop proxy IP rate limit exemption

https://gerrit.wikimedia.org/r/1198049

Change #1198105 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::prometheus: Drop separate front proxy scrape target

https://gerrit.wikimedia.org/r/1198105

Change #1198105 merged by Majavah:

[operations/puppet@production] P:toolforge::prometheus: Drop separate front proxy scrape target

https://gerrit.wikimedia.org/r/1198105

Mentioned in SAL (#wikimedia-cloud) [2025-10-23T10:11:11Z] <taavi> deleting old nginx front proxy instances T283948

Change #1198050 merged by Majavah:

[operations/puppet@production] P:toolforge: Remove separate proxy role

https://gerrit.wikimedia.org/r/1198050

Change #1198349 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge: Remove obsolete spec test

https://gerrit.wikimedia.org/r/1198349

Change #1198349 merged by Majavah:

[operations/puppet@production] P:toolforge: Remove obsolete spec test

https://gerrit.wikimedia.org/r/1198349