Configure ATS to allow fractional routing for api.wikimedia.org
Closed, Resolved (Public)

Description

To migrate different routes from the API gateway to the REST gateway at different times, we need to change the ATS configuration for api.wikimedia.org. It is currently a simple map to api-gateway.discovery.wmnet; it should instead use gateway-check.lua and multi-dc.lua.

gateway-check.lua will handle fractional routing to rest-gateway.discovery.wmnet based on URL patterns.
multi-dc.lua support is needed to route to the correct instance of the rest-gateway.

  • Add an api-gateway-ro discovery service to support multi-dc.lua (cheat by making api-gateway-ro a CNAME to api-gateway, since it is already an A/A discovery service and we only need this temporarily)
  • Change the ATS config for api.wikimedia.org from a plain map to the same Lua parameter chain used for the REST gateway reroutes, defaulting to api-gateway.discovery.wmnet
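
For illustration only, the intended routing decision can be sketched as follows. This is not the contents of gateway-check.lua: it is a minimal Python model that assumes routes are migrated by adding URL patterns to a list, that matching requests go to rest-gateway.discovery.wmnet, and that everything else falls through to the default api-gateway.discovery.wmnet. The patterns in `MIGRATED_ROUTES` and the function name `pick_backend` are hypothetical.

```python
import re

# Backend hostnames as named in this task.
REST_GATEWAY = "rest-gateway.discovery.wmnet"
API_GATEWAY = "api-gateway.discovery.wmnet"

# Hypothetical patterns for routes that have already been migrated.
# The real patterns live in the ATS configuration and are not shown here.
MIGRATED_ROUTES = [
    re.compile(r"^/example/v1/"),
    re.compile(r"^/other-example/v\d+/items(/|$)"),
]


def pick_backend(path, migrated=MIGRATED_ROUTES):
    """Return the backend host for a request path.

    Paths matching any migrated pattern go to the REST gateway;
    all other paths keep the default, the API gateway.
    """
    for pattern in migrated:
        if pattern.search(path):
            return REST_GATEWAY
    return API_GATEWAY


if __name__ == "__main__":
    for p in ("/example/v1/foo", "/not/migrated/yet"):
        print(p, "->", pick_backend(p))
```

In this model, migrating another route is a one-line change to the pattern list, and emptying the list sends all traffic back to the default backend.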

Event Timeline

Change #1244697 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/dns@master] wmnet: Add rest-gateway-ro record

https://gerrit.wikimedia.org/r/1244697

Change #1244700 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] api-gateway: Add api-gateway-ro to certificate

https://gerrit.wikimedia.org/r/1244700

Change #1245389 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/puppet@production] trafficserver: Support fractional routing for api.w.o

https://gerrit.wikimedia.org/r/1245389

Change #1244697 merged by Clément Goubert:

[operations/dns@master] wmnet: Add api-gateway-ro record

https://gerrit.wikimedia.org/r/1244697

Change #1244700 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: Add api-gateway-ro to certificate

https://gerrit.wikimedia.org/r/1244700

Change #1245389 merged by Clément Goubert:

[operations/puppet@production] trafficserver: Support fractional routing for api.w.o

https://gerrit.wikimedia.org/r/1245389

This is now merged.

  • "Standard" REST API paths under api.wikimedia.org -> rest-gateway
  • Browsing the api.wikimedia.org wiki -> mw-web
  • API portal APIs -> api-gateway

This merge also removes the AQS1-style paths for device-analytics, which were deprecated with the migration to AQS2.

Change #1259067 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/puppet@production] trafficserver: Add api.w.o to gateway-check.lua.conf

https://gerrit.wikimedia.org/r/1259067

Change #1259067 merged by Clément Goubert:

[operations/puppet@production] trafficserver: Add api.w.o to gateway-check.lua.conf

https://gerrit.wikimedia.org/r/1259067

That change was reverted this morning (T421203: Bad ATS config led to large volume of 5xx from RESTBase, https://gerrit.wikimedia.org/r/c/operations/puppet/+/1260559). It contained a syntax error that we did not catch, so the new config was never actually loaded. The problem only surfaced when ATS was restarted for a different change and hard-failed to load the config.

Change #1260624 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/puppet@production] trafficserver: Add api.w.o to gateway-check.lua.conf

https://gerrit.wikimedia.org/r/1260624

Change #1260624 merged by Clément Goubert:

[operations/puppet@production] trafficserver: Add api.w.o to gateway-check.lua.conf

https://gerrit.wikimedia.org/r/1260624

New patch with fixed syntax merged, resolving.