Steps to replicate the issue (include links if applicable):
- curl -v -X 'GET' 'https://wikimedia.org/api/rest_v1/metrics/pageviews/top/eu.wikipedia.org/all-access/2024/05/03'
What happens?:
Currently for ME, this returns:
> GET /api/rest_v1/metrics/pageviews/top/eu.wikipedia.org/all-access/2024/05/03 HTTP/2
> Host: wikimedia.org
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/2 301
< date: Sat, 04 May 2024 20:04:03 GMT
< server: mw-web.eqiad.main-55b8c76fd7-k745s
< location: https://www.wikimedia.org/wikimedia.org/v1/metrics/pageviews/top/eu.wikipedia.org/all-access/2024/05/03
< cache-control: max-age=2592000
< expires: Mon, 03 Jun 2024 20:04:03 GMT
< content-length: 311
< content-type: text/html; charset=iso-8859-1
< vary: X-Forwarded-Proto
< age: 2592
< x-cache: cp3068 hit, cp3068 hit/9
< x-cache-status: hit-front
< server-timing: cache;desc="hit-front", host;desc="cp3068"
< strict-transport-security: max-age=106384710; includeSubDomains; preload
< report-to: { "group": "wm_nel", "max_age": 604800, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
< nel: { "report_to": "wm_nel", "max_age": 604800, "failure_fraction": 0.05, "success_fraction": 0.0}
< set-cookie: WMF-Last-Access=04-May-2024;Path=/;HttpOnly;secure;Expires=Wed, 05 Jun 2024 12:00:00 GMT
< set-cookie: WMF-Last-Access-Global=04-May-2024;Path=/;Domain=.wikimedia.org;HttpOnly;secure;Expires=Wed, 05 Jun 2024 12:00:00 GMT
< x-client-ip: 217.159.212.51
< set-cookie: GeoIP=EE:37:Tallinn:59.44:24.74:v4; Path=/; secure; Domain=.wikimedia.org
< set-cookie: NetworkProbeLimit=0.001;Path=/;Secure;Max-Age=3600
Note: server mw-web.eqiad.main-55b8c76fd7-k745s, a cache hit on cp3068
For @Jdlrobson, the same request is answered by server: envoy and succeeds.
What should have happened instead?:
Should have received the results from that day. Instead, the response is a 301 that redirects to a broken destination, and that 301 is cacheable for 30 days.
Suspicion:
This page was first requested on the 4th of May. On the 4th, the dataset might not yet have been available.
Possibly this unavailability produces the 301, which redirects to a wikimedia.org 404? That 301 response then got cached for 30 days, so if you happen to hit the same datacenter/edge layer/caching webserver somewhere in between, you will be unable to get the data for 30 days?
Interestingly, however, requesting a date in the future right now returns a 404 with application/json content for me. It does NOT redirect, and the response comes from server: envoy. Possibly something to do with these 404s on the new k8s hosts infra?
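
To put a number on how long the bad redirect could keep being served, here is a minimal sketch that works only from the header values in the response above. The helper names `remaining_cache_seconds` and `is_cached_redirect` are made up for illustration, and it assumes the edge cache honours `cache-control: max-age` minus `age`, which is not confirmed here.

```python
import re


def remaining_cache_seconds(headers):
    """Return how many seconds a cached response may still be served.

    Computed as max-age minus age; returns None when no max-age is present.
    """
    match = re.search(r"max-age=(\d+)", headers.get("cache-control", ""))
    if match is None:
        return None
    age = int(headers.get("age", "0"))
    return max(int(match.group(1)) - age, 0)


def is_cached_redirect(status, headers):
    """True when the response is a 301 that was served from a cache hit."""
    return status == 301 and "hit" in headers.get("x-cache-status", "")


# Header values copied from the curl output in this report.
headers = {
    "cache-control": "max-age=2592000",
    "age": "2592",
    "x-cache-status": "hit-front",
}

print(is_cached_redirect(301, headers))
print(remaining_cache_seconds(headers) / 86400, "days left")
```

With max-age=2592000 and age=2592, that leaves 2589408 seconds, i.e. almost the full 30 days, during which this edge node could keep returning the 301.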