RESTBase API returning 404 for old revisions of pages
Closed, Resolved, Public

Description

The internal monitoring system of a Wikimedia Enterprise customer showed an increase in failures caused by HTTP 404 responses to RESTBase API requests.

Examples:

The error message the customer sees is this:
{"type":"MediaWikiError/Not_Found","title":"rest-no-match","method":"get","detail":"The requested relative path (/v1/page/Ibrahim_El_Hadji_Tour%C3%A9/136588699/html) did not match any known handler","uri":"/w/rest.php/v1/page/Ibrahim_El_Hadji_Tour%C3%A9/136588699/html","errorKey":"rest-no-match","messageTranslations":{"it":"Il relativo percorso (/v1/page/Ibrahim_El_Hadji_Tour%C3%A9/136588699/html) richiesto non coincide con nessun controller conosciuto","en":"The requested relative path (/v1/page/Ibrahim_El_Hadji_Tour%C3%A9/136588699/html) did not match any known handler"},"httpCode":404,"httpReason":"Not Found"}

Event Timeline

ssastry renamed this task from "Failure increase caused by missing rfd_html" to "RESTBase API returning 404 for old revisions of pages". Nov 5 2024, 4:59 PM
ssastry subscribed.

This affects all request URLs with a revision parameter. For example, https://en.wikipedia.org/api/rest_v1/page/html/Hospet returns a valid response but https://en.wikipedia.org/api/rest_v1/page/html/Hospet/1254733652 returns 404 even though that is the latest revision of the page.
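For anyone reproducing this, the two URL shapes above can be sketched as follows. This is a minimal illustration, not part of any Wikimedia client library: the helper name `html_url` is hypothetical, and percent-encoding the whole title (including any forward slashes) is an assumption about how callers build these paths.

```python
from urllib.parse import quote

# Base path taken from the example URLs in this task.
BASE = "https://en.wikipedia.org/api/rest_v1/page/html"


def html_url(title, revision=None):
    """Build a page/html URL, optionally pinned to a revision.

    Hypothetical helper for illustration only. The title is
    percent-encoded as a single path segment (assumption).
    """
    encoded = quote(title, safe="")
    if revision is None:
        return f"{BASE}/{encoded}"
    return f"{BASE}/{encoded}/{revision}"


# The title-only form kept working; the title-plus-revision form returned 404.
latest = html_url("Hospet")
pinned = html_url("Hospet", 1254733652)
print(latest)
print(pinned)
```

Requesting both forms for the same page is a quick way to tell whether only revision-specific URLs are affected.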

ssastry updated the task description.

I think we have this fixed. It was a rule-ordering issue, see https://gerrit.wikimedia.org/r/1087528 for the nitty-gritty details. https://en.wikipedia.org/api/rest_v1/page/html/Hospet/1254733652 now returns content.

Edit: Ignore me and listen to @akosiaris's almost-simultaneous comment directly above.

This is ongoing?

A rerouting change was activated for T374683: Switchover plan from RESTbase to REST Gateway for rest_v1/page/html and rest_v1/page/title endpoints, then fairly quickly backed out (see comments on that task for details). I would expect there to have been a flurry of errors during that period, but we should now be back to the happy previous state.

ssastry assigned this task to hnowlan.
ssastry triaged this task as Unbreak Now! priority.

Hello @ssastry! Is there a post mortem that can be shared externally on this outage from November 5? A customer is asking. Thank you!

@akosiaris is better placed to respond if there is an incident report for this on wiki.

But, the short summary of this outage is as follows.

There was a configuration error in the REST gateway that treated the revision id as part of the title. For example, when asked for revision 1254733652 of title Hospet with the request URL https://en.wikipedia.org/api/rest_v1/page/html/Hospet/1254733652, the REST gateway treated it as a request for a page titled "Hospet/1254733652" (which doesn't exist, hence the HTTP 404 response) rather than a request for revision 1254733652 of title "Hospet".
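The failure mode described above can be sketched with a first-match-wins router. This is an illustration only: the two regular expressions and the `route` function are hypothetical and are not the actual REST gateway rules (see the Gerrit change linked earlier for those). It assumes rules are tried in order and that the title-only rule is greedy enough to accept forward slashes.

```python
import re

# Hypothetical route patterns, for illustration only; these are NOT the
# actual REST gateway configuration.
# Title-only rule: the greedy ".+" also accepts forward slashes.
TITLE_ONLY = re.compile(r"^/page/html/(?P<title>.+)$")
# Title-plus-revision rule: one path segment followed by a numeric revision.
TITLE_AND_REVISION = re.compile(r"^/page/html/(?P<title>[^/]+)/(?P<revision>\d+)$")


def route(path, rules):
    """Return the named groups of the first rule that matches the path."""
    for rule in rules:
        match = rule.match(path)
        if match:
            return match.groupdict()
    return None


path = "/page/html/Hospet/1254733652"

# Greedy title rule first: the revision id is swallowed into the title.
broken = route(path, [TITLE_ONLY, TITLE_AND_REVISION])
# More specific rule first: title and revision are separated as intended.
fixed = route(path, [TITLE_AND_REVISION, TITLE_ONLY])

print(broken)  # {'title': 'Hospet/1254733652'}
print(fixed)   # {'title': 'Hospet', 'revision': '1254733652'}
```

With the specific rule first, a title-only path such as `/page/html/Hospet` still falls through to the title-only rule, so both URL shapes resolve.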

Hi,

No, we don't have an incident report for this. @ssastry's summary is pretty good if you want to pass that along.

Change #1099214 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/deployment-charts@master] rest-gateway: Comment about forwash slashes in

https://gerrit.wikimedia.org/r/1099214

Change #1099214 merged by jenkins-bot:

[operations/deployment-charts@master] rest-gateway: Comment about forwash slashes

https://gerrit.wikimedia.org/r/1099214

Ok, thanks!