Page MenuHomePhabricator

"InvalidArgumentException: The revision does not belong to the given page." after 1.42.0-wmf.1 deployed to group1
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error
labels.normalized_message
[{reqId}] {exception_url}   InvalidArgumentException: The revision does not belong to the given page.
error.stack_trace
from /srv/mediawiki/php-1.42.0-wmf.1/includes/page/ParserOutputAccess.php(369)
#0 /srv/mediawiki/php-1.42.0-wmf.1/includes/page/ParserOutputAccess.php(293): MediaWiki\Page\ParserOutputAccess->checkPreconditions(MediaWiki\Page\PageStoreRecord, MediaWiki\Revision\RevisionStoreRecord, integer)
#1 /srv/mediawiki/php-1.42.0-wmf.1/includes/parser/Parsoid/ParsoidOutputAccess.php(222): MediaWiki\Page\ParserOutputAccess->getParserOutput(MediaWiki\Page\PageStoreRecord, ParserOptions, MediaWiki\Revision\RevisionStoreRecord, integer)
#2 /srv/mediawiki/php-1.42.0-wmf.1/includes/Rest/Handler/Helper/HtmlOutputRendererHelper.php(756): MediaWiki\Parser\Parsoid\ParsoidOutputAccess->getParserOutput(MediaWiki\Page\PageStoreRecord, ParserOptions, MediaWiki\Revision\RevisionStoreRecord, integer)
#3 /srv/mediawiki/php-1.42.0-wmf.1/includes/Rest/Handler/Helper/HtmlOutputRendererHelper.php(568): MediaWiki\Rest\Handler\Helper\HtmlOutputRendererHelper->getParserOutputInternal(ParserOptions)
#4 /srv/mediawiki/php-1.42.0-wmf.1/includes/Rest/Handler/Helper/HtmlOutputRendererHelper.php(664): MediaWiki\Rest\Handler\Helper\HtmlOutputRendererHelper->getParserOutput()
#5 /srv/mediawiki/php-1.42.0-wmf.1/includes/Rest/Handler/ParsoidHandler.php(882): MediaWiki\Rest\Handler\Helper\HtmlOutputRendererHelper->getPageBundle()
#6 /srv/mediawiki/php-1.42.0-wmf.1/vendor/wikimedia/parsoid/extension/src/Rest/Handler/PageHandler.php(92): MediaWiki\Rest\Handler\ParsoidHandler->wt2html(MediaWiki\Parser\Parsoid\Config\PageConfig, array)
#7 /srv/mediawiki/php-1.42.0-wmf.1/includes/Rest/Router.php(517): MWParsoid\Rest\Handler\PageHandler->execute()
#8 /srv/mediawiki/php-1.42.0-wmf.1/includes/Rest/Router.php(422): MediaWiki\Rest\Router->executeHandler(MWParsoid\Rest\Handler\PageHandler)
#9 /srv/mediawiki/php-1.42.0-wmf.1/includes/Rest/EntryPoint.php(195): MediaWiki\Rest\Router->execute(MediaWiki\Rest\RequestFromGlobals)
#10 /srv/mediawiki/php-1.42.0-wmf.1/includes/Rest/EntryPoint.php(135): MediaWiki\Rest\EntryPoint->execute()
#11 /srv/mediawiki/php-1.42.0-wmf.1/rest.php(31): MediaWiki\Rest\EntryPoint::main()
#12 /srv/mediawiki/w/rest.php(3): require(string)
#13 {main}
Impact
Notes

Seeing a few of these for T348354: 1.42.0-wmf.1 deployment blockers on group1. T348354#9259125 mentions ParserOutputAccess.

cc: @ssastry, @cscott

Details

Request URL
https://ca.wikipedia.org/w/rest.php/ca.wikipedia.org/v3/page/pagebundle/Prote%C3%AFna_PAT/32553374

Event Timeline

About 20 instances of this error in the last hour. And reproducible on the command line for the request url from this phab task.

╭─subbu@earth ~/work/wmf/core ‹master*› 
╰─➤  curl     https://ca.wikipedia.org/w/rest.php/ca.wikipedia.org/v3/page/pagebundle/Prote%C3%AFna_PAT/32553374 
{"message":"Error: exception of type InvalidArgumentException","httpCode":500,"httpReason":"Internal Server Error"}%

This revid is not the latest revision of the page.

But, the following two succeed:

So, it appears that non-latest revid requests where the title has url-encoding of entities is failing.

Scratch that .. I botched that.

The issue is that "Proteïna_PAT" is a redirect to "Proteïnes_PAT". So, asking for an oldid while giving the redirecting title is failing. The error now makes sense. The code to fetch the old id should first resolve the redirect. Otherwise, the revid will not match the original title. It only exists after the redirect is resolved.

This is not a new bug -- this is just an existing bug that got exposed since ParserOutputAccess has a 'checkPreconditions' which ParsoidOutputAccess did not have.

Who is generating these URLs to begin with? The URL is bad here if they are using the revid of a page which doesn't correspond to the actual title they give, but instead is a revision of the redirected title. After all, there *are* revisions of the page which contains the #REDIRECT. Eg:

Page A, Revision 1: Some article
Page A, Revision 2: Article blanked with a #REDIRECT to B

Page B, Revision 1: Some article
Page B, Revision 2: Article with content added from old revision 1 of A after articles were combined

Asking for revision 1 of page A should get the article before the #REDIRECT, not get revision 1 of page B. So our behavior here (as I'm understanding it, and I think this is what @subbu is saying?) is correct, and that the root cause is actually whatever (the app?) is generating the URL and mixing up the title and revision.

In this case, this is the only valid revision of the requested title:
https://ca.wikipedia.org/w/index.php?title=Prote%C3%AFna_PAT&oldid=32566127

But I guess there's still a bug here in that we should be quietly returning a 404, instead of a 500 and log spam (if that's what's happening).

Since revisions are unique across pages, a revid uniquely identifies a revision object and hence the page object. So, there is an "easy" resolution to this in ParsoidOutputAccess -- we can do a sanity check to see if the page is a redirect and redirects to the page that the revid belongs to and if so update the page object before asking ParserOutputAccess for the content .. But, in that case, we have to make it uncacheable.

The question is if this is the right behavior for the REST API.

I don't think that it is. I don't think that's how the parsoid/restbase version of the API worked (worth double-checking) and there are other means to do a revision id lookup ignoring title if that's what you actually want to do. @daniel?

I think the main hold up here is trying to figure out what the right behavior is ... the actual fix for whatever the behavior we decide is the right one is fairly simple ... We are going to resolve this tomorrow very likely. But, given the low volume of these "bad" requests, I don't think this should block the train rollout.

We are going to resolve this tomorrow very likely. But, given the low volume of these "bad" requests, I don't think this should block the train rollout.

Yeah, I'm ok going to group2 later today at this rate, unless there's a large spike in numbers here, but from a logspam perspective it would be to see this resolved before next week's train. Currently seeing 30-40 hour.

Thanks for the detailed investigation!

FTR, I think a request that contains a revid and a title should redirect if the revision belongs to a different page or the title is not normalized. For normalization, we should use 301. If the title doesn't match, we should use a 307 (because this may change when pages get renamed).

To be clear, we should not follow wiki redirects in the request revision. The response should always be the content of the specific revision.

EDIT: If the revision belongs to a different page, we could also error out... That would generally be my preference, but this situation can be created by a page move. After a page gets moved to a different title, clients should still be able to follow a URL that they previously received from the API. So a redirect seems in order.

Actually, what I wrote aqbove would apply for a public endpoint. If we are talking about a purely internal API, we can just ignroe the title and use the revid only.

Looks like about 400-600 of these an hour at group2.

I could use an opinion here between options:

  • Block current train and roll back purely on grounds of log volume. (Based on discussion here, I don't believe there's any new user-facing breakage, but it is noisy.)
  • Block next train, along the lines of the policy used for deprecation warnings.

I'll have a fix for this before the next train (hopefully before tomorrow). If it gets too noisy, we can even backport it before the next train.

Ok, sounds good. Let me know and I'll handle a backport. It's not not a problem right now per se, but it's always helpful to keep the noise floor low.

Change 967296 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/core@master] WIP: For Internal REST API users, make revid handling lenient

https://gerrit.wikimedia.org/r/967296

That patch above now is out of WIP and just needs someone to review. Once reviewed and merged, we could backport it tomorrow.

Change 967296 merged by jenkins-bot:

[mediawiki/core@master] For Internal REST API users, make revid handling lenient

https://gerrit.wikimedia.org/r/967296

ssastry claimed this task.

This will ride the train this week.

Change 972021 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/core@REL1_41] For Internal REST API users, make revid handling lenient

https://gerrit.wikimedia.org/r/972021

Change 972021 merged by jenkins-bot:

[mediawiki/core@REL1_41] For Internal REST API users, make revid handling lenient

https://gerrit.wikimedia.org/r/972021