
The `mobile-html` endpoint is sometimes returning outdated content when given the title of a redirect page (rather than an article's canonical title)
Closed, Resolved | Public | BUG REPORT

Description

Steps to replicate the issue (as an example, as of right now)

  1. Navigate to https://en.wikipedia.org/api/rest_v1/page/mobile-html/India_national_cricket_team (the mobile-html page for this article's canonical title). Note that the ETag response header states that the revision being used is 1313327976, which (at time of writing) is indeed the current revision of the article.
  2. Navigate to the mobile-html page for the title Indian_cricketer (which onwiki is a redirect to this article): https://en.wikipedia.org/api/rest_v1/page/mobile-html/Indian_cricketer.
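The two steps above can also be checked without a browser. Below is a minimal Python sketch, assuming the ETag value begins with the revision ID followed by a slash (e.g. `"1313327976/<suffix>"`); that shape is inferred from this report rather than from a specification, and the function names and User-Agent string are placeholders.

```python
import re
import urllib.parse
import urllib.request
from typing import Optional

REST_BASE = "https://en.wikipedia.org/api/rest_v1/page/mobile-html/"


def revision_from_etag(etag: Optional[str]) -> Optional[int]:
    """Extract the leading revision ID from an ETag value.

    Assumption: the value looks like "1313327976/<suffix>", optionally with
    a weak-validator prefix (W/). Returns None if that shape is absent.
    """
    if not etag:
        return None
    match = re.match(r'^(?:W/)?"?(\d+)(?:/|"|$)', etag.strip())
    return int(match.group(1)) if match else None


def fetch_mobile_html_revision(title: str) -> Optional[int]:
    """Request mobile-html for a title and return the revision in its ETag."""
    url = REST_BASE + urllib.parse.quote(title, safe="")
    request = urllib.request.Request(
        url,
        # Placeholder User-Agent; replace with real contact details.
        headers={"User-Agent": "etag-check-sketch/0.1 (example contact)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return revision_from_etag(response.headers.get("ETag"))
```

Calling `fetch_mobile_html_revision("India_national_cricket_team")` and `fetch_mobile_html_revision("Indian_cricketer")` and comparing the two results would reproduce the check described in the steps, if the issue is occurring at the time.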

What happens?
The mobile-html page for this redirected title returns an outdated version of the target article. The ETag response header says that the revision being used is 1312522885, a revision from 4+ days ago that has had 14 revisions come after it (as of the time of writing).

What should have happened instead?
Presumably, the endpoint should have either returned non-outdated content, or done something else (e.g. redirect to the endpoint for the article's canonical title).
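For illustration only, here is how a client could resolve a redirect title to the canonical title itself before calling the endpoint. This is a sketch of a possible client-side workaround, not of how the endpoint behaves or should be fixed. It assumes the Action API is queried with `action=query`, `redirects=1` and `formatversion=2`, and that the response carries `normalized` and `redirects` lists of `from`/`to` pairs; the function names are hypothetical.

```python
from typing import Any, Dict


def redirect_resolution_params(title: str) -> Dict[str, str]:
    """Action API parameters asking the server to resolve redirects."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "titles": title,
        "redirects": "1",
    }


def canonical_title(response: Dict[str, Any], title: str) -> str:
    """Follow title normalization, then one redirect hop, if present."""
    query = response.get("query", {})
    for step in ("normalized", "redirects"):
        for entry in query.get(step, []):
            if entry.get("from") == title:
                title = entry["to"]
                break
    return title
```

The sketch follows a single redirect hop only; a title that is not a redirect is returned unchanged.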

Impact
Judging by a user-report here, this might mean that users of the Wikipedia apps who click on a wikilink to a redirect may be shown an outdated version of the target page.

Notes on reproducibility
I coded together an extremely hacky piece of JavaScript to run in my browser's console, that:
(a) uses the MediaWiki Action API to fetch a list of articles that were last edited between 30 minutes and 1 hour ago (along with those articles' most recent revision IDs, and the list of titles that redirect to those articles);
(b) queries the mobile-html endpoint for some of the redirect titles returned by the Action API; and
(c) compares the most recent revision ID of the target article (according to the Action API) to the revision ID of the content returned by mobile-html in response to the redirect title (according to its ETag header).
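The comparison in step (c) can be sketched roughly as follows. This is not the original JavaScript, just a hedged Python illustration. It assumes the Action API was queried with `prop=info|redirects` and `formatversion=2`, so that each page object carries `lastrevid` and a `redirects` list, and that the revision IDs served by mobile-html have already been collected into a dictionary keyed by redirect title. Selecting the recently edited articles (step (a)) is left out, and `Mismatch`, `redirects_query_params` and `find_mismatches` are hypothetical names.

```python
from typing import Any, Dict, List, NamedTuple, Optional


class Mismatch(NamedTuple):
    redirect_title: str
    target_title: str
    latest_revid: int
    served_revid: Optional[int]


def redirects_query_params(titles: List[str]) -> Dict[str, str]:
    """Action API parameters for latest revision IDs and incoming redirects."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "info|redirects",
        "rdlimit": "max",
        "titles": "|".join(titles),
    }


def find_mismatches(
    pages: List[Dict[str, Any]],
    served_revids: Dict[str, Optional[int]],
) -> List[Mismatch]:
    """Compare each target's latest revision with what its redirects served."""
    mismatches = []
    for page in pages:
        for redirect in page.get("redirects", []):
            redirect_title = redirect["title"]
            if redirect_title not in served_revids:
                continue  # this redirect was not sampled
            served = served_revids[redirect_title]
            if served != page["lastrevid"]:
                mismatches.append(
                    Mismatch(redirect_title, page["title"], page["lastrevid"], served)
                )
    return mismatches
```

A redirect whose ETag could not be parsed (served revision `None`) is reported as a mismatch here, which may or may not be the desired behaviour.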

Running that script just now, it apparently found 11 mismatches (i.e., where the mobile-html ETag revID is different to the target article's latest revID) from querying the mobile-html endpoint for 51 redirect titles. (Disclaimer: I didn't independently verify each of these results, and I also have no idea how representative this is of the overall scale of the problem.)

Other information/Notes

  • Boldly tagging Content-Transform-Team, as they've been tagged in previous issues for Page Content Service returning outdated content (e.g. T398243#10964438)
  • I don't know whether this applies to endpoints other than page/mobile-html -- it might do, but I haven't tested them :)
  • I don't know whether this is the same issue described in tasks such as T398243 or not (given that I can reproduce the issue using redirects to an article, but I haven't currently reproduced the issue when using an article's canonical title). Filing a new task for this just in case.
  • Also tagging the Wikipedia app projects here for visibility (as there's a report of this happening on the iOS app, and I assume that it'd occur on the Android app as well).

Event Timeline

Okay, so I ran my revID-mismatch-testing script again just now, and got no mismatches from checking the mobile-html for 100 enwiki redirects. Maybe something changed between last Thursday and now?

I ran the testing script again a few days ago (on ~100 enwiki redirects), and there were again no mismatches. I'll leave the decision here up to CTT (given that this task's been pulled into their work-in-progress board, so they might want to e.g. investigate the original root cause of this); but as the task filer, I just want to let it be known that I'd personally be okay with optimistically resolving this for now, given that I can no longer reproduce this in the way that I originally could. (If it's experienced again, this task could always be reopened.)

Closing this one. Let's reopen it if we encounter the same issue.

I'll just note that it shouldn't be considered unusual for there to be *some* lag between PCS and the Action API. Up to 15 minutes or so? So if this bug seems to recur in the future, it would be helpful to add some measure of 'time lag' (i.e., how long ago the revision reported by the Action API was created), so we know whether the difference is within normal expectations.
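A minimal sketch of such a 'time lag' measure, assuming the latest revision's timestamp comes from the Action API (e.g. `prop=revisions` with `rvprop=ids|timestamp`) in the usual `YYYY-MM-DDTHH:MM:SSZ` form. The 15-minute threshold is only the rough figure mentioned above, not a documented guarantee, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Rough figure from the comment above; not a documented guarantee.
EXPECTED_LAG = timedelta(minutes=15)


def parse_mw_timestamp(timestamp: str) -> datetime:
    """Parse an Action API timestamp such as 2025-01-01T11:50:00Z as UTC."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def revision_age(timestamp: str, now: datetime) -> timedelta:
    """How long ago the revision reported by the Action API was created."""
    return now - parse_mw_timestamp(timestamp)


def within_expected_lag(
    timestamp: str, now: datetime, expected: timedelta = EXPECTED_LAG
) -> bool:
    """True if a mismatch could plausibly be ordinary propagation delay."""
    return revision_age(timestamp, now) <= expected
```

Recording `revision_age(...)` next to each mismatch would separate cases where the newest revision is only minutes old from cases like the one in the description, where the served revision was days behind.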