Revision API cannot be cached in varnish when requested by a user with deletedtext or similar rights
Open, Needs Triage · Public

Description

When a user with a right to see deleted content (deletedtext, deletedhistory, etc.) uses the revision API, deleted entries are returned and the response's cache mode is set to private. That behavior is usually not needed, and it should be possible to disable it so that API results can be cached.

This is a generic problem that probably affects a few other APIs as well, but we have to start somewhere, and for the revision API there is a concrete use case where this gets in the way (T97096).

Tgr created this task. Jul 2 2015, 10:03 PM
Tgr added projects: MediaWiki-API, Performance.
Anomie added a subscriber: Anomie. Edited · Jul 6 2015, 4:18 PM

> This is a generic problem that probably affects a few other APIs as well

Looking through core query modules, I see the following. There might be some opportunities in non-query modules or extensions too, but that didn't seem worth the time to look for.

For revdel:

  • prop=revisions
  • prop=deletedrevisions
  • list=alldeletedrevisions
  • list=recentchanges
  • prop=imageinfo, prop=stashimageinfo
  • list=logevents
  • list=usercontribs

Similar issues with different reasons, that could probably also be fixed by some sort of "public info only" flag (and maybe some logic to make public/private conditional on the extra rights):

  • list=allusers, because of the possibility for having the 'hideuser' right.
  • list=users, because of the possibility for having the 'hideuser' right.
  • list=blocks, because of the possibility for having the 'hideuser' right.
  • list=usercontribs, because of the possibility for patrol flags (like Gerrit change 220319).
  • list=recentchanges with show=patrolled or show=!patrolled (like Gerrit change 220319).
  • list=logevents when non-public logs (e.g. suppression log) are viewable by the current user.
  • prop=deletedrevisions, list=alldeletedrevisions, list=filearchive. Legal has cleared exposing certain information publicly here, and if I ever get around to actually doing that we'd have to do the same sort of "if user has advanced permissions, private, else public" deal.
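The "if user has advanced permissions, private, else public" rule mentioned above, combined with a "public info only" flag, might be sketched roughly as follows. This is an illustration in Python, not MediaWiki code: the function name `cache_mode`, the `public_only` parameter, and the exact contents of `PRIVILEGED_RIGHTS` are assumptions made for the example.

```python
from typing import FrozenSet, Iterable

# Hypothetical set of rights that expose non-public data. The names come from
# this task (deletedtext, deletedhistory, hideuser); the grouping is assumed.
PRIVILEGED_RIGHTS: FrozenSet[str] = frozenset(
    {"deletedtext", "deletedhistory", "hideuser"}
)


def cache_mode(user_rights: Iterable[str], public_only: bool = False) -> str:
    """Pick a cache mode for a response.

    With public_only set, the response is built as if for an anonymous user,
    so it can be shared. Otherwise it is private only when the user holds at
    least one right that could change what the response contains.
    """
    if public_only:
        return "public"
    if PRIVILEGED_RIGHTS & set(user_rights):
        return "private"
    return "public"
```

With this rule, a user holding only `read` gets a public response, a user holding `deletedtext` gets a private one, and the same user gets a public one again when asking for public info only.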

Non-rights-related issues that might still be fixable:

  • meta=allmessages, when amlang is not given. Could probably be deprecated in favor of uselang.
  • meta=siteinfo&prop=interwikimap when $wgExtraInterlanguageLinkPrefixes is set, because it uses i18n messages. Could probably be fixed up now that the API supports uselang.
  • When prop=parsedcomment is used with list=recentchanges, list=logevents, list=protectedtitles. It looks like the only real issue there is that Linker::formatAutocomments() uses $wgLang instead of $wgContLang.

Probably unfixable:

  • prop=info with inprop watched, watchers, notificationtimestamp, or readable. Fortunately these are not the default; just don't query them if you don't want them.
    • Also prop=info&intestactions=... should be private, but currently isn't. I'll fix that.
  • When deprecated token parameters are used with prop=info, list=recentchanges, prop=revisions, list=users.
  • meta=allmessages when amenableparser is used, since it uses user-specific parser preferences.
  • list=querypage when the queried special page has restrictions.
  • meta=userinfo, which exists entirely to give information about the current user.
  • list=watchlist when wlowner is used might be fixable, but is unlikely to be worth the effort (and it also has the revdel, patrol, parsedcomment, and deprecated token issues that list=recentchanges has). Unfixable in the default "current user's watchlist" mode.
  • list=watchlistraw when wrowner is used might be fixable, but is unlikely to be worth the effort. Unfixable in the default "current user's watchlist" mode.
Anomie added a comment. Jul 6 2015, 4:24 PM

As for a "public info only" flag, one easy way to do it would be to have a parameter to ApiMain to have it set $wgUser and the context user to an anon, much like it does already for jsonp mode.
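A minimal sketch of that idea, written in Python rather than PHP and not taken from ApiMain: when the flag is set, the request is processed as an anonymous user, so rights-dependent data never enters the response. The names `User`, `ANON`, `effective_user`, and `visible_revisions` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class User:
    name: str
    rights: FrozenSet[str] = field(default_factory=frozenset)


# Hypothetical stand-in for an anonymous user with only basic read access.
ANON = User(name="(anonymous)", rights=frozenset({"read"}))


def effective_user(session_user: User, public_only: bool) -> User:
    """Return the user the request should be processed as."""
    return ANON if public_only else session_user


def visible_revisions(revisions: List[Dict], user: User) -> List[Dict]:
    """Filter out deleted revisions unless the user may see deleted text."""
    can_see_deleted = "deletedtext" in user.rights
    return [r for r in revisions if can_see_deleted or not r["deleted"]]
```

Because every later permission check sees the anonymous user, individual modules would not need their own "public only" handling under this approach.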

> As for a "public info only" flag, one easy way to do it would be to have a parameter to ApiMain to have it set $wgUser and the context user to an anon, much like it does already for jsonp mode.

A "public info only" flag now exists in the form of the request header Treat-as-Untrusted, available since rMW9ec1ef7308ac (API: Add "standard" header and hook for lacksSameOriginSecurity()).
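As a rough illustration of how a client might use that header, the Python sketch below builds a revision query URL and a matching header set. The helper name `build_public_only_request`, the header value "1", the example URL, and the choice to also send the API's `maxage` and `smaxage` parameters are assumptions; whether the response is actually cached depends on the wiki's configuration.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode


def build_public_only_request(
    api_url: str, params: Dict[str, str], max_age: int = 300
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a GET request that asks for public data only."""
    query = dict(params)
    # maxage/smaxage ask the API to emit cache-control values for the response.
    query.update({"format": "json", "maxage": max_age, "smaxage": max_age})
    # The header value is an assumption for this sketch.
    headers = {"Treat-as-Untrusted": "1"}
    return f"{api_url}?{urlencode(query)}", headers


url, headers = build_public_only_request(
    "https://example.org/w/api.php",  # placeholder endpoint
    {"action": "query", "prop": "revisions", "titles": "Main Page"},
)
```

The resulting `url` and `headers` can be passed to any HTTP client, for example `urllib.request.Request(url, headers=headers)`.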