The API defaults to using the user language, which means the response depends on the identity of the user and cannot be cached publicly for logged-in users (unless uselang=content or something similar is set). Given that most API modules don't have any language-dependent part, this is unintuitive and confusing.
Description
Details
Subject | Repo | Branch | Lines +/- | |
---|---|---|---|---|
Add maxage, s-maxage and uselang parameters in API call | mediawiki/extensions/Popups | master | +4 -1 |
Related Objects
- Mentioned In
- T155314: Varnish does not cache Action API responses when logged in
T153388: mw.widgets.TitleWidget should cache (client-side) the getInterwikiPrefixesPromise result
T126455: Request cached copy of machine generated Related Articles
T108909: Investigate caching of API requests
rMEXTea0f1797f44d: Updated mediawiki/extensions Project: mediawiki/extensions/Popups…
rEPOPd0486550e94d: Add maxage, s-maxage and uselang parameters in API call
T104650: Revision API cannot be cached in varnish when requested by a user with deletedtext or similar rights
T100262: [SPIKE] Which API requests should be cached and for how long?
T88170: Overall performance review of Popups/Hovercards extension - Mentioned Here
- T104650: Revision API cannot be cached in varnish when requested by a user with deletedtext or similar rights
P867 hovercards api
Event Timeline
That's a general API regression: there's "ApiMain::setCacheMode: downgrading cache mode 'public' to 'anon-public-user-private' due to uselang=user" in debug log due to ApiMain::setCacheMode().
The PopUps extension should be specifying maxage and/or smaxage parameters and a language other than "user" if it wants cacheability.
I tried to have the default not be "user", but people complained. See https://lists.wikimedia.org/pipermail/wikitech-l/2014-November/079287.html and https://gerrit.wikimedia.org/r/#/c/171050/. Also an IRC discussion with Ori on 2014-11-04, which unfortunately is before <#mediawiki-core> had a bot. I can post it from my own logs if Ori doesn't mind.
I believe this should be revisited. That API requests are never cached for a module which has nothing whatsoever to do with languages, unless the caller sets a language, is extremely unintuitive and means that most API requests will have suboptimal performance. Also, returning the wrong language is obvious and will force the developer to fix the request parameters, but having suboptimal performance is non-obvious and chances are no one will notice. (This was caught after one year, and only when it contributed to a site outage.) With our recent focus on performance, this choice of default is IMO indefensible. At a mininum, API modules which are heavily involved with languages should explicitly set their default to user and everything else should default to content.
Having different defaults for different modules would be painful, particularly here where the ContextSource is set up well before the module being used is determined. And when you get into query submodules the situation gets even worse.
What would be the way to discuss defaulting to contentlang, then? Make an RfC about it?
Change 222287 had a related patch set uploaded (by Prtksxna):
Add maxage, s-maxage and uselang parameters in API call
I am getting inconsistent results based on whether I test using Chrome or curl. I get private and must-revalidate, along with max-age on Chrome, but when I use curl I get maxage, s-maxage and public. Note that I am using uselang=content for these tests.
From the reading I did, I understand that having private and must-revalidate would essentially mean that maxage and s-maxage would be ignored. Is that right?
Chrome—
curl—
More likely logged-in versus logged-out, and when you're logged-in you're using a user account that has the rights necessary to see revision-deleted content (those being any of deletedhistory, deletedtext, suppressrevision, or viewsuppressed). prop=revisions sets 'private' mode in that case to avoid revision-deleted content being accidentally included in caches somehow.
Yep, that was it.
when you're logged-in you're using a user account that has the rights necessary to see revision-deleted content (those being any of deletedhistory, deletedtext, suppressrevision, or viewsuppressed). prop=revisions sets 'private' mode in that case to avoid revision-deleted content being accidentally included in caches somehow.
I see, and does private imply that there is effectively no caching for that request?
Short answer: no, some caching is allowed.
Long answer:
The header in question is
Cache-control: private, must-revalidate, max-age=300
"private" means that the response may not be cached in a shared cache, such as varnish. The user's browser may still cache it (for the individual user), subject to other conditions. Same goes for any other "private" (i.e. non-shared) cache.
"must-revalidate" means that the cache (the browser here) can't keep the response after it's considered stale, and "max-age=300" says it's considered stale after 5 minutes.
Understood. Thanks for taking the time to explain this @Anomie
So, the patch I submitted will have some positive effect! Mind reviewing it?
Would it be more performant to have to separate API calls, one that gets the text extract and image, and another that gets the revisions? That way at least the first call could get cached at varnish?
That depends on whether the increased cacheability outweighs the overhead of making two requests; I have no idea. And for users who don't have advanced rights, the prop=revisions bit should also be cacheable already.
I doubt one uncacheable call and two cacheable ones would be an improvement over a single uncacheable call. In general, I wouldn't worry about cache-control: private for admins (or even all logged-in users). They are a tiny part of all requests sent so the strain on the servers is negligible. They will see the cards slightly slower, but I doubt that's a huge deal.
In the longer term, maybe we can add a noprivate=1 flag to the API to disable all non-public fields in the output and use proper caching.
Change 222287 merged by jenkins-bot:
Add maxage, s-maxage and uselang parameters in API call
Not really resolved, Popups has just been hardcoded to work around it. (I might have made a mess by not separating the Popups issue and the general issue; sorry about that. The general issue is more important and still needs addressing, though.)
Is this a regression? i.e. were API requests cached by default previously?
If so, when was it last working correctly?
API requests in general aren't marked as publicly cacheable by default, but the maxage and smaxage parameters enable it. Individual actions can default to public caching.
Previously, some API modules flagged uncacheable because they know they're depending on the user language (and some even recognized the "uselang" parameter (everything else did too, these just explicitly did)). Others just ignored a user language dependency, allowing for bad caching. These dependencies can be hidden deep in MediaWiki code or come from hooks or extensions, so it's not obvious from just looking at the API module itself.
Ib14c00df made the whole API recognize "uselang", and made the default be the wiki's content language so public cacheability would be preserved by default, and made uselang=user force no public caching to avoid trouble from these hidden dependencies. But people complained about that default not matching index.php, so Idc24bfc switched the default to the user's language.
So it's a bugfix or no-change for modules that somehow depend on the user language and a minor break for modules that don't, with no simple way to comprehensively define those sets.