Disable caching on the main page for anonymous users
Open, High, Public

Description

Anonymous users cannot view the latest version of the main page unless they purge the cache or have their browsers do it for them. Is there any chance the caching mechanism can be altered?

Superyetkin updated the task description.
Superyetkin raised the priority of this task to High.
Superyetkin added a subscriber: Superyetkin.
Restricted Application added a subscriber: Aklapper. · Nov 23 2015, 9:37 AM

Today, on November 28th, anonymous users are shown the November 10th version of the Estonian Wikipedia main page.

Ijon raised the priority of this task from High to Unbreak Now!. · Feb 12 2017, 7:44 PM
Ijon added a subscriber: Ijon.
Restricted Application added subscribers: Jay8g, TerraCodes. · Feb 12 2017, 7:44 PM
Ijon added a comment. · Feb 12 2017, 7:45 PM

I have changed this to Unbreak Now! -- it's not okay that our readers are getting stale versions of the main page, on any wiki. It should have been handled long ago.

bd808 edited projects, added Traffic; removed MediaWiki-Cache. · Feb 12 2017, 10:27 PM
bd808 added a subscriber: bd808.

Is the problem primarily that the Main Page uses [[{{LOCALDAY}}. {{LOCALMONTHNAME}}]] [[{{LOCALYEAR}}]]? These are all magic words that are fundamentally incompatible with content caching. If they were changed daily using a bot instead, MediaWiki would send out the proper purge events.

Ijon added a comment. · Feb 12 2017, 10:32 PM

Thanks, @bd808! That's helpful insight. I guess the Estonian main page needs to be fixed not to use those.

trwiki is using {{#time:Y-m-d}} which again is incompatible with any type of caching.

bd808 added a comment. · Edited · Feb 12 2017, 10:35 PM

I would suggest lowering this from UBN! back to High and changing the task summary so that the goal becomes educating the various wiki communities about parse-time magic words that should not be used to provide content or link targets.

trwiki is using {{#time:Y-m-d}} which again is incompatible with any type of caching.

Huh? We don't really allow features that are "incompatible with any type of caching." It sounds like the opposite is happening here, no? If we're showing stale content, then caching is obviously taking place.

I believe the day magic words/parser functions have some special logic that reduces the parser cache time.

Example: https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/MagicWord.php;dffa61be3e9d245871d04980ed584cfbdaef05e3$178.

There are many cache layers. It sounds like we're possibly not doing a great job purging/invalidating some HTML cache layer?

Is the problem primarily that the Main Page uses [[{{LOCALDAY}}. {{LOCALMONTHNAME}}]] [[{{LOCALYEAR}}]]? These are all magic words that are fundamentally incompatible with content caching. If they were changed daily using a bot instead, MediaWiki would send out the proper purge events.

Let us take Spanish wiki for example: https://es.wikipedia.org/wiki/Wikipedia:Portada

I don't see any problem there. It works. I mean I can assume that LOCALYEAR isn't ok and CURRENTYEAR is ok, but is this really the case here?

I believe the day magic words/parser functions have some special logic that reduces the parser cache time.

Example: https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/MagicWord.php;dffa61be3e9d245871d04980ed584cfbdaef05e3$178.

There are many cache layers. It sounds like we're possibly not doing a great job purging/invalidating some HTML cache layer?

I'm not sure that those ParserCache TTL hints actually make it all the way out to OutputPage::sendCacheControl(), which sets the headers that Varnish would respond to. I've never tunneled all the way down to the parser function level before to see how this might work in practice. It looks to me like direct calls to OutputPage::setCdnMaxage() and/or OutputPage::lowerCdnMaxage() are needed to change the Cache-Control: s-maxage=... value that OutputPage::output() adds to the response. I'm not seeing anywhere that Parser would trigger those calls.
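To make the header mechanics concrete, here is a minimal Python sketch of the behaviour described above. It is not MediaWiki code: `ResponseModel` and its methods are hypothetical stand-ins, loosely modelled on the idea behind OutputPage::setCdnMaxage() and OutputPage::lowerCdnMaxage(), and the 14-day starting value is only an assumed figure for illustration.

```python
class ResponseModel:
    """Toy model of a response's CDN lifetime (hypothetical, not MediaWiki's API)."""

    def __init__(self, cdn_max_age: int) -> None:
        # Lifetime, in seconds, that shared caches such as Varnish may keep the page.
        self.cdn_max_age = cdn_max_age

    def lower_cdn_max_age(self, ttl: int) -> None:
        # A "lower" call may only shorten the lifetime, never extend it.
        self.cdn_max_age = min(self.cdn_max_age, ttl)

    def cache_control(self) -> str:
        # Value that would be sent in the Cache-Control response header.
        return f"s-maxage={self.cdn_max_age}"


response = ResponseModel(cdn_max_age=14 * 24 * 3600)  # assumed default
response.lower_cdn_max_age(3600)  # what a parser TTL hint would have to trigger
print("Cache-Control:", response.cache_control())
```

If nothing ever calls the "lower" step, the header keeps the long default no matter what TTL the parser recorded, which is the gap suspected in this comment.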

Even English Wikipedia's main page uses Wikipedia:Today's featured article/{{#time:F j, Y}}, it's a very common trick. Using time-dependent magic words reduces the parser cache TTL (to either 24h or 1h, depending on the magic word). The #time parser function also sets the TTL, but it sets it on the preprocessor frame, which doesn't appear to propagate to the ParserCache. This would suggest that pages only using {{#time:}} but no magic words like {{CURRENT...}}/{{LOCAL...}} wouldn't get the right parser cache TTL set.

This would explain why the trwiki main page is reported to be stale (it only uses {{#time:}}, nothing else) and why enwiki's isn't (it uses {{CURRENTDAYNAME}}), but it doesn't explain why etwiki's is reported to be stale, since it uses {{LOCALDAY}}.
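The hypothesis above can be sketched as a small Python model. This is an illustration only, not MediaWiki's implementation: the function name, the 30-day default and the example TTLs are assumptions. The one behaviour it encodes is that TTLs set by magic words lower the parser cache expiry, while TTLs set on the preprocessor frame (as {{#time:}} does) are ignored.

```python
from typing import Iterable

ASSUMED_DEFAULT_EXPIRY = 30 * 24 * 3600  # illustrative default, in seconds


def parser_cache_expiry(
    magic_word_ttls: Iterable[int],
    frame_ttls: Iterable[int],
    default: int = ASSUMED_DEFAULT_EXPIRY,
) -> int:
    """Return the expiry the parser cache would end up with in this model."""
    # Magic words such as {{CURRENT...}}/{{LOCAL...}} lower the expiry.
    expiry = min([default, *magic_word_ttls])
    # Frame TTLs are deliberately not consulted: per the analysis above,
    # they do not appear to propagate to the ParserCache.
    _ = list(frame_ttls)
    return expiry


# A page using only {{#time:}} keeps the long default ...
print(parser_cache_expiry(magic_word_ttls=[], frame_ttls=[3600]))
# ... while a page that also uses a time-dependent magic word is shortened.
print(parser_cache_expiry(magic_word_ttls=[3600], frame_ttls=[3600]))
```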

I went and viewed a few of these in incognito (around 10:50 UTC on Feb 13):

trwiki:

NewPP limit report
Parsed by mw1174
Cached time: 20170209213001
Cache expiry: 3600
Dynamic content: true

etwiki:

NewPP limit report
Parsed by mw1274
Cached time: 20170212201347
Cache expiry: 3600
Dynamic content: true

enwiki:

NewPP limit report
Parsed by mw1199
Cached time: 20170213104450
Cache expiry: 3600

eswiki:

NewPP limit report
Parsed by mw1182
Cached time: 20170213094253
Cache expiry: 3600
Dynamic content: true

enwiki looks fine, eswiki looks a little bit weird (just over an hour old while the cache expiry claims to be an hour), but etwiki is showing me a 14-hour-old page and trwiki's is almost 4 days old. Maybe cache turnover or other strange effects are making this work for large wikis but not small ones?

(Also, trwiki's main page expiry is listed as 3600, so that suggests that {{#time:}} does set the TTL correctly, or else maybe a template used on that page uses a time-based magic word.)
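The ages quoted above can be re-derived from the "Cached time" values in the limit reports. The sketch below assumes a viewing time of exactly 10:50 UTC on Feb 13 (the comment says "around") and that "Cached time" is a UTC timestamp in YYYYMMDDHHMMSS form; the helper name `cache_age` is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumed viewing time: "around 10:50 UTC on Feb 13".
VIEWED_AT = datetime(2017, 2, 13, 10, 50, tzinfo=timezone.utc)

# (Cached time, Cache expiry in seconds) copied from the reports above.
REPORTS = {
    "trwiki": ("20170209213001", 3600),
    "etwiki": ("20170212201347", 3600),
    "enwiki": ("20170213104450", 3600),
    "eswiki": ("20170213094253", 3600),
}


def cache_age(cached_time: str, now: datetime = VIEWED_AT) -> timedelta:
    """Age of a cached page, given its 'Cached time' string."""
    parsed = datetime.strptime(cached_time, "%Y%m%d%H%M%S")
    return now - parsed.replace(tzinfo=timezone.utc)


for wiki, (cached_time, expiry) in REPORTS.items():
    age = cache_age(cached_time)
    status = "past expiry" if age.total_seconds() > expiry else "within expiry"
    print(f"{wiki}: age {age}, {status}")
```

Under these assumptions only enwiki is within its stated one-hour expiry; the other three pages are older than the expiry they advertise.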

Anomie added a subscriber: Anomie. · Feb 13 2017, 2:39 PM

Some comments:

  • This seems like it may be a duplicate of T51803: Calculated Age of persons can become outdated until next cache purge.
  • {{#time:}}'s TTL doesn't seem to be being propagated, so that 3600 must be coming from somewhere else. In trwiki, it seems to be coming from Vikipedi:Anasayfa yeni başlık.
  • The documentation on PPFrame::getTTL() specifically states that it isn't propagated to the parser cache expiry. There's no indication in I412febf3 as to why not. Note if we do do that we may want to enforce some minimum, since {{#time:}} will correctly indicate a 1-second expiry if it's outputting the seconds.
  • It doesn't look like the parser cache expiry is propagated to the OutputPage expiry either.
  • A patch from 2013 to make #time set the parser cache TTL to 12 hours was eventually abandoned, with statements that magic words like {{CURRENTDAY}} should have their TTLs raised.
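The idea of propagating the frame TTL while enforcing a minimum could look roughly like the following Python sketch. It is not a proposed patch: the function name and the 5-minute floor are assumptions chosen for illustration.

```python
from typing import Optional

ASSUMED_MIN_TTL = 300  # illustrative floor, in seconds


def propagated_expiry(
    cache_expiry: int,
    frame_ttl: Optional[int],
    min_ttl: int = ASSUMED_MIN_TTL,
) -> int:
    """Lower cache_expiry to the frame TTL, but never below min_ttl."""
    if frame_ttl is None:
        # Nothing time-dependent was used on the frame.
        return cache_expiry
    # Clamp very small TTLs (e.g. 1 second when {{#time:}} outputs seconds).
    clamped = max(frame_ttl, min_ttl)
    # Propagation may only shorten the existing expiry.
    return min(cache_expiry, clamped)


print(propagated_expiry(cache_expiry=86400, frame_ttl=1))     # floor applies
print(propagated_expiry(cache_expiry=86400, frame_ttl=None))  # unchanged
```

The same clamping step would be needed again if the parser cache expiry were later passed on to the OutputPage expiry.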
jrbs added a subscriber: jrbs. · Feb 13 2017, 5:51 PM
greg lowered the priority of this task from Unbreak Now! to High. · Edited · Feb 13 2017, 6:24 PM
greg added a subscriber: greg.

This is not an emergency, lowering to High.

Ijon added a comment. · Feb 13 2017, 9:48 PM

Agree re lowering to High. But it does seem like a bug and not etwiki's using inappropriate magic words.

Tgr added a subscriber: Tgr. · Feb 13 2017, 10:08 PM

#time and co. are used on many pages and usually they do not require cache invalidation. For example {{update after}} compares its arguments to the current date to decide whether the page should be in some "needs update" category, which is absolutely no reason to limit HTML cache expiry to one day.

Maybe we should prevent CURRENT* from changing parser cache expiry as well, and introduce a dedicated magic word which does that instead (and correctly propagates to Varnish) so that it is easier to differentiate between maintenance-related date logic and actual volatile content.

#time and co. are used on many pages and usually they do not require cache invalidation. For example {{update after}} compares its arguments to the current date to decide whether the page should be in some "needs update" category, which is absolutely no reason to limit HTML cache expiry to one day.

Well, ideally it would limit cache expiry to "however much time is left until the comparison changes state". People do want the tag and the maintenance category to start showing up as close to the target date as possible. Lacking that knowledge of how its output is going to be used, though, {{CURRENTDAY}} limiting expiry to "tomorrow" isn't at all unreasonable from a behavior perspective (even though it probably sucks from a performance perspective).
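As a rough illustration of the two expiry policies contrasted here, the Python sketch below computes "time left until the comparison changes state" and "until tomorrow". The function names are hypothetical, and it uses UTC midnight for simplicity, whereas the LOCAL* magic words would follow the wiki's own time zone.

```python
from datetime import datetime, timedelta, timezone


def seconds_until(target: datetime, now: datetime) -> int:
    """Ideal expiry: time left until a date comparison flips (0 if already past)."""
    return max(0, int((target - now).total_seconds()))


def seconds_until_next_day(now: datetime) -> int:
    """Conservative expiry: time left until the calendar day changes."""
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return seconds_until(next_midnight, now)


now = datetime(2017, 2, 14, 17, 44, tzinfo=timezone.utc)
update_after = datetime(2017, 3, 1, tzinfo=timezone.utc)  # example target date

print(seconds_until(update_after, now))  # could be cached for about two weeks
print(seconds_until_next_day(now))       # but {{CURRENTDAY}} only knows "tomorrow"
```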

Tgr added a comment. · Feb 14 2017, 5:44 PM

Well, ideally it would limit cache expiry to "however much time is left until the comparison changes state".

Parser cache expiry, yes. Browser/varnish cache expiry, no. Logged-in users are already uncached and anonymous users don't care about maintenance categories (which are probably hidden anyway).

I removed the date info from the main page of Estonian Wikipedia, but that only hides the issue rather than solving it (the weekly changing content is still affected). The same problem applies to MediaWiki:Sitenotice and other announcements. For not-logged-in users they are sometimes visible and sometimes not, and I've heard from some people that on a few occasions there were significant delays (some notices were shown a month later, when they were no longer relevant).

ema moved this task from Triage to Caching on the Traffic board. · Mar 2 2017, 10:05 AM
This comment was removed by StevenJ81.
Umar added a comment. · Apr 1 2018, 7:55 PM

What should I do to fix the problem? Sorry for the trouble!

For me, it seems that the issue has grown even bigger over time. The delay with Estonian Wikipedia is often around 3 weeks (!!!), which means that not-logged-in users hardly ever see up-to-date info when they visit the main page. Could it please be fixed somehow?

Are you saying content (other than {{#time or {{LOCALDAY etc) is not updating?

Fwiw: I'm of the opinion that date magic words should reduce the Varnish cache time to at most 24 hours, maybe six hours. I'm doubtful that super long cache times for all pages in Varnish are really worth it...

BBlack added a subscriber: BBlack. · Sat, Nov 17, 12:17 AM

Fwiw: I'm of the opinion that date magic words should reduce the Varnish cache time to at most 24 hours, maybe six hours.

The Varnish caches already self-limit to 24 hours per layer, aside from honoring MW's CC/Max-age claims, but there are some deeper edge-case issues here:

  1. It's possible, especially for something like a hot Main_page, for the 24 hours to stretch to 48-72 hours due to imperfect refresh timing between the up to 2-3 layers (depending on the edge).
  2. But, this is limited by the CC:maxage/Age values sent by MW (which are currently over-long)
  3. But also, Varnishes can potentially keep very hot objects alive much longer if MW lies about 304 Not Modified, see the long threads at T124954#2399694 . Things have moved on since even that thread was last updated, and we're probably overdue for turning MW's $SquidMaxAge down even further from its current value, but that's a bit broader in scope than this ticket. It's probably still 14 days, and we could test stepping it down in stages to 7 days and then even ~3 days at this point, before we have to re-consider edge cache issues too hard (about how long our "keep" timers are).
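Points 1 and 2 can be put into numbers with a small Python sketch. This is a simplified model, not a description of the actual Varnish configuration: it assumes each layer can add up to its full 24-hour limit in the worst case, caps the total at the max-age sent by MediaWiki, and ignores the 304-related effect from point 3. The function name and defaults are illustrative.

```python
HOUR = 3600
DAY = 24 * HOUR


def worst_case_age(
    layers: int,
    per_layer_ttl: int = DAY,
    mw_max_age: int = 14 * DAY,  # "probably still 14 days", per the comment above
) -> int:
    """Upper bound on object age, in seconds, under this simplified model."""
    # Each layer may refresh from the next one just before that copy expires,
    # so the per-layer limits can stack in the worst case.
    stacked = layers * per_layer_ttl
    # The max-age/Age values from MediaWiki bound the total.
    return min(stacked, mw_max_age)


for layers in (2, 3):
    print(layers, "layers:", worst_case_age(layers) // HOUR, "hours")
```

With a 14-day max-age the cap never applies for 2-3 layers; it would only start to matter if the max-age were stepped down below about 3 days.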

I'm doubtful that super long cache times for all pages in Varnish are really worth it...

They're not, for hitrate, but they are important for other operational concerns about taking servers/DCs in and out of traffic flow without major disruptions. For that reason, we really try to avoid hot content having <1d TTLs for now.

greg removed a subscriber: greg. · Sat, Nov 17, 12:18 AM

For me, it seems that the issue has grown even bigger over time. The delay with Estonian Wikipedia is often around 3 weeks (!!!), which means that not-logged-in users hardly ever see up-to-date info when they visit the main page. Could it please be fixed somehow?

Are you saying content (other than {{#time or {{LOCALDAY etc) is not updating?

Yes, that has always been the case. The date was just very easy to spot, and people often asked, "Why is that thing wrong on the front page?" (Wikipedians, on the other hand, never notice it, as we are constantly logged in.) Because we don't change the front page itself often (the content comes via sub-pages, which are themselves selected via the HETKENÄDAL (CURRENTWEEK) magic word), the system seems to treat this high-traffic page as something whose cache needs updating only once a month.