Re-enable Squid updates in HTMLCacheUpdate
Closed, Resolved, Public

bzimport set Reference to bz43341.
MZMcBride created this task. Via Legacy, Dec 22 2012, 4:15 AM
tstarling added a comment. Via Conduit, Dec 30 2012, 10:38 PM

Your wording suggests that you don't know what that code does. The change is for a specific kind of Squid update.

My change disabled Squid updates for pages that use a given template or image when there are more than 500 pages to purge. Re-enabling it would cause the problem to recur: specifically, Apache overload caused by the Squid cache of a substantial fraction (say 10%) of the pages on the English Wikipedia being purged simultaneously.

Aaron's recent changes in this area caused updates of small sets of pages to also be disabled. They also made updates of templates with millions of invocations more robust, which makes a disaster even more likely if the Squid update were simply re-enabled.

MZMcBride added a comment. Via Conduit, Dec 30 2012, 10:54 PM

(In reply to comment #1)

> Your wording suggests that you don't know what that code does. The change is
> for a specific kind of Squid update.

Quite right. The symptom I'm experiencing is that I'll occasionally reach pages such as http://en.wikipedia.org/wiki/Wikipedia:SCOTUSWORK that are outdated, while the target of the redirect (in this case, http://en.wikipedia.org/wiki/Wikipedia:WikiProject_U.S._Supreme_Court_cases/Reports) is up-to-date. This only happens when I'm logged out and appears to only happen with redirects. When I'm logged in, the page content served (via a redirect or via the target of a redirect) is always up-to-date.

To me, this suggests that Squid cache is not updating properly. The bugs listed as "see also"s to this bug (bug 29552 and bug 38879) seem to suggest Squid cache may be to blame as well. It was a comment at bug 38879 (specifically bug 38879 comment 11) that pointed to this live hack as a possible culprit, so I filed a separate bug for further investigation. If you believe this bug is simply a duplicate of bug 29552 or bug 38879 or some other bug, feel free to mark it as such.

Sorry my initial bug report wasn't clearer. I was taking a shot in the dark in an attempt to get the problem I'm experiencing resolved.

MZMcBride added a comment. Via Conduit, Dec 30 2012, 11:08 PM

Just putting this here so I don't lose it forever:

me> TimStarling: I'm able to reproduce that bug fairly easily with a noticeboard, BTW. http://en.wikipedia.org/wiki/Wikipedia:BN reads "This page was last modified on 29 December 2012 at 23:40." while http://en.wikipedia.org/wiki/Wikipedia:Bureaucrats%27_noticeboard reads "This page was last modified on 30 December 2012 at 07:32."
me> TimStarling: Both requests are logged out.

Tim> yes, updates to redirects also go via that class
Tim> obviously there should be a limit to the number of pages that are purged simultaneously, instead of just disabling everything
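
Tim's point about capping the number of pages purged at once, rather than disabling purges entirely, can be illustrated with a short sketch. This is not MediaWiki code: `purge_in_batches`, `send_purge`, and the default batch size of 500 (borrowed from the threshold mentioned earlier in this thread) are assumptions made purely for illustration.

```python
import itertools
import time
from typing import Callable, Iterable, Iterator, List


def chunked(urls: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` URLs."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(urls)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch


def purge_in_batches(
    urls: Iterable[str],
    send_purge: Callable[[List[str]], None],  # hypothetical purge callback
    batch_size: int = 500,                    # assumed cap, not a real setting
    pause_seconds: float = 0.0,
) -> int:
    """Purge every URL, but never more than `batch_size` at a time.

    Returns the number of batches sent.
    """
    batches = 0
    for batch in chunked(urls, batch_size):
        send_purge(batch)
        batches += 1
        if pause_seconds > 0:
            # Spread the load so the backends are not hit all at once.
            time.sleep(pause_seconds)
    return batches
```

With a bounded batch size and a pause between batches, every page is still purged eventually, but the backends never see the whole set expire in the same instant.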

MZMcBride added a comment. Via Conduit, Dec 31 2012, 12:39 AM

(In reply to comment #2)

> It was a comment at bug 38879 (specifically bug 38879 comment 11) that
> pointed to this live hack as a possible culprit, so I filed a separate
> bug for further investigation.

Also bug 29552 comment 15 (though it's the same author).

I wrote a quick script that prints the "This page was last modified on " and "Served by " text for specified input pages. I tested with the following pairs:


base_url = 'http://en.wikipedia.org/wiki/'

pairs = [['Wikipedia:AN/I', 'Wikipedia:Administrators%27_noticeboard/Incidents'],
         ['Wikipedia:ANI', 'Wikipedia:Administrators%27_noticeboard/Incidents'],
         ['Wikipedia:BN', 'Wikipedia:Bureaucrats%27_noticeboard'],
         ['Wikipedia:BNB', 'Wikipedia:Bureaucrats%27_noticeboard']]

The results were:


http://en.wikipedia.org/wiki/Wikipedia:AN/I
18 December 2012 at 22:20.<br
mw22 in 0.196 secs. --
http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/Incidents
31 December 2012 at 00:32.<br
srv210 in 0.177 secs.
http://en.wikipedia.org/wiki/Wikipedia:ANI
30 December 2012 at 19:32.<br
srv196 in 0.175 secs.
http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/Incidents
31 December 2012 at 00:32.<br
srv210 in 0.177 secs.
http://en.wikipedia.org/wiki/Wikipedia:BN
29 December 2012 at 06:37.<br
mw52 in 1.415 secs. --
http://en.wikipedia.org/wiki/Wikipedia:Bureaucrats%27_noticeboard
31 December 2012 at 00:32.<br
mw37 in 0.202 secs. --
http://en.wikipedia.org/wiki/Wikipedia:BNB
31 December 2012 at 00:32.<br
srv229 in 0.153 secs.
http://en.wikipedia.org/wiki/Wikipedia:Bureaucrats%27_noticeboard
31 December 2012 at 00:32.<br
mw37 in 0.202 secs. --

Full script and output here: http://p.defau.lt/?BW9Xkb3KfYjiUhFQ88Zegw.
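
For readers who cannot reach the paste, the kind of check described above can be approximated as follows. This is a sketch and not the original script: the regular expressions, the function names, and the assumption that the server name appears as "Served by NAME in N secs" in the page source are all illustrative. Fetching the pages (logged out) is deliberately left out so the example runs offline.

```python
import re
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Assumed footer formats, based on the text quoted in this thread.
MODIFIED_RE = re.compile(
    r"This page was last modified on (\d{1,2} \w+ \d{4}) at (\d{2}:\d{2})"
)
SERVED_RE = re.compile(r"Served by (\S+) in [\d.]+ secs")


def extract_footer(html: str) -> Tuple[Optional[datetime], Optional[str]]:
    """Return (last-modified time, serving host) found in the page HTML."""
    modified = None
    m = MODIFIED_RE.search(html)
    if m:
        # %B expects English month names (the default C locale).
        modified = datetime.strptime(
            f"{m.group(1)} {m.group(2)}", "%d %B %Y %H:%M"
        )
    s = SERVED_RE.search(html)
    return modified, (s.group(1) if s else None)


def redirect_lag(redirect_html: str, target_html: str) -> Optional[timedelta]:
    """How far the cached redirect copy lags behind its target, if known."""
    redirect_time, _ = extract_footer(redirect_html)
    target_time, _ = extract_footer(target_html)
    if redirect_time is None or target_time is None:
        return None
    return target_time - redirect_time
```

A positive lag for a redirect and its target, both fetched while logged out, matches the symptom reported here: the redirect is being served from a stale cached copy.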

aaron added a comment. Via Conduit, Jan 3 2013, 12:57 AM

The wmf patch that is applied to master to make wmf branches causes this breakage. It's removed in https://gerrit.wikimedia.org/r/#/c/42055/1.

A proper core patch is in https://gerrit.wikimedia.org/r/#/c/42061/.

Bawolff added a comment. Via Conduit, Jan 3 2013, 11:24 AM

(In reply to comment #5)

> The wmf patch that is applied to master to make wmf branches causes this
> breakage. It's removed in https://gerrit.wikimedia.org/r/#/c/42055/1.
>
> A proper core patch is in https://gerrit.wikimedia.org/r/#/c/42061/.

Yay! Out of curiosity, any idea roughly how large $wgMaxBacklinksInvalidate is going to be set to on Wikimedia wikis?

MZMcBride added a comment. Via Conduit, Jan 3 2013, 11:53 PM

(In reply to comment #5)

> A proper core patch is in https://gerrit.wikimedia.org/r/#/c/42061/.

I don't understand this patch. From what I can tell, it seems to completely skip HTML cache invalidation if there are a lot of backlinks, but that wouldn't make any sense.
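
To make the reading in the comment above concrete, here is a minimal sketch of "skip invalidation when there are too many backlinks". It only models the behaviour as described in this comment and is not taken from the Gerrit change; the function names are hypothetical, the cap is passed in as a plain parameter standing in for $wgMaxBacklinksInvalidate, and whether the real comparison is strict or inclusive is an assumption.

```python
from typing import Callable, List, Sequence


def should_invalidate(backlink_count: int, max_backlinks: int) -> bool:
    """True if the backlink set is small enough to invalidate.

    Assumption: a count equal to the cap is still allowed.
    """
    return backlink_count <= max_backlinks


def invalidate_backlinks(
    backlinks: Sequence[str],
    max_backlinks: int,                       # stand-in for the wiki setting
    invalidate: Callable[[str], None],        # hypothetical per-page callback
) -> List[str]:
    """Invalidate every backlink, or none at all if the cap is exceeded.

    Returns the list of titles that were actually invalidated.
    """
    if not should_invalidate(len(backlinks), max_backlinks):
        # Over the cap: the whole update is skipped, so cached copies
        # of these pages stay stale until they expire or are purged.
        return []
    for title in backlinks:
        invalidate(title)
    return list(backlinks)
```

Under this reading the cap is all-or-nothing, which is exactly the concern raised in the comment: heavily used templates would never have their dependent pages refreshed by this path.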

aaron added a comment. Via Conduit, Jan 11 2013, 5:53 AM

(In reply to comment #6)

> (In reply to comment #5)
> > The wmf patch that is applied to master to make wmf branches causes this
> > breakage. It's removed in https://gerrit.wikimedia.org/r/#/c/42055/1.
> >
> > A proper core patch is in https://gerrit.wikimedia.org/r/#/c/42061/.
>
> Yay! Out of curiosity, any idea roughly how large $wgMaxBacklinksInvalidate
> is going to be set to on Wikimedia wikis?

This has now been merged and deployed (with the old patch removed).
