Page MenuHomePhabricator

Stop sending purges for `action=history` for linked pages.
Closed, ResolvedPublic

Description

A good chunk of our purges (actually, about half of them) are related to a page history (action=history). This happens because we call Title::getCdnUrls in HtmlCacheUpdater:

https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/master/includes/cache/HtmlCacheUpdater.php#114

That function simply adds action=history to the urls to purge, unconditionally

https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/master/includes/Title.php#3553

this means that we send a purge request for the page history even for linked page purges, not just for direct edits, unless I'm missing some detail. Inspection of the PURGE requests arriving on varnish seem to confirm my hypothesis.

Now, when a page's cache gets invalidated because of a change in a linked page, the page's own history doesn't change, and thus we send a purge for no good reason.

We might be able to reduce the purge rate at the edge by ~ 40% if we stop sending purges for linked pages, and also improve the cache hit rate for the history pages..

Event Timeline

ema added a subscriber: ema.
fgiunchedi triaged this task as Medium priority.Apr 15 2020, 2:47 PM

The same also applies to the action=raw&ctype=text/javascript variants for CSS/JS pages. These make up a much smaller option of the purge volume, so in terms of production load it doesn't matter, but it shows that there are more than just the history use case in terms of CDN urls we purge on-edit that are not affected by re-parse updates due to updated blue/red links, files, templates or wikibase statements etc.

I propose we add a separate Title method for the subset of URLs that need purging for link updates (in other words, urls that reflect the result of re-parses, basically just HTML page views). The existing Title::getCdnUrls method would merge in that method's return value and upon direct edit we'll continue to use that method, and also anything third-party not yet aware of this distinction can continue to use that for compat.

But the purges from recursive link update etc. can call the new method instead. This is going to be tricky to handle right now I suspect because it is just not the url listing where we lack this distinction. The relevant purge methods, deferred updates and job classes also lack this distinction.

The HtmlCacheUpdater service class was introduced this week to help consolidate logic related to this (change 528924. That might be a good place to handle this.

I propose we add a separate Title method for the subset of URLs that need purging for link updates (in other words, urls that reflect the result of re-parses, basically just HTML page views).

Let's please not put more stuff into Title, I spend half my time moving things out of there :)

I was thinking we could have a service object that is responsible for turning titles into URLs (and lists of URLs) for different purposes. Not sure what that class should be called. PageAddressThings. PageUrlFrobniz.

I would frankly prefer to pass a flag to getCdnUrls, and return those dependent urls only if the flag has its default value. I say this because it won't make Title significantly more heavy, and at the same time it will fix *quickly* a problem we have in production.

We can decide to restructure/refactor the code once we've got the quick and overall clean win here. It means adding one parameter and one if to a function that's been lingering around since 2004.

An idea: Title::getCdnUrls() can be moved to HtmlCacheUpdater (recently introduced class).

Joe wrote:

I would frankly prefer to pass a flag to getCdnUrls, and return those dependent urls only if the flag has its default value.

Ladsgroup wrote:

An idea: Title::getCdnUrls() can be moved to HtmlCacheUpdater (recently introduced class).

I'd be fine with either, let's just not introduce new stuff into Title for code to bind to, which we then have to deprecate again in the next release :)

Alternatively, if all else fails: add methods, mark them @internal.

Change 589421 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] HtmlCacheUpdater: Add getUrls() method to support selective purging

https://gerrit.wikimedia.org/r/589421

Change 589425 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] HTMLCacheUpdateJob: Reduce recursive purging to PURGE_URLS_FOR_LINKUPDATE

https://gerrit.wikimedia.org/r/589425

Change 589661 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] title: Add unit tests for Title::getCdnUrls()

https://gerrit.wikimedia.org/r/589661

Change 589665 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] title: Remove broken handling of language variant in getCdnUrls()

https://gerrit.wikimedia.org/r/589665

Change 589661 merged by jenkins-bot:
[mediawiki/core@master] title: Add unit tests for Title::getCdnUrls()

https://gerrit.wikimedia.org/r/589661

Change 589665 merged by jenkins-bot:
[mediawiki/core@master] title: Remove broken handling of language variant in getCdnUrls()

https://gerrit.wikimedia.org/r/589665

Change 589421 merged by jenkins-bot:
[mediawiki/core@master] HtmlCacheUpdater: Add getUrls() method and support selective purging

https://gerrit.wikimedia.org/r/589421

Change 589425 merged by jenkins-bot:
[mediawiki/core@master] HTMLCacheUpdateJob: Enable PURGE_URLS_LINKSUPDATE_ONLY to reduce purges

https://gerrit.wikimedia.org/r/589425

This change was released to production to all wikis yesterday.

The effect can be seen in this 12h moving average of purge rates:

purges-text-12h.png (344×1 px, 30 KB)

so there was a very significant reduction in the number of purges we send.

Krinkle claimed this task.
Krinkle moved this task from Untriaged to Ready for write-up on the Performance-Team-publish board.