
Purge Varnish cache when a banner is saved
Closed, Resolved · Public · 2 Story Points

Description

We often see caching delays when trying to preview recently edited banners using the ?banner=... parameter. This is especially pronounced on mobile. It would be helpful and allow us to develop faster if there was some method to bypass or purge the caching when previewing, maybe another URL parameter.

(At the moment we can often bypass the cache by adding a random uselang=.. parameter to the URL, but this is a bit hacky and produces noise in the logs. Plus it has to be changed every time the banner is edited.)
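As a rough illustration of the workaround described above, here is a minimal sketch that rewrites a preview URL so its uselang value changes on every call. It assumes the front-end cache keys on the full query string; the example.org URL, the banner name, and the helper name `add_cache_buster` are placeholders, not anything from CentralNotice.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url: str, param: str = "uselang") -> str:
    """Return `url` with `param` set to a fresh random value."""
    parts = urlsplit(url)
    # Drop any previous value of the parameter, keep everything else.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, secrets.token_hex(4)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical preview URL; the banner name is a placeholder.
preview = add_cache_buster("https://example.org/wiki/Main_Page?banner=ExampleBanner")
```

Each call yields a different URL, which is exactly why the trick has to be repeated after every edit and why it adds noise to the logs.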

Event Timeline

Pcoombe created this task. Jan 10 2017, 12:22 AM
Restricted Application added a subscriber: Aklapper. Jan 10 2017, 12:22 AM

For testing purposes, does it help if you log in before previewing the banner on mobile? If so, would that solve the testing use case?

In any case, this shouldn't be hard to implement... I think we also talked about purging banner content from the front-end cache ahead of the December fundraiser... For emergencies, there's a script we can run on production. What about instead of (or in addition to) a URL parameter, a control in the CN admin UI to purge a banner from the cache in all languages?

AndyRussG set the point value for this task to 2. Jan 17 2017, 8:53 PM
ggellerman triaged this task as Medium priority. Jan 17 2017, 10:19 PM

> What about instead of (or in addition to) a URL parameter, a control in the CN admin UI to purge a banner from the cache in all languages?

This is what @Pcoombe is asking for. Let's discuss this option.

@AndyRussG and I talked. He will put the clear cache button near the preview link in the banner editing page.

If it's easy, return the user to the same UI with a message "cache cleared"

Thanks @AndyRussG, this sounds good.

Here's another idea, maybe simpler: what about just automatically purging the cache for any banner after it's edited? This would be in line with what we do for normal pages when they're edited.

@AndyRussG this was my first suggestion in our chat last week. You indicated it wasn't easy. If this really is simpler, then let's do it. Reducing clicks and UI is always good.

@DStrine aaarg maybe I misunderstood? /me scapegoats Hangout robots...

Well, let's proceed with the new suggestion. Thanks for working on this!

Change 336237 had a related patch set uploaded (by AndyRussG):
[WIP] Purge banner content from front-end cache on banner save

https://gerrit.wikimedia.org/r/336237

AndyRussG renamed this task from "Method to bypass/purge CentralNotice cache when forcing banner" to "Purge Varnish cache when a banner is saved". Feb 16 2017, 1:39 AM

Change 338160 had a related patch set uploaded (by AndyRussG):
Add $wgCentralSelectedMobileBannerDispatcher global for mobile

https://gerrit.wikimedia.org/r/338160

Change 338160 merged by jenkins-bot:
Add $wgCentralSelectedMobileBannerDispatcher global for mobile

https://gerrit.wikimedia.org/r/338160

AndyRussG edited projects, added Traffic; removed MediaWiki-Cache. Mar 10 2017, 7:28 PM
Restricted Application added a project: Operations. Mar 10 2017, 7:28 PM

@BBlack, @ema, hi! Would it be possible to maybe get your input on the first patch attached to this task?

Banners are always fetched via Special:BannerLoader on meta (on desktop and mobile URLs). The calls have a few URL parameters that determine the banner, language, and some other details.

The patch runs through all the expected permutations and sends the purge via CdnCacheUpdate. If I'm understanding the workings of this correctly, it seems on our setup this goes through CdnCacheUpdate::HTCPPurge(), which sends URLs to vhtcpd.

The patch might purge 3000 or more URLs for each banner save, to go through all permutations of URL param values. One question is: is this OK for vhtcpd and downstream bits of the Varnish infrastructure?
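To make the scale concrete, here is a minimal sketch of how a permutation over URL parameters multiplies out. All hosts, parameter names, and value lists below are hypothetical stand-ins chosen so the total lands on 3000; they are not the parameters the actual patch iterates over.

```python
from itertools import product
from urllib.parse import urlencode

# Everything below is an illustrative assumption, not the real patch.
BASE_URLS = (
    "https://meta.example.org/wiki/Special:BannerLoader",    # desktop
    "https://meta.m.example.org/wiki/Special:BannerLoader",  # mobile
)
LANGUAGES = tuple(f"lang{i}" for i in range(500))  # stand-in language codes
EXTRA_VALUES = ("a", "b", "c")  # hypothetical third parameter


def banner_urls(banner, languages=LANGUAGES):
    """Yield one URL per (base URL, language, extra value) combination."""
    for base, lang, extra in product(BASE_URLS, languages, EXTRA_VALUES):
        query = urlencode({"banner": banner, "uselang": lang, "extra": extra})
        yield f"{base}?{query}"


urls = list(banner_urls("ExampleBanner"))
print(len(urls))  # 2 * 500 * 3 = 3000
```

Because the count is a product of the dimension sizes, restricting any one dimension (for example, passing a single language) shrinks the purge list proportionally.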

To get a rough idea of how often saves usually happen, you can look at banner content logs. I guess it might be anywhere from 50-200 times a day on weekdays, maybe a little more during some periods?

Another question is: is there any way to do this more efficiently with wildcards or regexes? As far as I can tell, Varnish can only do that for bans, and what we want here is a purge, which is what our infrastructure handles, anyway... Is that correct?

BTW, if you think it's necessary that we significantly rework the patch (for example, doing the purges in smallish batches via a background job) we're certainly open to that! Just trying not to break anything here... Thanks much in advance!!!! :)
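The batching idea mentioned above could look roughly like the following. This is only a sketch of the chunking step, with a hypothetical helper name and placeholder URLs; how each chunk would be queued as a background job is left out because the thread does not specify it.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(urls: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` URLs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(urls)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical usage: each chunk would be handed to one background job.
example = [f"https://example.org/purge-me/{i}" for i in range(7)]
chunks = list(batched(example, 3))
```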

ema added a comment. Mar 13 2017, 12:44 PM

> The patch might purge 3000 or more URLs for each banner save, to go through all permutations of URL param values. One question is: is this OK for vhtcpd and downstream bits of the Varnish infrastructure?

Uhm, 3K purges 50-200 times a day seem too many for banner updates. How about reducing the TTL through the Cache-Control header instead?
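For reference, a minimal sketch of what a shorter TTL via Cache-Control might look like. The helper name and the TTL values are placeholders; how CentralNotice actually sets its headers is not shown in this thread.

```python
def cache_control(shared_ttl: int, browser_ttl: int = 0) -> str:
    """Build a Cache-Control value for a publicly cacheable response.

    s-maxage overrides max-age for shared caches such as Varnish, so the
    two lifetimes can be tuned independently.
    """
    if shared_ttl < 0 or browser_ttl < 0:
        raise ValueError("TTLs must be non-negative")
    return f"public, s-maxage={shared_ttl}, max-age={browser_ttl}"


# Placeholder value: keep the response in the shared cache for 5 minutes.
headers = {"Cache-Control": cache_control(shared_ttl=300)}
```

The trade-off is that a lower TTL shortens the wait after an edit but raises the request rate reaching the backend, whereas a purge only affects the objects that changed.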

> Another question is: is there any way to do this more efficiently with wildcards or regexes? As far as I can tell, Varnish can only do that for bans, and what we want here is a purge, which is what our infrastructure handles, anyway... Is that correct?

That's correct, PURGEs are issued against specific URLs and do not support wildcards/regexes. We've started working on XKey support, which would help a lot here, but are currently swamped and have no hard timeline for full production support yet.
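To illustrate the one-request-per-URL shape described above, here is a small sketch that builds (but does not send) a PURGE request with Python's standard library. The URL is a placeholder, and this is not how MediaWiki delivers purges in this setup, where they go out through CdnCacheUpdate and vhtcpd as described earlier in the thread.

```python
from urllib.request import Request


def build_purge(url: str) -> Request:
    """Build (but do not send) an HTTP PURGE request for one exact URL."""
    return Request(url, method="PURGE")


# Placeholder URL: the full query string is part of what gets purged.
req = build_purge(
    "https://example.org/wiki/Special:BannerLoader?banner=ExampleBanner&uselang=en"
)
```

Since the request names one concrete URL, a banner cached under N parameter combinations needs N such requests.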

ema moved this task from Triage to Caching on the Traffic board. Mar 14 2017, 11:55 AM

> Uhm, 3K purges 50-200 times a day seem too many for banner updates. How about reducing the TTL through the Cache-Control header instead?

OK, thanks for the input! The default cache time is already pretty low (10 minutes)...

@Pcoombe, @DStrine, what if we limit the automatic post-save purge to the language of the user who edited the banner? That would reduce the number of URLs to purge by an order of magnitude or so (so, hopefully OK for our infrastructure) and might cover most use cases--what do you think?

We could then add a control in the UI to purge banners for specific languages on demand. What do you think?

> PURGEs are issued against specific URLs and do not support wildcards/regexes. We've started working on XKey support, which would help a lot here, but are currently swamped and have no hard timeline for full production support yet.

Cool! Sounds fun!! :)

@AndyRussG That sounds fine to me.

I'm cool with this too. thanks!

ema added a comment. Mar 16 2017, 11:01 AM

If I understand the main issue at hand correctly, the goal here is to make sure that developers can quickly test their changes in production. Wouldn't it be better in general to do that in a development environment (beta labs or similar)?

Also, a simpler solution in prod would probably be setting Vary: Cookie in the response headers to make logged-in users bypass the cache. See the relevant varnish test case for an example of how that functionality works.

Ejegg added a subscriber: Ejegg. Mar 16 2017, 11:17 PM

Wouldn't Vary: Cookie explode caching all over the place due to things like the last visit date?

Change 343267 had a related patch set uploaded (by Ema):
[operations/puppet] cache_text varnishtest: 'Vary: Cookie' and Non-Session cookies

https://gerrit.wikimedia.org/r/343267

Change 343267 merged by Ema:
[operations/puppet] cache_text varnishtest: 'Vary: Cookie' and Non-Session cookies

https://gerrit.wikimedia.org/r/343267

ema added a comment. Mar 17 2017, 10:47 AM

> Wouldn't Vary: Cookie explode caching all over the place due to things like the last visit date?

Nope! The cache is bypassed only in case of requests with session/token cookies and responses with Vary: Cookie.

I've expanded the test case to also cover non-session cookies. Here is the VCL in case you're interested in the implementation details. :)
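The rule stated above (bypass only when the request carries a session/token cookie and the response has Vary: Cookie) can be sketched as a small decision function. This is a simplified model, not the production VCL: the cookie-name pattern and the example cookie names are assumptions.

```python
import re

# Assumption: session and token cookies are recognised by name; the real
# VCL may use a different pattern.
SESSION_COOKIE = re.compile(r"(session|token)=", re.IGNORECASE)


def bypasses_cache(request_cookie: str, response_vary: str) -> bool:
    """True only when both conditions from the comment above hold."""
    has_session = bool(SESSION_COOKIE.search(request_cookie))
    vary_fields = [field.strip().lower() for field in response_vary.split(",")]
    return has_session and "cookie" in vary_fields


# A non-session cookie (hypothetical name) does not trigger a bypass.
print(bypasses_cache("lastVisit=2017-03-16", "Cookie"))  # False
# A session cookie plus Vary: Cookie does.
print(bypasses_cache("exampleSession=abc", "Accept-Encoding, Cookie"))  # True
```

Under this model, anonymous traffic with only incidental cookies keeps hitting the cache, which is why the header does not fragment caching for ordinary readers.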

@DStrine @AndyRussG Can we prioritise working on this again once banner sequencing is done? It would be a considerable boost to our productivity if we can get faster previewing.

AndyRussG added a comment. Edited Jun 22 2017, 5:55 PM

The most recent version of the change in Gerrit now only purges URLs for the user's language.

@ema This reduces the number of URLs to purge from 3000 to 6. Is purging 6 URLs about 50-200 times a day OK for our infrastructure?
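For scale, a quick back-of-the-envelope calculation using only the figures quoted in this thread (3000 vs. 6 URLs per save, 50-200 saves a day); the function name is just for illustration.

```python
def daily_purges(urls_per_save: int, saves_per_day: int) -> int:
    """Total purge requests per day for a given save rate."""
    return urls_per_save * saves_per_day


for saves in (50, 200):
    print(saves, daily_purges(3000, saves), daily_purges(6, saves))
# 50 150000 300
# 200 600000 1200
```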

@Pcoombe @DStrine This means that Varnish will only be purged for the language configured in the account of the user saving the banner. While I think this may take care of a number of use cases, I imagine there will be many times when you wish to purge in a different language... Maybe for those situations, we could add some controls to purge for one specific language at a time?

Thanks!!!

Change 336237 merged by jenkins-bot:
[mediawiki/extensions/CentralNotice@master] Purge banner content from front-end cache on banner save

https://gerrit.wikimedia.org/r/336237

@Pcoombe can you verify if this is working?

Pcoombe closed this task as Resolved. Aug 16 2017, 8:06 PM

Yes, and life is much easier. Thanks!