
CDN cache revalidation on several wikis for desktop improvements deployment
Closed, Resolved · Public

Description

Background

We will be deploying the first major changes of the desktop improvements project to a set of test wikis. For this deployment, we would like to minimize the number of pages seen with the old design. After a conversation with the traffic team, it was decided that the best way forward would be to turn caching off for these wikis for the period right before and after the deployment.

Open questions

  • Which of these wikis will we be able to do this for safely: hewiki, euwiki, frwiktionary, ptwikiversity, fawiki?
  • Is it possible to do this in two stages (euwiki, frwiktionary, and ptwikiversity first, followed by hewiki and fawiki)?
  • How many days prior to the deployment should we disable caching?

Deployment date

July 22, 2020

Event Timeline

Task description

[…] on test wikis […]

  • Which of these wikis […] hewiki, euwiki, frwiktionary, ptwikiversity, fawiki?
  • Is it possible to […] (euwiki, frwiktionary, and ptwikiversity first, followed by hewiki and fawiki)?

Test wikis are test.wikipedia.org and test2.wikipedia.org. The above-mentioned wikis, however, are all tier-1 production wikis. Disabling the CDN cache on any of these is, imho, not a viable option under any circumstances for any duration of time.

If I understand correctly, the need is to purge the cache of one entire wiki at once, after a particular configuration change is made to the skin. This capability exists in Varnish. There is no need to prevent populating the cache for a week to have it be empty the next week. It can be emptied when you need it to be emptied, with cache protections working both before and after.

The Varnish concept for this purpose is known as a cache ban, which can be based on timestamps, URLs or hostnames. In this case you'd use the hostname. I'll move this to our radar and tag Traffic as they operate our CDN infrastructure. Given the tight deadline, you'll probably want to reach out to them by other means sooner than that.
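To illustrate the ban concept described above, here is a minimal Python sketch of a toy cache that supports host-based bans. It is not Varnish's implementation and not what Traffic ran; `ToyCache`, `ban_host`, and `ban_expression` are hypothetical names made up for this example. The only Varnish-specific piece is the expression string, which follows the `req.http.host == "..."` form accepted by Varnish's `ban` command.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class CachedObject:
    host: str
    body: str
    stored_at: int  # logical timestamp at which the object entered the cache


@dataclass
class ToyCache:
    """Hypothetical toy cache; only models the idea of a host-based ban."""
    objects: Dict[Tuple[str, str], CachedObject] = field(default_factory=dict)
    bans: List[Tuple[int, str]] = field(default_factory=list)  # (issued_at, host)
    clock: int = 0

    def _tick(self) -> int:
        self.clock += 1
        return self.clock

    def store(self, host: str, url: str, body: str) -> None:
        self.objects[(host, url)] = CachedObject(host, body, self._tick())

    def ban_host(self, host: str) -> None:
        # A ban does not disable caching: it only marks objects stored
        # before this point, for this host, as no longer usable.
        self.bans.append((self._tick(), host))

    def lookup(self, host: str, url: str) -> Optional[str]:
        obj = self.objects.get((host, url))
        if obj is None:
            return None
        for issued_at, banned_host in self.bans:
            if banned_host == obj.host and obj.stored_at < issued_at:
                del self.objects[(host, url)]  # treated as a cache miss
                return None
        return obj.body


def ban_expression(host: str) -> str:
    """Build a host-matching expression in Varnish ban syntax."""
    return f'req.http.host == "{host}"'
```

In this sketch, objects for the banned host that were stored before the ban become misses, objects for other hosts are untouched, and anything stored after the ban is served from cache as usual. That is the property the comment relies on: the cache keeps protecting the origin both before and after the invalidation.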

Krinkle renamed this task from Turn off cache for up to one week on test wikis for desktop improvements deployment to Turn off CDN cache for up to one week on several wikis for desktop improvements deployment. Jul 7 2020, 6:55 PM
Krinkle removed Krinkle as the assignee of this task.
Krinkle moved this task from Limbo to Watching on the Performance-Team (Radar) board.
Krinkle subscribed.

@Krinkle for background, this ticket comes from a meeting with @BBlack, who suggested the idea as being feasible. The proposal was to disable caching for a period of time before the change so that the caches would be clear, not to disable it indefinitely.

As @Krinkle pointed out, instead of turning off caching altogether we can invalidate the objects cached both on CDN backends (Apache Traffic Server) and frontends (Varnish) for the wikis in question. The concept is known in Varnish terminology as banning, and we can do something similar for ATS too.

The idea is that, after the desktop improvement changes are deployed, we can instruct the CDN to replace cached responses for requests matching a given pattern (for example Host: pt.wikiversity.org) with newly generated content. With the new CDN architecture based on ATS backends we now have one single cache layer behind the frontends, making this operation simpler than before, as we do not have to worry about races between data centers and thus about invalidating things in DC-tier order (eqiad/codfw first, edges later). The only ordering to take into consideration now is that ATS backends must be invalidated before Varnish frontends.
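The ordering constraint can be illustrated with a small Python simulation of a two-layer cache. This is a simplified sketch under stated assumptions: each layer is a plain dictionary that is emptied wholesale, a single request arrives between the two invalidations, and the names `Layer`, `build_stack`, and `simulate` are hypothetical. It does not model the real ATS or Varnish behaviour.

```python
class Layer:
    """Hypothetical cache layer that fetches from an upstream on a miss."""

    def __init__(self, name, upstream):
        self.name = name
        self.upstream = upstream  # callable: url -> body
        self.store = {}

    def get(self, url):
        if url not in self.store:
            self.store[url] = self.upstream(url)
        return self.store[url]

    def invalidate(self):
        self.store.clear()


def build_stack(origin):
    # The frontend sits in front of the backend, which sits in front of the origin.
    backend = Layer("ats-backend", lambda url: origin["skin"])
    frontend = Layer("varnish-frontend", backend.get)
    return frontend, backend


def simulate(order):
    """Invalidate the two layers in the given order, with one request in between.

    Returns (response seen between the invalidations, response seen afterwards).
    """
    origin = {"skin": "old"}
    frontend, backend = build_stack(origin)
    frontend.get("/wiki/A")        # warm both layers with the old design
    origin["skin"] = "new"         # the deployment happens

    layers = {"frontend": frontend, "backend": backend}
    first, second = order
    layers[first].invalidate()
    between = frontend.get("/wiki/A")  # a request racing with the invalidation
    layers[second].invalidate()
    after = frontend.get("/wiki/A")
    return between, after
```

With the backend invalidated first, the racing request still gets the old page from the frontend, but once the frontend is invalidated every later request is refilled from a backend that no longer holds old content. With the frontend invalidated first, the racing request repopulates the frontend with the old page still held by the backend, and that stale copy survives the second invalidation.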

@ovasileva let's coordinate on IRC (channel #wikimedia-traffic) during EU working hours.

ema renamed this task from Turn off CDN cache for up to one week on several wikis for desktop improvements deployment to CDN cache revalidation on several wikis for desktop improvements deployment. Jul 8 2020, 8:00 AM

@ema - apologies for the late response - we had some blockers arise this week with the deployment and will need to change the date to the following Wednesday, July 22. Can switch to IRC to discuss timing further as well.

Change 615446 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: force cache revalidation on a few selected wikis

https://gerrit.wikimedia.org/r/615446

Change 615446 merged by Ema:
[operations/puppet@production] ATS: force cache revalidation on a few selected wikis

https://gerrit.wikimedia.org/r/615446

Mentioned in SAL (#wikimedia-operations) [2020-07-22T12:01:44Z] <ema> A:cp-text varnish ban frwiktionary T256750

Mentioned in SAL (#wikimedia-operations) [2020-07-22T12:05:37Z] <ema> A:cp-text varnish ban ptwikiversity T256750

euwiki, frwiktionary, and ptwikiversity done today. All good.

Change 616614 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] Revert "ATS: force cache revalidation on a few selected wikis"

https://gerrit.wikimedia.org/r/616614

Change 616614 merged by Ema:
[operations/puppet@production] Revert "ATS: force cache revalidation on a few selected wikis"

https://gerrit.wikimedia.org/r/616614

Change 616726 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: force fawiki, hewiki cache revalidation

https://gerrit.wikimedia.org/r/616726

Change 616726 merged by Ema:
[operations/puppet@production] ATS: force cache revalidation on a few wikis

https://gerrit.wikimedia.org/r/616726

Mentioned in SAL (#wikimedia-operations) [2020-07-28T11:25:12Z] <ema> A:cp-text varnish ban fa.wikipedia.org T256750

Mentioned in SAL (#wikimedia-operations) [2020-07-28T11:32:31Z] <ema> A:cp-text varnish ban he.wikipedia.org T256750

Mentioned in SAL (#wikimedia-operations) [2020-07-28T11:34:33Z] <ema> A:cp-text varnish ban eu.wikipedia.org T256750

Mentioned in SAL (#wikimedia-operations) [2020-07-28T11:36:40Z] <ema> A:cp-text varnish ban fr.wiktionary.org T256750

Mentioned in SAL (#wikimedia-operations) [2020-07-28T11:38:15Z] <ema> A:cp-text varnish ban pt.wikiversity.org T256750

@ovasileva: shall we close this now? If other invalidations are needed in the future to further extend the deployment we can file other tasks.

Change 618294 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] Revert "ATS: force cache revalidation on a few wikis"

https://gerrit.wikimedia.org/r/618294

ovasileva claimed this task.

Sounds good to me, thanks @ema!

Change 618294 merged by Ema:
[operations/puppet@production] Revert "ATS: force cache revalidation on a few wikis"

https://gerrit.wikimedia.org/r/618294

Restricted Application added a subscriber: Huji. Mar 2 2021, 5:40 PM