
CDN cache revalidation on several wikis for desktop improvements deployment pt 2
Closed, Resolved · Public

Description

Background

Similar to T256750, we will be deploying the first major changes of the desktop improvements project to a set of test wikis. For this deployment, we would like to minimize the number of pages seen with the old design. After a conversation with the Traffic team, it was decided that the best way forward would be to turn caching off for these wikis for the period right before and after the deployment.

Open questions

  • Which of these wikis will we be able to do this for safely: Turkish Wikipedia, Serbian Wikipedia, Korean Wikipedia, Portuguese Wikipedia, Bengali Wikipedia, German Wikivoyage, Venetian Wikipedia?

Deployment date

TBD, week of March 1

Event Timeline

This seems pretty straightforward operationally; I think we can replicate the techniques used in T256750 for more wikis in general.

I think all of these wikis can be invalidated like this, although a couple of them (e.g. Portuguese Wikipedia) are big enough in terms of traffic volume that we might need to take some care to avoid overlapping other risks, and do them separately from the rest.

Is it realistic to break up the set into smaller chunks and invalidate them serially (could be on the same day, assuming all goes smoothly), or does it have to be all of these at once for validity of your statistics?

That sounds perfect, thank you! And yes - we can do it in stages (either same day or 1-2 days apart for each set), with ptwiki having its own deployment.

jbond triaged this task as Medium priority. Feb 23 2021, 12:14 PM

@BBlack - From our side, we are ready to start scheduling these. How does the following sound:

  • Monday, March 8 - Turkish Wikipedia, Serbian Wikipedia, Korean Wikipedia, Bengali Wikipedia, German Wikivoyage, Venetian Wikipedia
  • Tuesday, March 9 - Portuguese Wikipedia

We can also break out some of the other large wikis (Korean or Turkish) into their own deployments if necessary.

@ovasileva Yes, that plan seems reasonable!

Just to be sure we're on the same page on the details and I haven't made some silly errors in translation: the hostnames to effectively purge are the desktop versions of all the named wikis, which are:

Mar 8:
Turkish Wikipedia: tr.wikipedia.org
Korean Wikipedia: ko.wikipedia.org
Serbian Wikipedia: sr.wikipedia.org
Bengali Wikipedia: bn.wikipedia.org
German Wikivoyage: de.wikivoyage.org
Venetian Wikipedia: vec.wikipedia.org

Mar 9:
Portuguese Wikipedia: pt.wikipedia.org

Recent 1-week history in webrequest 1/128 says that, roughly, ptwiki is ~0.36% of all traffic, while the other 6 combined add up to ~0.32%, so this seems like a good split!
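The split estimate above comes from 1/128-sampled webrequest data. As a sketch of the extrapolation arithmetic (the sampled row counts below are invented for illustration; only the ~0.36% / ~0.32% split comes from the comment above):

```python
# Estimate each wiki's share of total traffic from 1/128-sampled
# webrequest rows. Sampled counts are hypothetical, chosen so the
# shares come out to the ~0.36% / ~0.32% figures quoted above.
SAMPLE_RATE = 128

sampled_rows = {
    "pt.wikipedia.org": 36_000,    # hypothetical sampled rows for ptwiki
    "other_6_combined": 32_000,    # hypothetical rows for the other 6 wikis
    "everything_else": 9_932_000,  # hypothetical rest of sampled traffic
}

# Scale every bucket by the sampling rate to estimate real volumes.
total_estimated = sum(sampled_rows.values()) * SAMPLE_RATE

def share_pct(bucket: str) -> float:
    """Estimated percentage of all traffic for one bucket."""
    return 100.0 * sampled_rows[bucket] * SAMPLE_RATE / total_estimated
```

Note the sampling rate cancels out when computing shares; it only matters when estimating absolute request volumes.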

Yes, these are correct

Great, thank you!
Last time, we scheduled a meeting with a member from Traffic during the deployment window and did the purge and deploy at the same time. Should we go ahead and do that this time as well?

Could this possibly turn out to be a solution for the issue described at T119366?

Change 669840 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/puppet@production] ATS: force cache revalidation for 6 wikis

https://gerrit.wikimedia.org/r/669840

Change 669840 merged by BBlack:
[operations/puppet@production] ATS: force cache revalidation for 7 wikis

https://gerrit.wikimedia.org/r/669840
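Forcing revalidation (rather than an outright purge) keeps cached objects in place but treats anything cached before a cutoff as stale, so each object gets re-checked against the applayer before being served again. A minimal model of that logic (this is an illustrative sketch, not the actual ATS configuration from the patch):

```python
import time

# Objects cached before this cutoff must be revalidated with the
# origin before being served. In the real deployment the cutoff
# corresponds to the moment the new design shipped (illustrative here).
REVALIDATE_BEFORE = time.time()

class CachedObject:
    def __init__(self, body: str, etag: str, fetched_at: float):
        self.body = body
        self.etag = etag
        self.fetched_at = fetched_at

def serve(obj: CachedObject, origin_etag: str) -> str:
    """Serve from cache, revalidating objects older than the cutoff."""
    if obj.fetched_at >= REVALIDATE_BEFORE:
        return obj.body                    # cached after cutoff: serve as-is
    if origin_etag == obj.etag:            # conditional check came back 304
        obj.fetched_at = time.time()       # validated: reuse cached body
        return obj.body
    return f"<re-fetched body for {origin_etag}>"  # 200: origin has new content
```

The upside over a hard purge is that unchanged objects cost only a cheap conditional request instead of a full re-fetch, which is why the cache graphs barely moved.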

Mentioned in SAL (#wikimedia-operations) [2021-03-08T19:37:09Z] <bblack> cp-text: banning varnish-fe for req.http.host == ( 7 wikis from T274784 )
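The ban in the SAL entry above matches requests on `req.http.host`. As an illustrative sketch of what the host-matching side of such a ban expression involves (the regex and the `varnishadm` invocation in the comment are written by hand here, not taken from the actual rollout tooling):

```python
import re

# Desktop hostnames for the 7 wikis listed earlier in this task.
HOSTS = [
    "tr.wikipedia.org", "ko.wikipedia.org", "sr.wikipedia.org",
    "bn.wikipedia.org", "de.wikivoyage.org", "vec.wikipedia.org",
    "pt.wikipedia.org",
]

# Build one anchored alternation. Dots must be escaped so that, e.g.,
# "trXwikipedia.org" cannot accidentally match "tr.wikipedia.org".
HOST_RE = "^(" + "|".join(re.escape(h) for h in HOSTS) + ")$"

# A ban using this expression might then be issued on each frontend with
# something like (hypothetical invocation, not the exact command used):
#   varnishadm ban 'req.http.host ~ "<HOST_RE>"'

def is_banned(host: str) -> bool:
    """Return True if a request Host header would match the ban."""
    return re.match(HOST_RE, host) is not None
```

A Varnish ban marks matching objects as unusable lazily, as they are next requested, which is gentler on the backends than synchronously purging everything at once.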

^ There was a last-minute change of plans, so we made the call to spend a little of our overcautiousness budget and do all 7 wikis at once (as opposed to ptwiki separately from the other 6).

Varnish-level graphs seem to have barely noticed that we did anything at all; all seems well!

ovasileva claimed this task.

Good news, thanks again @BBlack!

Change 825692 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[operations/puppet@production] Revert "ATS: force cache revalidation for 7 wikis"

https://gerrit.wikimedia.org/r/825692

Change 825692 merged by Vgutierrez:

[operations/puppet@production] Revert "ATS: force cache revalidation for 7 wikis"

https://gerrit.wikimedia.org/r/825692