Move all purge traffic to kafka
Closed, Resolved · Public

Description

The kafka topic mediawiki.job.cdnPurge is currently receiving many (most?) purge messages. Talking with @Joe, I found out that not all the HTCP purges currently sent are also written to kafka.

To help with T133821, we would need to ensure that every time a multicast purge is sent, the purge is written to kafka as well.
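
One way to check the requirement above is to compare the URLs seen on each transport over the same time window. The following is a minimal sketch under that assumption; the function name `missing_from_kafka` and the sample URLs are hypothetical and not part of any existing tool.

```python
from collections import Counter


def missing_from_kafka(htcp_urls, kafka_urls):
    """Return URLs purged via HTCP more often than they appear in Kafka.

    Both arguments are iterables of URL strings captured over the same
    time window (how they are captured is out of scope here).
    """
    # Counter subtraction keeps only positive counts, so a URL purged
    # twice over HTCP but written once to Kafka is still reported.
    deficit = Counter(htcp_urls) - Counter(kafka_urls)
    return sorted(deficit)


# Hypothetical sample data for illustration only.
htcp = [
    "https://example.org/wiki/A",
    "https://example.org/wiki/B",
    "https://example.org/wiki/B",
]
kafka = ["https://example.org/wiki/B"]
print(missing_from_kafka(htcp, kafka))
```

An empty result would suggest that every multicast purge in the window also reached kafka.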

Details

Project                         Branch       Lines +/-
operations/mediawiki-config     master       +0 -9
operations/mediawiki-config     master       +0 -25
operations/mediawiki-config     master       +3 -24
mediawiki/extensions/EventBus   master       +2 -1
operations/mediawiki-config     master       +3 -4
operations/puppet               production   +5 -5
operations/mediawiki-config     master       +14 -19
operations/mediawiki-config     master       +10 -2
operations/mediawiki-config     master       +1 -2
operations/puppet               production   +4 -0
operations/mediawiki-config     master       +1 -1
operations/mediawiki-config     master       +12 -6
mediawiki/extensions/EventBus   master       +89 -23
operations/mediawiki-config     master       +12 -0
operations/deployment-charts    master       +6 -0
operations/mediawiki-config     master       +9 -0
mediawiki/extensions/EventBus   master       +131 -4
mediawiki/core                  master       +372 -0

Event Timeline

ema created this task. Apr 21 2020, 8:26 AM
Restricted Application added a project: Operations. Apr 21 2020, 8:26 AM
Restricted Application added a subscriber: Aklapper.
ema moved this task from Triage to Watching on the Traffic board. Apr 21 2020, 8:29 AM
Pchelolo added a subscriber: Pchelolo. Edited · Apr 21 2020, 8:20 PM

AFAIK cdnPurgeJob is only involved when a delayed purge is required, i.e. when the reboundDelay option is set. For every rebound purge job there's an instant purge multicast.

Is the request here to move all purges onto the jobqueue, regardless of whether they're delayed vs immediate?
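
To make the distinction concrete, here is a minimal sketch of the behaviour described above: every purge is sent immediately, and a delayed repeat is scheduled only when a rebound delay is configured. The function `plan_purges` and its return shape are hypothetical illustrations, not MediaWiki's implementation.

```python
def plan_purges(urls, rebound_delay=0):
    """Return (immediate, delayed) purge plans for a list of URLs.

    immediate: URLs to purge right away (the instant multicast).
    delayed:   (url, delay_seconds) pairs to purge again later; only
               populated when rebound_delay is greater than zero.
    """
    immediate = list(urls)
    if rebound_delay > 0:
        delayed = [(url, rebound_delay) for url in urls]
    else:
        delayed = []
    return immediate, delayed


# Hypothetical URL and delay, for illustration only.
now, later = plan_purges(["https://example.org/wiki/A"], rebound_delay=20)
print(now, later)
```

Under this model, a consumer that reads only the delayed jobs would miss every purge issued while no rebound delay is set.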

holger.knust triaged this task as Medium priority. Apr 21 2020, 8:25 PM
holger.knust moved this task from Inbox to Tracking/Watching on the Core Platform Team board.
ema added a comment. Apr 29 2020, 8:23 AM

> AFAIK cdnPurgeJob is only involved when a delayed purge is required, i.e. when the reboundDelay option is set. For every rebound purge job there's an instant purge multicast.
>
> Is the request here to move all purges onto the jobqueue, regardless of whether they're delayed vs immediate?

@Joe might be able to answer this question? Traffic would like to be able to read from kafka the same purges we currently read as multicast HTCP packets, that's what I know. :)

Krinkle added a subscriber: Krinkle.

> The kafka topic mediawiki.job.cdnPurge is currently receiving many (most?) purge messages.

Maybe most by volume, but it's semantically very different, and whether a purge does or doesn't involve a job is a rather internal detail. The job spec is also considered internal to MW, and nothing else should rely on it.

In 2016, as part of T97562 and T125138, an event relay channel was set up for this in MediaWiki:

Change 261595 merged by jenkins-bot:
Make CDN purges send EventRelayer events

https://gerrit.wikimedia.org/r/261595

Change 289010 merged by jenkins-bot:
Convert CdnCacheUpdate to event per URL

https://gerrit.wikimedia.org/r/289010

This still exists today and is ready to be subscribed to, e.g. by EventBus to send things to Kafka, or by anything else we want to do with it.

There is also an EventRelayerKafka class in MediaWiki that would talk to Kafka directly. It was originally created for broadcasting WANObjectCache purge events, before we decided to go with mcrouter.

The current event producer is in CdnCacheUpdate.php.
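
The relay pattern described above can be sketched as a tiny publish/subscribe model: the producer emits one event per URL on a named channel, and whatever is subscribed decides where the event goes. This is a simplified illustration only. The `EventRelayer` class below, its `subscribe` method, the channel name, and the event fields are all assumptions made for the example and do not mirror MediaWiki's actual classes.

```python
import json


class EventRelayer:
    """Hypothetical in-process stand-in for an event relay channel."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, channel, callback):
        # Register a callable that receives every event sent on `channel`.
        self._subscribers.setdefault(channel, []).append(callback)

    def notify(self, channel, event):
        # Deliver one event to all subscribers of the channel.
        for callback in self._subscribers.get(channel, []):
            callback(event)


def emit_cdn_purges(relayer, urls, timestamp):
    # One event per URL, so subscribers never have to unpack batches.
    for url in urls:
        relayer.notify("cdn-url-purges", {"url": url, "timestamp": timestamp})


# A subscriber that serializes events; a real one would hand the
# payload to a Kafka producer instead of appending to a list.
outbox = []
relayer = EventRelayer()
relayer.subscribe(
    "cdn-url-purges",
    lambda event: outbox.append(json.dumps(event, sort_keys=True)),
)
emit_cdn_purges(
    relayer,
    ["https://example.org/wiki/A", "https://example.org/wiki/B"],
    1591639200,
)
print(outbox)
```

The point of the indirection is that the producer does not need to know whether a purge ends up as a multicast packet, a Kafka message, or nothing at all.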

Change 596463 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Add EventBus backed EventRelayer for CDN purges.

https://gerrit.wikimedia.org/r/596463

Change 596721 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/core@master] WIP demo: DI for cdn purges

https://gerrit.wikimedia.org/r/596721

Change 599150 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Enable kafka purges production on group0 wikis

https://gerrit.wikimedia.org/r/599150

Change 596463 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Add EventBus backed EventRelayer for CDN purges.

https://gerrit.wikimedia.org/r/596463

Change 599150 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable kafka purges production on group0 wikis

https://gerrit.wikimedia.org/r/599150

Change 602125 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/deployment-charts@master] EventGate-main: allow resource-purge to be produced

https://gerrit.wikimedia.org/r/602125

Change 602125 merged by Ottomata:
[operations/deployment-charts@master] EventGate-main: allow resource-purge to be produced

https://gerrit.wikimedia.org/r/602125

Change 603514 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Disable HTCP purges for test.wikipedia.org

https://gerrit.wikimedia.org/r/603514

Change 603530 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Beta: Switch from HTCP purging to kafka purging

https://gerrit.wikimedia.org/r/603530

Change 603514 merged by jenkins-bot:
[operations/mediawiki-config@master] Disable HTCP purges for test.wikipedia.org

https://gerrit.wikimedia.org/r/603514

Mentioned in SAL (#wikimedia-operations) [2020-06-08T18:20:29Z] <catrope@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Disable HTCP purges for testwiki (T250781) (part 1) (duration: 00m 59s)

Mentioned in SAL (#wikimedia-operations) [2020-06-08T18:23:13Z] <catrope@deploy1001> Synchronized wmf-config/CommonSettings.php: Disable HTCP purges for testwiki (T250781) (part 2) (duration: 00m 56s)

Change 603649 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] [No-op]: Add precautions for kafka-purges before transition

https://gerrit.wikimedia.org/r/603649

Change 603654 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Enable kafka purges everywhere.

https://gerrit.wikimedia.org/r/603654

Change 603655 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Disable HTCP purges where kafka purges are enabled

https://gerrit.wikimedia.org/r/603655

Change 603675 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Allow accepting multiple event types in EnableEventBus variable.

https://gerrit.wikimedia.org/r/603675

Change 603675 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Allow accepting multiple event types in EnableEventBus variable.

https://gerrit.wikimedia.org/r/603675

Change 603649 merged by jenkins-bot:
[operations/mediawiki-config@master] [No-op]: Add precautions for kafka-purges before transition

https://gerrit.wikimedia.org/r/603649

Change 603654 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable kafka purges everywhere.

https://gerrit.wikimedia.org/r/603654

Change 604430 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] cache: make upload consume purges from kafka

https://gerrit.wikimedia.org/r/604430

Change 604430 merged by Ema:
[operations/puppet@production] cache: make upload consume purges from kafka

https://gerrit.wikimedia.org/r/604430
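
For readers unfamiliar with what consuming purges from kafka involves on the cache side, the core step is turning each message into a URL to purge. The sketch below is hypothetical: it assumes a JSON event whose URL sits at `meta.uri`, which is an assumption about the message layout rather than something stated in this task, and `extract_purge_url` is an illustrative name, not a function of purged.

```python
import json


def extract_purge_url(raw_message):
    """Return the URL to purge from one Kafka message, or None.

    Assumes (hypothetically) a JSON event whose URL lives at meta.uri.
    """
    try:
        event = json.loads(raw_message)
    except (ValueError, TypeError):
        return None  # not JSON: skip rather than crash the consumer
    if not isinstance(event, dict):
        return None
    meta = event.get("meta")
    if not isinstance(meta, dict):
        return None
    uri = meta.get("uri")
    return uri if isinstance(uri, str) and uri else None


# Hypothetical message payload, for illustration only.
print(extract_purge_url(b'{"meta": {"uri": "https://example.org/wiki/A"}}'))
```

Returning `None` for malformed input lets a consumer loop count and skip bad messages instead of stopping.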

Mentioned in SAL (#wikimedia-operations) [2020-06-10T16:13:00Z] <ema> correction: restart purged on all *cache_upload* hosts to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604430/ T250781 T133821

Change 603655 merged by jenkins-bot:
[operations/mediawiki-config@master] Disable HTCP purges where kafka purges are enabled

https://gerrit.wikimedia.org/r/603655
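
The ordering of these config changes matters: HTCP should only be switched off for a wiki once kafka purges are on for it, otherwise that wiki gets no purges at all. A minimal sketch of that check follows. The function names and the wiki sets are hypothetical and do not reflect the actual configuration variables or their state at any point in time.

```python
def purge_transports(wiki, kafka_enabled, htcp_disabled):
    """Return the set of purge transports active for one wiki."""
    transports = set()
    if wiki in kafka_enabled:
        transports.add("kafka")
    if wiki not in htcp_disabled:
        transports.add("htcp")
    return transports


def wikis_without_purges(all_wikis, kafka_enabled, htcp_disabled):
    """List wikis that would be left with no purge transport at all."""
    return sorted(
        wiki for wiki in all_wikis
        if not purge_transports(wiki, kafka_enabled, htcp_disabled)
    )


# Illustrative sets only, not the real rollout state.
wikis = {"testwiki", "enwiki", "wikitech"}
kafka_on = {"testwiki", "enwiki"}

# HTCP is switched off only where kafka purges are already on.
print(wikis_without_purges(wikis, kafka_on, htcp_disabled={"testwiki"}))

# HTCP is also switched off for a wiki that has no kafka purges yet.
print(wikis_without_purges(wikis, kafka_on, htcp_disabled={"testwiki", "wikitech"}))
```

Running a check like this before each step would flag any wiki that is about to lose both transports.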

Big wikis are now done, but there's still a bit of long-tail work left.

Change 604469 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Use EventRelayerNull for wikitech kafka purges

https://gerrit.wikimedia.org/r/604469

Change 604469 merged by jenkins-bot:
[operations/mediawiki-config@master] Use EventRelayerNull for wikitech kafka purges

https://gerrit.wikimedia.org/r/604469

Change 604743 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] purged: make Kafka cluster name configurable

https://gerrit.wikimedia.org/r/604743

Change 603530 merged by jenkins-bot:
[operations/mediawiki-config@master] Beta: Switch from HTCP purging to kafka purging

https://gerrit.wikimedia.org/r/603530

Change 604743 merged by Ema:
[operations/puppet@production] purged: make Kafka cluster name configurable

https://gerrit.wikimedia.org/r/604743

Change 607298 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] EventBus: Emit kafka purges for everything

https://gerrit.wikimedia.org/r/607298

Change 607300 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Mark kafka purge events as EVENT_PURGE type

https://gerrit.wikimedia.org/r/607300

Change 607298 merged by jenkins-bot:
[operations/mediawiki-config@master] EventBus: Emit kafka purges for everything

https://gerrit.wikimedia.org/r/607298

Change 607300 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Mark kafka purge events as EVENT_PURGE type

https://gerrit.wikimedia.org/r/607300

Change 607590 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Enable kafka purges on wikitech

https://gerrit.wikimedia.org/r/607590

Change 607593 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Disable HTCP purging everywhere

https://gerrit.wikimedia.org/r/607593

Change 607596 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Cleanup: remove temporary wmgDisableHTCP variable

https://gerrit.wikimedia.org/r/607596

Change 607590 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable kafka purges on wikitech

https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/607590

ayounsi added a subscriber: ayounsi. Thu, Jul 2, 9:29 AM

Change 607593 merged by jenkins-bot:
[operations/mediawiki-config@master] Disable HTCP purging everywhere

https://gerrit.wikimedia.org/r/607593

Mentioned in SAL (#wikimedia-operations) [2020-07-08T18:17:19Z] <ppchelko@deploy1001> Synchronized wmf-config/reverse-proxy.php: Disable HTCP purging everywhere gerrit:607593 T250781 reverse-proxy.php (duration: 01m 04s)

Mentioned in SAL (#wikimedia-operations) [2020-07-08T18:18:41Z] <ppchelko@deploy1001> Synchronized wmf-config/wikitech.php: Disable HTCP purging everywhere gerrit:607593 T250781 wikitech.php (duration: 01m 04s)

Mentioned in SAL (#wikimedia-operations) [2020-07-08T18:20:20Z] <ppchelko@deploy1001> Synchronized wmf-config/CommonSettings.php: Disable HTCP purging everywhere gerrit:607593 T250781 CS.php (duration: 01m 03s)

Change 607596 merged by jenkins-bot:
[operations/mediawiki-config@master] Cleanup: remove temporary wmgDisableHTCP variable

https://gerrit.wikimedia.org/r/607596

Mentioned in SAL (#wikimedia-operations) [2020-07-08T18:27:12Z] <ppchelko@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Cleanup: remove temporary wmgDisableHTCP variable gerrit:607596 T250781 IS.php (duration: 01m 01s)

Pchelolo closed this task as Resolved. Wed, Jul 8, 6:27 PM
Pchelolo claimed this task.

After deploying the latest config changes and some cleanups, all purges now go via Kafka and zero HTCP packets are received by purged: https://grafana.wikimedia.org/d/RvscY1CZk/purged?orgId=1&from=1594230979757&to=1594232779757&var-datasource=esams%20prometheus%2Fops&var-cluster=cache_text&var-instance=cp3050