
Configure purged in deployment-prep
Closed, Resolved · Public

Description

@Pchelolo mentioned on IRC that it would be useful to have purged working in deployment-prep. We need to make sure that purged is running instead of vhtcpd, and choose which Kafka topics it should subscribe to. For multicast HTCP purges, the existing configuration used by vhtcpd should work fine.

Event Timeline

ema created this task. Jun 9 2020, 7:10 AM
Restricted Application added a project: Operations. Jun 9 2020, 7:10 AM
Restricted Application added a subscriber: Aklapper.
ema triaged this task as Medium priority. Jun 9 2020, 7:11 AM
ema moved this task from Triage to Caching on the Traffic board.

Mentioned in SAL (#wikimedia-operations) [2020-06-09T07:11:45Z] <ema> deployment-cache-text06: stop vhtcpd, start purged T254844

ema added a comment. Jun 9 2020, 7:14 AM

purged is now running in deployment-prep instead of vhtcpd:

ema@deployment-cache-text06:~$ systemctl status purged.service 
● purged.service - Purger for ATS and Varnish
   Loaded: loaded (/lib/systemd/system/purged.service; static; vendor preset: enabled)
   Active: active (running) since Tue 2020-06-09 06:54:01 UTC; 18min ago
 Main PID: 25882 (purged)
    Tasks: 6 (limit: 4699)
   CGroup: /system.slice/purged.service
           └─25882 /usr/bin/purged -backend_addr 127.0.0.1:3128 -frontend_addr 127.0.0.1:3127 -mcast_addrs 239.128.0.112 -prometheus_addr :2112 -frontend_workers 4 -backend_workers 2

I have double-checked that multicast HTCP purges are working fine:

ema@deployment-cache-text06:~$ varnishncsa -n frontend -q 'ReqMethod eq "PURGE"'
127.0.0.1 - - [09/Jun/2020:07:12:18 +0000] "PURGE http://en.m.wikipedia.beta.wmflabs.org/wiki/TemplateUsageArticle736 HTTP/1.1" 204 0 "-" "purged"
127.0.0.1 - - [09/Jun/2020:07:12:18 +0000] "PURGE http://en.wikipedia.beta.wmflabs.org/wiki/TemplateUsageArticle736 HTTP/1.1" 204 0 "-" "purged"
127.0.0.1 - - [09/Jun/2020:07:12:18 +0000] "PURGE http://en.m.wikipedia.beta.wmflabs.org/w/index.php?title=TemplateUsageArticle736&action=history HTTP/1.1" 204 0 "-" "purged"
127.0.0.1 - - [09/Jun/2020:07:12:18 +0000] "PURGE http://en.wikipedia.beta.wmflabs.org/w/index.php?title=TemplateUsageArticle736&action=history HTTP/1.1" 204 0 "-" "purged"
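A check like the one above could also be scripted. The following is only a rough sketch, not something purged or Varnish ships: it parses lines in the NCSA-style format shown above and reports any PURGE that did not get a 204, which is the status seen in this log. The regular expression and the function name `failed_purges` are assumptions made for illustration.

```python
import re

# Matches the quoted request and the status code of an NCSA-style line, e.g.
#   ... "PURGE http://host/path HTTP/1.1" 204 0 "-" "purged"
LINE_RE = re.compile(
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]+)" (?P<status>\d{3}) '
)

def failed_purges(lines):
    """Return (url, status) for every PURGE line whose status is not 204."""
    failures = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None or m.group("method") != "PURGE":
            continue  # not a PURGE request, or not a parseable line
        status = int(m.group("status"))
        if status != 204:
            failures.append((m.group("url"), status))
    return failures

# Two lines copied from the varnishncsa output above.
sample = [
    '127.0.0.1 - - [09/Jun/2020:07:12:18 +0000] "PURGE http://en.m.wikipedia.beta.wmflabs.org/wiki/TemplateUsageArticle736 HTTP/1.1" 204 0 "-" "purged"',
    '127.0.0.1 - - [09/Jun/2020:07:12:18 +0000] "PURGE http://en.wikipedia.beta.wmflabs.org/wiki/TemplateUsageArticle736 HTTP/1.1" 204 0 "-" "purged"',
]
print(failed_purges(sample))
```

An empty list means every PURGE in the sample was answered with 204.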

@Pchelolo: let me know which kafka topics we should read from in deployment-prep!

Aklapper renamed this task from Configure purged in depoloyment-prep to Configure purged in deployment-prep. Jun 9 2020, 7:42 AM

@Pchelolo: let me know which kafka topics we should read from in deployment-prep!

@ema - it would be eqiad.resource-purge

Change 604072 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] purged: kafka topic configuration for beta

https://gerrit.wikimedia.org/r/604072

Change 604072 merged by Ema:
[operations/puppet@production] purged: kafka topic configuration for beta

https://gerrit.wikimedia.org/r/604072

Change 604743 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] purged: make Kafka cluster name configurable

https://gerrit.wikimedia.org/r/604743

Change 604790 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] purged: fix Kafka brokers TCP port if TLS is disabled

https://gerrit.wikimedia.org/r/604790

Change 604790 merged by Ema:
[operations/puppet@production] purged: fix Kafka brokers TCP port if TLS is disabled

https://gerrit.wikimedia.org/r/604790

ema added a comment. Jun 11 2020, 5:00 PM

I have cherry-picked https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604743/ on deployment-puppetmaster04 and added profile::cache::purge::kafka_cluster_name: main-deployment-prep to hieradata on Horizon.

Now the purged-kafka configuration on both deployment-cache-upload06 and deployment-cache-text06 looks good to me:

{
  "client.id":                "purged",
  "bootstrap.servers":        "deployment-kafka-main-1.deployment-prep.eqiad.wmflabs:9092,deployment-kafka-main-2.deployment-prep.eqiad.wmflabs:9092",
  "statistics.interval.ms":   60000,
  "compression.codec":        "snappy",
  "group.id":                 "deployment-cache-upload06",
  "go.events.channel.enable": true
}
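Since one of the patches above concerned the broker TCP port, it can help to list the host/port pairs that the generated config actually contains. The snippet below is a minimal sketch under two assumptions: the file is valid JSON as shown, and every `bootstrap.servers` entry carries an explicit port. The function name `broker_endpoints` is made up for this example.

```python
import json

def broker_endpoints(config_text):
    """Split the comma-separated bootstrap.servers value into (host, port) pairs."""
    config = json.loads(config_text)
    endpoints = []
    for entry in config["bootstrap.servers"].split(","):
        # rpartition splits on the last ':' so the port is always the tail.
        host, _, port = entry.strip().rpartition(":")
        endpoints.append((host, int(port)))
    return endpoints

# Trimmed copy of the config shown above.
config_text = """
{
  "client.id": "purged",
  "bootstrap.servers": "deployment-kafka-main-1.deployment-prep.eqiad.wmflabs:9092,deployment-kafka-main-2.deployment-prep.eqiad.wmflabs:9092",
  "group.id": "deployment-cache-upload06"
}
"""

for host, port in broker_endpoints(config_text):
    print(host, port)
```

An entry without a port would make `int(port)` raise a `ValueError`, which is acceptable for a quick sanity check but would need handling in anything more permanent.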

After restarting purged on both instances, I edited a page on en.wikipedia.beta.wmflabs.org, and the Kafka purges are indeed coming in:

ema@deployment-cache-text06:~$ curl -s localhost:2112/metrics | grep ^rdkafka_consumer_rxmsgs
rdkafka_consumer_rxmsgs{client_id="purged"} 34
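To confirm that the counter moves in response to an edit, rather than just being non-zero, one could compare two scrapes of the metrics endpoint. This is a sketch only: it assumes samples carry no trailing timestamp (as in the output above), and the "before" value in the example is invented for illustration.

```python
def read_samples(metrics_text):
    """Map each 'name{labels}' series to its float value.

    Assumes one sample per line with no trailing timestamp.
    """
    samples = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples

def counter_delta(before_text, after_text, series):
    """Difference of one counter between two scrapes."""
    return read_samples(after_text)[series] - read_samples(before_text)[series]

SERIES = 'rdkafka_consumer_rxmsgs{client_id="purged"}'

# The 'before' scrape is hypothetical; the 'after' value is the one shown above.
before = 'rdkafka_consumer_rxmsgs{client_id="purged"} 0\n'
after = 'rdkafka_consumer_rxmsgs{client_id="purged"} 34\n'

print(counter_delta(before, after, SERIES))
```

A positive delta after editing a page suggests that messages from the subscribed topic are being consumed.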

However, looking at the actual PURGE requests generated by purged, it seems that we're sending both Kafka and multicast purges only for MediaWiki, and not for RESTBase. Here's the count of PURGE requests received per URL:

1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/html/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/html/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n/429192
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/media-list/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/media-list/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n/429192
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-html/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-html/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n/429192
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-sections/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-sections/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n/429192
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-sections-lead/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-sections-lead/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n/429192
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-sections-remaining/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-sections-remaining/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n/429192
1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/summary/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n
2 http://en.m.wikipedia.beta.wmflabs.org/wiki/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n
2 http://en.m.wikipedia.beta.wmflabs.org/w/index.php?title=Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n&action=history
2 http://en.wikipedia.beta.wmflabs.org/wiki/Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n
2 http://en.wikipedia.beta.wmflabs.org/w/index.php?title=Conflict-title-0.17049093751716504-I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n&action=history
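The pattern in the list above (REST API URLs seen once, MediaWiki URLs seen twice) can be summarised mechanically. The sketch below assumes input lines of the form "COUNT URL" and treats any path under /api/rest_v1/ as RESTBase; both the classification rule and the shortened example URLs are assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def counts_by_origin(lines):
    """Group 'COUNT URL' lines by origin and collect the distinct counts seen."""
    seen = defaultdict(set)
    for line in lines:
        count, url = line.split(None, 1)
        path = urlsplit(url.strip()).path
        origin = "restbase" if path.startswith("/api/rest_v1/") else "mediawiki"
        seen[origin].add(int(count))
    return dict(seen)

# Shortened stand-ins for the URLs listed above.
sample = [
    "1 http://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/summary/Example",
    "2 http://en.wikipedia.beta.wmflabs.org/wiki/Example",
    "2 http://en.wikipedia.beta.wmflabs.org/w/index.php?title=Example&action=history",
]
print(counts_by_origin(sample))
```

If each transport delivers a given purge once, a count of 1 suggests the purge arrived over only one of the two transports, and a count of 2 suggests it arrived over both.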

ema added a comment. Jun 11 2020, 5:20 PM

However, looking at the actual PURGE requests generated by purged, it seems that we're sending both Kafka and multicast purges only for MediaWiki, and not for RESTBase. Here's the count of PURGE requests received per URL:

Correction: it is the other way round. RESTBase purges are coming in via Kafka, but not via multicast. See P11473 for a kafkacat capture including RESTBase purges.

Change 604743 merged by Ema:
[operations/puppet@production] purged: make Kafka cluster name configurable

https://gerrit.wikimedia.org/r/604743

ema closed this task as Resolved. Jun 12 2020, 8:46 AM
ema claimed this task.

Both deployment-cache-text06 and deployment-cache-upload06 are now reading purges from Kafka. Closing!