Switch webrequest dataset to feed from HAProxy instead of VarnishKafka
Closed, Resolved, Public

Event Timeline

Change #1131387 had a related patch set uploaded (by Joal; author: Joal):

[operations/puppet@production] Update analytics webrequest kafkatee

https://gerrit.wikimedia.org/r/1131387

Change #1131387 merged by Btullis:

[operations/puppet@production] Update analytics webrequest kafkatee

https://gerrit.wikimedia.org/r/1131387

Change #1131405 had a related patch set uploaded (by Joal; author: Joal):

[operations/puppet@production] Update hadoop-test webrequest gobblin/purge jobs

https://gerrit.wikimedia.org/r/1131405

Change #1131652 had a related patch set uploaded (by Joal; author: Joal):

[analytics/refinery@master] Add webrequest_frontend_test gobblin job

https://gerrit.wikimedia.org/r/1131652

Change #1131652 merged by Joal:

[analytics/refinery@master] Add webrequest_frontend_test gobblin job

https://gerrit.wikimedia.org/r/1131652

Change #1131405 merged by Btullis:

[operations/puppet@production] Update hadoop-test webrequest gobblin/purge jobs

https://gerrit.wikimedia.org/r/1131405

Change #1132663 had a related patch set uploaded (by Joal; author: Joal):

[operations/alerts@master] Update data-eng gobblin alert

https://gerrit.wikimedia.org/r/1132663

Change #1133068 had a related patch set uploaded (by Joal; author: Joal):

[analytics/refinery@master] Update webrequest schemas for HAProxy migration

https://gerrit.wikimedia.org/r/1133068

Change #1133068 merged by Joal:

[analytics/refinery@master] Update webrequest schemas for HAProxy migration

https://gerrit.wikimedia.org/r/1133068

I'm going to adjust retention of the following topics:

  • webrequest_text goes from 7d to 3d
  • webrequest_upload goes from 7d to 3d
  • webrequest_frontend_text goes from 3d to 7d
  • webrequest_frontend_upload goes from 3d to 7d

All good?
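
For reference, the plan above can be expressed in the millisecond unit that Kafka's topic-level `retention.ms` setting uses. This is a minimal Python sketch; the `days_to_retention_ms` helper and the `planned_retention_days` mapping are illustrative names, not part of any tooling mentioned in this task.

```python
from datetime import timedelta


def days_to_retention_ms(days: int) -> int:
    """Convert a retention period in days to milliseconds (the unit of retention.ms)."""
    # Dividing one timedelta by another with // yields an exact integer.
    return timedelta(days=days) // timedelta(milliseconds=1)


# Planned retention per topic, in days, as listed in the comment above.
planned_retention_days = {
    "webrequest_text": 3,
    "webrequest_upload": 3,
    "webrequest_frontend_text": 7,
    "webrequest_frontend_upload": 7,
}

for topic, days in planned_retention_days.items():
    print(f"{topic}: {days}d = {days_to_retention_ms(days)} ms")
```

Three days works out to 259200000 ms and seven days to 604800000 ms.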

brouberol@kafka-jumbo1014:~$ kafka configs --alter --entity-type topics --entity-name webrequest_frontend_text --delete-config retention.ms
kafka-configs --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --alter --entity-type topics --entity-name webrequest_frontend_text --delete-config retention.ms
Completed Updating config for entity: topic 'webrequest_frontend_text'.
brouberol@kafka-jumbo1014:~$ kafka topics --describe --topic webrequest_frontend_text | head
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --describe --topic webrequest_frontend_text
Topic:webrequest_frontend_text	PartitionCount:256	ReplicationFactor:3	Configs:message.timestamp.type=LogAppendTime
	Topic: webrequest_frontend_text	Partition: 0	Leader: 1012	Replicas: 1012,1013,1009	Isr: 1009,1012,1013
	Topic: webrequest_frontend_text	Partition: 1	Leader: 1014	Replicas: 1014,1010,1009	Isr: 1014,1010,1009

I deleted the retention.ms override on webrequest_frontend_text, which caused it to use the server-side default of

brouberol@kafka-jumbo1014:~$ sudo grep retention /etc/kafka/server.properties
log.retention.hours=168

which is 7d.
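
The reasoning here is that a topic-level `retention.ms` override, when present, takes precedence over the broker default, so deleting the override makes the topic fall back to `log.retention.hours`. A rough Python sketch of that lookup follows. The function name `effective_retention_ms` is hypothetical, and the sketch assumes the broker sets only `log.retention.hours` (as the grep output suggests); a broker that also sets `log.retention.ms` or `log.retention.minutes` would need those checked first.

```python
MS_PER_HOUR = 60 * 60 * 1000
MS_PER_DAY = 24 * MS_PER_HOUR


def effective_retention_ms(topic_configs: dict, broker_retention_hours: int) -> int:
    """Return the retention that applies to a topic, in milliseconds.

    A topic-level retention.ms override wins; otherwise the broker's
    log.retention.hours default is used (assumption: no other broker-level
    retention settings are configured).
    """
    override = topic_configs.get("retention.ms")
    if override is not None:
        return int(override)
    return broker_retention_hours * MS_PER_HOUR


# Topic configs once the retention.ms override has been deleted.
without_override = {"message.timestamp.type": "LogAppendTime"}
days = effective_retention_ms(without_override, 168) // MS_PER_DAY
print(f"effective retention: {days}d")
```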

brouberol@kafka-jumbo1014:~$ kafka configs --alter --entity-type topics --entity-name webrequest_frontend_upload --delete-config retention.ms
kafka-configs --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --alter --entity-type topics --entity-name webrequest_frontend_upload --delete-config retention.ms
Completed Updating config for entity: topic 'webrequest_frontend_upload'.

The webrequest_{text,upload} topics now have a retention of 3d.

brouberol@kafka-jumbo1014:~$ kafka configs --alter --entity-type topics --entity-name webrequest_text --add-config retention.ms=259200000
kafka-configs --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --alter --entity-type topics --entity-name webrequest_text --add-config retention.ms=259200000
Completed Updating config for entity: topic 'webrequest_text'.
brouberol@kafka-jumbo1014:~$ kafka configs --alter --entity-type topics --entity-name webrequest_upload --add-config retention.ms=259200000
kafka-configs --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --alter --entity-type topics --entity-name webrequest_upload --add-config retention.ms=259200000
Completed Updating config for entity: topic 'webrequest_upload'.
brouberol@kafka-jumbo1014:~$ kafka topics --describe --topic webrequest_text | head -n 4
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --describe --topic webrequest_text
Topic:webrequest_text	PartitionCount:24	ReplicationFactor:3	Configs:message.timestamp.type=LogAppendTime,retention.ms=259200000
	Topic: webrequest_text	Partition: 0	Leader: 1013	Replicas: 1013,1008,1011	Isr: 1011,1013,1008
	Topic: webrequest_text	Partition: 1	Leader: 1008	Replicas: 1008,1012,1015	Isr: 1012,1015,1008
brouberol@kafka-jumbo1014:~$ kafka topics --describe --topic webrequest_upload | head -n 4
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --describe --topic webrequest_upload
Topic:webrequest_upload	PartitionCount:24	ReplicationFactor:3	Configs:message.timestamp.type=LogAppendTime,retention.ms=259200000
	Topic: webrequest_upload	Partition: 0	Leader: 1010	Replicas: 1010,1014,1008	Isr: 1014,1010,1008
	Topic: webrequest_upload	Partition: 1	Leader: 1012	Replicas: 1012,1013,1007	Isr: 1012,1013,1007
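
The check above was done by eye on the `Configs:` field of the describe output. As a sketch of how the same check could be scripted, the Python snippet below parses a summary line in the format shown above. The `parse_topic_configs` helper is a hypothetical name, and the parsing assumes whitespace-separated fields with a comma-separated `Configs:` list, which may differ in other Kafka versions.

```python
def parse_topic_configs(describe_header: str) -> dict:
    """Extract the Configs field of a kafka-topics --describe summary line."""
    for field in describe_header.split():
        if field.startswith("Configs:"):
            raw = field[len("Configs:"):]
            # Each entry looks like key=value; skip anything malformed or empty.
            return dict(item.split("=", 1) for item in raw.split(",") if "=" in item)
    return {}


# Summary line copied from the webrequest_text output above.
header = (
    "Topic:webrequest_text\tPartitionCount:24\tReplicationFactor:3\t"
    "Configs:message.timestamp.type=LogAppendTime,retention.ms=259200000"
)

configs = parse_topic_configs(header)
three_days_ms = 3 * 24 * 60 * 60 * 1000
print("retention is 3d:", int(configs["retention.ms"]) == three_days_ms)
```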

Let the bonfire begin.

Change #1133084 had a related patch set uploaded (by Joal; author: Joal):

[analytics/refinery@master] Hotfix for webrequest migration

https://gerrit.wikimedia.org/r/1133084

Change #1133084 merged by Joal:

[analytics/refinery@master] Hotfix for webrequest migration

https://gerrit.wikimedia.org/r/1133084

Change #1132663 merged by jenkins-bot:

[operations/alerts@master] Update data-eng gobblin alert

https://gerrit.wikimedia.org/r/1132663

Change #1133847 had a related patch set uploaded (by Joal; author: Joal):

[operations/alerts@master] Update GobblinLastSuccessfulRunTooLongAgo

https://gerrit.wikimedia.org/r/1133847

Change #1133847 merged by jenkins-bot:

[operations/alerts@master] Update GobblinLastSuccessfulRunTooLongAgo

https://gerrit.wikimedia.org/r/1133847

Change #1187450 had a related patch set uploaded (by Joal; author: Joal):

[operations/puppet@production] Fix raw webrequest data purge job

https://gerrit.wikimedia.org/r/1187450

Change #1187450 merged by Stevemunene:

[operations/puppet@production] Fix raw webrequest data purge job

https://gerrit.wikimedia.org/r/1187450