
Create pipelines for late/spurious/failed events
Closed, ResolvedPublic

Description

As a production service, the flink streaming updater job needs somewhere to send events that are late/failed/spurious. Currently these go to HDFS, but apps in the Kubernetes cluster can't talk to the Analytics cluster. There needs to be a Kafka bridge that the rdf-streaming-updater can send events to, so that those events eventually end up in HDFS.

Acceptance Criteria:
Kafka topics are set up for late/failed/spurious events
Those same events end up in HDFS
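For illustration, a minimal sketch of how a Flink job can tag such events using side outputs, so that each category can later be shipped to its own Kafka topic. The tag names, routing conditions, and class names below are placeholders, not the actual rdf-streaming-updater code:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputSketch {
    // OutputTag must be created as an anonymous subclass so Flink can capture the generic type.
    static final OutputTag<String> LATE = new OutputTag<String>("late-events") {};
    static final OutputTag<String> FAILED = new OutputTag<String>("failed-events") {};

    static SingleOutputStreamOperator<String> split(DataStream<String> events) {
        return events.process(new ProcessFunction<String, String>() {
            @Override
            public void processElement(String event, Context ctx, Collector<String> out) {
                if (event.contains("\"late\"")) {          // placeholder condition
                    ctx.output(LATE, event);
                } else if (event.contains("\"failed\"")) { // placeholder condition
                    ctx.output(FAILED, event);
                } else {
                    out.collect(event);                    // normal output
                }
            }
        });
    }
    // later: split(events).getSideOutput(LATE).addSink(...)  // e.g. a Kafka sink per topic
}
```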

Event Timeline

@Ottomata also suggested via IRC that we consider using the event platform instead of plain Kafka.

Change 647723 had a related patch set uploaded (by DCausse; owner: DCausse):
[schemas/event/secondary@master] Add rdf-streaming-updater schemas for side outputs

https://gerrit.wikimedia.org/r/647723

Change 649715 had a related patch set uploaded (by DCausse; owner: DCausse):
[wikidata/query/rdf@master] [WIP] Add json encoders for side output events

https://gerrit.wikimedia.org/r/649715

Thanks!
It worked like a charm, but I still had to pull com.github.java-json-tools:json-schema-validator:2.2.14 to do schema validation. Would it make sense to add some helper functions to wikimedia-event-utilities for validating a JSON event against its schema? The use case is a unit test to make sure that the JSON produced is compliant with the schema it references.
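As an illustration, a minimal sketch of that kind of unit test with json-schema-validator 2.2.14. The schema resource path, schema title, and event payload are made up for the example:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.github.fge.jackson.JsonLoader;
import com.github.fge.jsonschema.core.report.ProcessingReport;
import com.github.fge.jsonschema.main.JsonSchema;
import com.github.fge.jsonschema.main.JsonSchemaFactory;
import org.junit.Test;

import static org.junit.Assert.assertTrue;

public class SideOutputEventSchemaTest {
    @Test
    public void producedEventIsValidAgainstItsSchema() throws Exception {
        // Hypothetical path: the event schema bundled as a test resource.
        JsonNode schemaNode = JsonLoader.fromResource("/schemas/lapsed_action/1.0.0.json");
        JsonSchema schema = JsonSchemaFactory.byDefault().getJsonSchema(schemaNode);

        // Hypothetical event produced by the JSON encoder under test.
        JsonNode event = JsonLoader.fromString(
            "{\"$schema\": \"/rdf_streaming_updater/lapsed_action/1.0.0\","
            + " \"meta\": {\"stream\": \"rdf-streaming-updater.lapsed-action\"}}");

        ProcessingReport report = schema.validate(event);
        assertTrue(report.toString(), report.isSuccess());
    }
}
```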

would it make sense to add some helper functions to wikimedia-event-utilities for validating a JSON event against its schema

Sure that could be useful :)

@dcausse, will these be POSTed to an EventGate, or produced directly to Kafka?

I plan to POST them to EventGate using a very naive SinkFunction and see how it behaves, but I can push directly to Kafka if that's preferable.
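Roughly what such a naive sink could look like (assumed EventGate URL, plain HttpURLConnection, no retries or batching; not the actual implementation):

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

/** Very naive sink that POSTs each JSON event to EventGate (sketch, not production code). */
public class EventGateHttpSink extends RichSinkFunction<String> {
    // Assumed endpoint; the real service URL / discovery name may differ.
    private static final String EVENTGATE_URI = "https://eventgate-main.example.org/v1/events";

    @Override
    public void invoke(String jsonEvent, Context context) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(EVENTGATE_URI).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(jsonEvent.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        if (status >= 300) {
            // No retries or batching here, which is what makes this sink "naive".
            throw new RuntimeException("EventGate returned HTTP " + status);
        }
    }
}
```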

It depends on what you want to do :) EventGate will handle multi-DC routing, filling in some default values, and topic prefixes for you, but it is an extra hop to Kafka. As a prod system in a language with a good Kafka client, producing to Kafka directly is totally allowed. You'd be the first main user of the event platform not going through EventGate, but it is definitely something we want to support.

Perhaps we'll want to build some of the logic EventGate handles (as a proxy) into wikimedia-event-utilities (including validation, as you suggested).

I think the most important thing for me is that these events get stored in HDFS properly, and I'm not sure what the requirements are for that. I went with EventGate because I know it works :)

But using the Flink Kafka producers makes more sense: it's robust and easy to use, but I was worried about missing important parts of the process. If this kind of use case is something we'd like to support then I think I should go with this approach. I created T270371 to discuss the tools that wikimedia-event-utilities should provide to help with this use case.

T270371 is great thank you!

If this kind of use case is something we'd like to support

We do. EventGate is a poor substitute for a full-fledged Kafka client. Kafka has so many more features and knobs to dial, and for prod apps those are going to be important.
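As a rough sketch, producing the side-output events straight to Kafka from Flink could look something like this (Flink 1.12-era FlinkKafkaProducer API; the topic name and broker list are placeholders):

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class SideOutputKafkaSinkSketch {
    static void attachKafkaSink(DataStream<String> lateEvents) {
        Properties props = new Properties();
        // Placeholder brokers; the real job would point at kafka main-eqiad / main-codfw.
        props.setProperty("bootstrap.servers", "kafka-main1001.example.org:9092");

        FlinkKafkaProducer<String> producer = new FlinkKafkaProducer<>(
            "eqiad.rdf-streaming-updater.lapsed-action",   // placeholder topic name
            new SimpleStringSchema(),                       // events are already JSON strings here
            props);                                         // this constructor gives at-least-once delivery

        lateEvents.addSink(producer);
    }
}
```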

Change 647723 merged by Ottomata:
[schemas/event/secondary@master] Add rdf-streaming-updater schemas for side outputs

https://gerrit.wikimedia.org/r/647723

Change 661727 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] [wdqs] Add flink sideoutput stream definitions

https://gerrit.wikimedia.org/r/661727

Mentioned in SAL (#wikimedia-operations) [2021-02-08T15:11:47Z] <ottomata> set kafka topic retention to 31 days for (eqiad|codfw.rdf-streaming-updater.mutation) in kafka main-eqiad and main-codfw - T269619

Mentioned in SAL (#wikimedia-analytics) [2021-02-08T15:11:53Z] <ottomata> set kafka topic retention to 31 days for (eqiad|codfw.rdf-streaming-updater.mutation) in kafka main-eqiad and main-codfw - T269619
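For reference, 31 days corresponds to retention.ms = 2678400000. Programmatically, an equivalent change via the Kafka AdminClient would look roughly like this (broker and topic names are illustrative; the actual change was made operationally):

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class SetTopicRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-main1001.example.org:9092"); // placeholder broker

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(
                ConfigResource.Type.TOPIC, "eqiad.rdf-streaming-updater.mutation");
            // 31 days in milliseconds
            AlterConfigOp setRetention = new AlterConfigOp(
                new ConfigEntry("retention.ms", "2678400000"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(
                Collections.singletonMap(topic, Collections.singletonList(setRetention)))
                 .all().get();
        }
    }
}
```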

Change 649715 merged by jenkins-bot:
[wikidata/query/rdf@master] Add json encoders for side output events

https://gerrit.wikimedia.org/r/649715

Change 661727 merged by jenkins-bot:
[operations/mediawiki-config@master] [wdqs] Add flink sideoutput stream definitions

https://gerrit.wikimedia.org/r/661727

Mentioned in SAL (#wikimedia-operations) [2021-02-10T12:26:44Z] <dcausse@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T269619: [wdqs] Add flink sideoutput stream definitions (duration: 01m 06s)

Change 663219 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Do not produce canary events for rdf-streaming-updater streams

https://gerrit.wikimedia.org/r/663219

Change 663219 merged by jenkins-bot:
[operations/mediawiki-config@master] Do not produce canary events for rdf-streaming-updater streams

https://gerrit.wikimedia.org/r/663219

Mentioned in SAL (#wikimedia-operations) [2021-02-10T15:26:25Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Do not produce canary events for rdf-streaming-updater streams - T269619 (duration: 01m 13s)

Change 668119 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Set canary_events_enabled: true for rdf-streaming-updater streams

https://gerrit.wikimedia.org/r/668119

Change 668119 merged by jenkins-bot:
[operations/mediawiki-config@master] Set canary_events_enabled: true for rdf-streaming-updater streams

https://gerrit.wikimedia.org/r/668119

@Gehel @dcausse these events are now in HDFS. There aren't any Hive tables yet because no non-canary events have yet been ingested, and we filter out canary events from the Hive tables.