After work on T340880, we now consume the hourly partitions that are generated from the mediawiki.revision-visibility-change event stream. The Airflow job that consumes them recently stalled waiting on partition datacenter=eqiad/year=2023/month=9/day=17/hour=4. This partition never materialized because the stream had no events during that hour.
In a recent Slack thread, we discovered that canary events are not being produced for mediawiki.revision-visibility-change:
  'mediawiki.revision-visibility-change' => [
      'schema_title' => 'mediawiki/revision/visibility-change',
      'destination_event_service' => 'eventgate-main',
      'canary_events_enabled' => false,
  ],
As per @Ottomata, this stream predates the introduction of canary events, so they were never enabled for it. For T340880 to work reliably, we now need canary events to be produced for this stream.
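To illustrate why an empty hour stalls the job, here is a toy sketch (not the actual ingestion code). It assumes partitions are derived purely from the timestamps of the events that arrive, and that one canary event is injected per hour; the function name `hourly_partitions`, the sample events, and the use of `meta.dt` as the partitioning timestamp are made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def hourly_partitions(events):
    """Group events by (year, month, day, hour).

    An hour with no events produces no key at all, which models a
    partition that never materializes.
    """
    partitions = defaultdict(list)
    for event in events:
        ts = datetime.fromisoformat(event["meta"]["dt"])
        partitions[(ts.year, ts.month, ts.day, ts.hour)].append(event)
    return dict(partitions)


# Hypothetical real traffic: nothing happens during hour 4.
real_events = [
    {"meta": {"dt": "2023-09-17T03:15:00+00:00", "domain": "en.wikipedia.org"}},
    {"meta": {"dt": "2023-09-17T05:40:00+00:00", "domain": "en.wikipedia.org"}},
]

# Hypothetical canary events: one per hour, regardless of real traffic.
canary_events = [
    {"meta": {"dt": f"2023-09-17T{hour:02d}:00:00+00:00", "domain": "canary"}}
    for hour in (3, 4, 5)
]

without_canaries = hourly_partitions(real_events)
with_canaries = hourly_partitions(real_events + canary_events)
```

In this model, `without_canaries` has no entry for hour 4, while `with_canaries` does, so a job waiting on that hour can proceed.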
In this task we want to:
- Enable canary events for mediawiki.revision-visibility-change for the reliable consumption of the downstream HDFS table.
- Announce the change and explain how to filter out canary events, since this stream is likely consumed by others.
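For the announcement, a consumer-side filter could look roughly like the sketch below. It assumes canary events are marked with `meta.domain` set to `canary`, which should be confirmed against the Event Platform documentation before it is announced; the helper names `is_canary` and `drop_canaries` and the sample events are hypothetical.

```python
def is_canary(event: dict) -> bool:
    """Return True if the event looks like a canary event.

    Assumption: canary events carry meta.domain == "canary".
    """
    meta = event.get("meta") or {}
    return meta.get("domain") == "canary"


def drop_canaries(events):
    """Keep only the events that are not canaries, preserving order."""
    return [event for event in events if not is_canary(event)]


# Hypothetical sample: one real event, one canary, one event without meta.
sample = [
    {"meta": {"domain": "en.wikipedia.org"}, "rev_id": 1},
    {"meta": {"domain": "canary"}, "rev_id": 2},
    {"rev_id": 3},
]
filtered = drop_canaries(sample)
```

Events without a `meta` field are kept rather than dropped, so the filter only removes events that are explicitly marked as canaries.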