
Allow disabling/enabling configured streams via wgEventStreams config
Open, High, Public

Description

Right now, the way to do partial (per-wiki) deployment of streams is potentially confusing and messy. "Potentially" because we don't yet have enough anecdotal evidence either way. Over time, as the Event Platform is used more for instrumentation and developers find the pain points in their workflow and in their instruments' lifecycle, we'll be able to see whether partial deployment of streams is an area for improvement.

The best approach currently is to configure the stream in default/metawiki, and then register it with EventLogging (via $wgEventLoggingStreamNames) only on the specific wikis you want it enabled on. (See https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Per-wiki_configurations for more details.) Alternatively, it's possible to deploy a stream everywhere with sampling.rate: 0.0 and set sampling.rate: 1.0 on the specific wikis you want it enabled on, but that is worse for performance.

If it turns out that the EventLogging registration approach is too confusing in practice, we could add a universally recognized setting to disable a stream (that is, one that EventLogging's submit() and the Event Platform Clients for iOS/Android have logic to handle). Something like enabled: false or is_active: false.
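A minimal sketch of how a client might honor such a setting, assuming the stream configs are available as a plain dictionary keyed by stream name. The function names, the `send` callback, and the choice to treat a missing `enabled` key as enabled are all hypothetical and are not taken from EventLogging or the mobile clients:

```python
from typing import Any, Callable, Dict

StreamConfigs = Dict[str, Dict[str, Any]]


def is_stream_enabled(stream_configs: StreamConfigs, stream_name: str) -> bool:
    """Return True only if the stream is configured and not disabled."""
    config = stream_configs.get(stream_name)
    if config is None:
        # Unconfigured streams are never submitted to.
        return False
    # Hypothetical setting: a missing "enabled" key counts as enabled.
    return bool(config.get("enabled", True))


def submit(
    stream_configs: StreamConfigs,
    stream_name: str,
    event: Dict[str, Any],
    send: Callable[[str, Dict[str, Any]], None],
) -> bool:
    """Send the event unless the stream is disabled; report whether it was sent."""
    if not is_stream_enabled(stream_configs, stream_name):
        return False
    send(stream_name, event)
    return True
```

With this shape, a per-wiki override only needs to flip one boolean in the stream's config, and every client applies the same check before producing.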

Event Timeline

Restricted Application added a subscriber: Aklapper.
mpopov triaged this task as Low priority. Aug 5 2020, 1:42 PM

This is low priority until we observe how partial stream deployment goes in practice.

This would also be useful for eventgate and event ingestion logic to determine if a stream should be totally disabled. If metawiki API has is_active: false, eventgate could reject all incoming events for that stream (as if it wasn't configured), and/or our Hadoop ingestion logic could skip ingesting that data.
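As a rough illustration of the service-side behaviour described here, the sketch below splits a batch of incoming events into accepted and rejected lists. It assumes each event names its stream under `meta.stream` and that configs are a dictionary keyed by stream name; the function name and the default of treating a missing `is_active` as active are hypothetical, and this is not eventgate's actual code:

```python
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]


def partition_events(
    stream_configs: Dict[str, Dict[str, Any]],
    events: List[Event],
) -> Tuple[List[Event], List[Event]]:
    """Split events into (accepted, rejected) based on stream config."""
    accepted: List[Event] = []
    rejected: List[Event] = []
    for event in events:
        # Assumption: the stream name is carried in the event's meta.stream field.
        stream_name = event.get("meta", {}).get("stream")
        config = stream_configs.get(stream_name)
        # An inactive stream is handled exactly like an unconfigured one.
        if config is not None and config.get("is_active", True):
            accepted.append(event)
        else:
            rejected.append(event)
    return accepted, rejected
```

An ingestion job could apply the same predicate to decide which streams to skip.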

This would also be useful for eventgate and event ingestion logic to determine if a stream should be totally disabled.

That does sound really useful! What would you say is the priority/need for something like this on ingestion side of things?

Also, we should definitely coordinate on this and roll out support for it on EventGate/Hadoop ingestion logic and clients simultaneously.

Very low. Right now it would be equivalent to just removing the stream config entry.

T284620 is about removing a stream and possibly deleting its schema. We should revisit this ticket, and also consider more broadly what the end-of-life process is for a stream (and maybe its schema).

odimitrijevic raised the priority of this task from Low to High. Jun 14 2021, 3:41 PM
odimitrijevic moved this task from Incoming to Event Platform on the Analytics board.

Hm, I wonder if, instead of adding a top-level 'enabled: false' setting in stream config, we should make this a setting specific to each client producer. Arguably, the 'destination_event_service' setting is producer-specific too (only the EventBus extension uses it).

How about:

EventStreams:
  my_stream_name:
    producers:
      mediawiki_eventbus:
        enabled: false

We should probably move destination_event_service under mediawiki_eventbus too, but that is out of scope for this ticket.

A top-level enabled or is_active setting could still be useful, though, if we want to fully disable a stream at both the producer and consumer levels. Hm.

Per-producer and per-consumer settings are acceptable because producers and consumers should be isolated from one another as much as possible. I do think a per-stream setting that overrides the producer/consumer-level setting is also acceptable. The alternative is to remove and re-add the stream config, and we'd have to require all producers and consumers to be resilient to that.

I will add that I don't think that producers and consumers necessarily have to be aware of the top-level setting. We could make the EventStreamConfig extension override the producer/consumer-level settings automatically.

We could make the EventStreamConfig extension override the producer/consumer-level settings automatically.

Oh, interesting idea. Like, if producers or consumers are defined, and we want to disable everything, we can set a top level enabled: false, and make EventStreamConfig dynamically insert enabled: false at each defined producer and consumer level setting? That's a cool idea.

I've gone ahead and moved the setting for EventBus into stream_name.producers.mediawiki_eventbus.enabled.

We could make the EventStreamConfig extension override the producer/consumer-level settings automatically.

Like, if producers or consumers are defined, and we want to disable everything, we can set a top level enabled: false, and make EventStreamConfig dynamically insert enabled: false at each defined producer and consumer level setting?

Yes :)
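To make the override idea concrete, here is a small sketch of the propagation step, assuming a single stream's config is a nested dictionary with optional `producers` and `consumers` maps. The function name and the `consumers` key are hypothetical; only `producers.mediawiki_eventbus.enabled` appears in this task:

```python
import copy
from typing import Any, Dict


def apply_top_level_enabled(stream_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy where a top-level enabled: false disables every producer and consumer."""
    result = copy.deepcopy(stream_config)
    # A missing top-level "enabled" key is treated as enabled, so nothing changes.
    if result.get("enabled", True):
        return result
    for group in ("producers", "consumers"):
        for settings in result.get(group, {}).values():
            # Overrides any producer/consumer-level value, including enabled: true.
            settings["enabled"] = False
    return result
```

Because the override happens when the config is served, producers and consumers would only ever need to read their own `enabled` setting.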

Ottomata edited projects, added Event-Platform; removed Event-Platform (Sprint 04).

Huh, I don't think this task was ever totally done. We made it possible to disable a stream at the EventBus producer level. We need to document this and make it possible to do so at the top level.

Reopening.