
Allow disabling/enabling configured streams via wgEventStreams config
Open, Low, Public

Description

Right now the way to do partial (per-wiki) deployment of streams is potentially confusing and messy. "Potentially" because we don't yet have enough anecdotal evidence either way. Over time, as the Event Platform is used more for instrumentation and developers find the pain points in their workflow and in their instruments' lifecycle, we'll be able to see whether partial deployment of streams is an area for improvement.

The best approach relies on configuring the stream in default/metawiki, and then registering it with EventLogging (via $wgEventLoggingStreamNames) only on the specific wikis you want it enabled on. (See https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Per-wiki_configurations for more details.) Alternatively, it's possible to deploy a stream everywhere with sampling.rate: 0.0 and set sampling.rate: 1.0 on the specific wikis you want it enabled on, but that's bad for performance.
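As a rough illustration of the two approaches, here is a minimal Python model. The real settings live in MediaWiki's PHP configuration; the stream name, the schema title, the wiki names, and the helper functions below are all hypothetical and only meant to show where the per-wiki difference sits in each approach.

```python
# Hypothetical model of the two per-wiki deployment approaches.
# Stream name, schema title, and wiki names are placeholders, not real config.

# Approach 1: the stream is configured once (default/metawiki) and registered
# with EventLogging only on the wikis where it should be active.
STREAM_CONFIGS = {
    "example.stream": {"schema_title": "analytics/example"},
}
REGISTERED_STREAM_NAMES = {
    "default": [],                  # not registered anywhere by default
    "enwiki": ["example.stream"],   # registered on one specific wiki
}


def is_registered(wiki, stream):
    """True if the stream is configured and registered for the given wiki."""
    names = REGISTERED_STREAM_NAMES.get(wiki, REGISTERED_STREAM_NAMES["default"])
    return stream in STREAM_CONFIGS and stream in names


# Approach 2: the stream is deployed everywhere, and the sampling rate is the
# only thing that differs per wiki.
SAMPLING_RATES = {"default": 0.0, "enwiki": 1.0}


def sampling_rate(wiki):
    """Sampling rate for the wiki, falling back to the default of 0.0."""
    return SAMPLING_RATES.get(wiki, SAMPLING_RATES["default"])
```

In this model, approach 1 keeps the stream unknown to the client on most wikis, while approach 2 makes every wiki carry the stream and rely on a rate of 0.0 to stay silent.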

If it turns out that the EventLogging registration approach is too confusing in practice, we could add a universally recognized setting to disable a stream, that is, one that EventLogging's submit() and the Event Platform Clients for iOS/Android have logic to handle. Something like enabled: false or is_active: false.
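A minimal sketch of what such a client-side check could look like, written in Python for illustration. The field name `enabled` is just one of the candidates named above, and the `submit` function here is a stand-in, not the actual EventLogging or mobile client API. Treating a missing flag as enabled is an assumption, chosen so that existing stream configs would keep working unchanged.

```python
def is_stream_enabled(stream_config):
    """Return False only when the config explicitly disables the stream.

    Assumption: a missing `enabled` key means the stream is enabled.
    """
    return bool(stream_config.get("enabled", True))


def submit(stream_name, event, stream_configs, sink):
    """Hypothetical stand-in for a client's submit(): drop events for
    streams that are unconfigured or explicitly disabled."""
    config = stream_configs.get(stream_name)
    if config is None or not is_stream_enabled(config):
        return False
    sink.append((stream_name, event))
    return True


# Example usage with placeholder stream names.
configs = {
    "example.active": {},
    "example.disabled": {"enabled": False},
}
sent = []
submit("example.active", {"action": "click"}, configs, sent)
submit("example.disabled", {"action": "click"}, configs, sent)
```

With a check like this, disabling a stream on one wiki would only require overriding a single field in that wiki's stream config.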

Event Timeline

mpopov triaged this task as Low priority. Aug 5 2020, 1:42 PM

This is low priority until we observe how partial stream deployment goes in practice.

This would also be useful for eventgate and event ingestion logic to determine if a stream should be totally disabled. If the metawiki API returns is_active: false for a stream, eventgate could reject all incoming events for that stream (as if it weren't configured), and/or our Hadoop ingestion logic could skip ingesting that data.
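Roughly, the server-side behavior could be sketched like this in Python. This is not eventgate's or the Hadoop ingestion job's actual code; the function names are hypothetical, and `is_active` is the proposed (not yet existing) field, assumed to default to true when absent.

```python
def is_active(stream_config):
    """Assumption: streams are active unless is_active is explicitly false."""
    return bool(stream_config.get("is_active", True))


def accept_event(stream_name, stream_configs):
    """Hypothetical intake check: treat an inactive stream exactly like an
    unconfigured one and reject its events."""
    config = stream_configs.get(stream_name)
    return config is not None and is_active(config)


def streams_to_ingest(stream_configs):
    """Hypothetical ingestion filter: skip streams marked inactive."""
    return sorted(name for name, cfg in stream_configs.items() if is_active(cfg))


# Placeholder stream config, as it might be returned by the stream config API.
stream_configs = {
    "example.live": {},
    "example.paused": {"is_active": False},
}
```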

> This would also be useful for eventgate and event ingestion logic to determine if a stream should be totally disabled.

That does sound really useful! What would you say is the priority/need for something like this on the ingestion side of things?

Also, we should definitely coordinate on this and roll out support for it in the EventGate/Hadoop ingestion logic and in the clients simultaneously.

Very low. Right now it would be equivalent to just removing the stream config entry.