Page MenuHomePhabricator

Generate $wgEventLoggingSchemas from $wgEventStreams
Open, MediumPublic

Description

$wgEventLoggingStreamNames allows us to filter out configurations for streams that aren't published to by the EventLogging JavaScript client, thereby reducing bytes sent. However, it introduces friction when deploying new streams (and/or instruments) and can also drift when contributors shift focus.

However, since $wgEventLoggingStreamNames is a subset of the set of stream configurations, $wgEventStreams, it can be automatically generated given some convention. For example, let's say that if an event stream has the event_logging_js producer, then it is added to $wgEventLoggingStreamNames:

$wgEventLoggingStreamNames = array_keys( array_filter(
  $wgEventStreams,
  fn ( $eventStream ) =>
    isset( $eventStream[ 'producers' ] ) ? isset( $eventStream[ 'producers' ][ 'event_logging_js' ] ) : false
) );

The above could be implemented in operations/mediawiki-config or the EventLogging extension.

Pros:

  • We still minimise bytes sent
  • It Just Works™ - contributors/EventLogging maintainers no longer need to remember to update $wgEventLoggingStreamNames

Cons:

  • The above would run for every request served by the app servers

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

@Ottomata: I'd appreciate your thoughts on this, whenever you have a spare moment.

Okay so I do have thoughts! We made $wgEventLoggingStreamNames before we talked about and decided to do https://wikitech.wikimedia.org/wiki/Event_Platform/Stream_Configuration#consumers_and_producers (see also https://phabricator.wikimedia.org/T273235#6879386).

So, instead of having a separate config for EvenLogging only, we should just add 'mediawiki_eventlogging_client' or something in the producers section of the stream's config. You've seen this for metrics platform stuff, right? So I guess whatever the producer is called for that. And, EL can then filter for event streams that have that producer set or enabled. We might have to make sure the EventStreamConfig API can support this kind of filtering, if it doesn't already.

Also relevant:

OH WAIT, this is exactly what you are proposing! But without bothering to make the EventStreamConfig API do it. Okay great!

EChetty triaged this task as High priority.Mar 24 2022, 3:20 PM
EChetty lowered the priority of this task from High to Medium.
EChetty moved this task from Incoming to Ingest on the Data-Engineering board.

I think doing having EventLogging code do this with filtering via EventStreamConfig API using producers config is the better way to go. Its how other clients will filter to get their streams too.

I think doing having EventLogging code do this with filtering via EventStreamConfig API using producers config is the better way to go. Its how other clients will filter to get their streams too.

Sure! We'll have to patch StreamConfig::matchesSettings to a recursive, associative array intersection, which should be easy enough.

Wouldn't it be nice of EventStreamConfig was a GraphQL API? :p

Sure! We'll have to patch StreamConfig::matchesSettings to a recursive, associative array intersection, which should be easy enough.

I suppose alternatively, you could just do this filtering in the EventLogging extension. Get all stream configs from the PHP API, and THEN filter for producers == eventlogging.

Change 790632 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/extensions/EventStreamConfig@master] Support matching nested constraints

https://gerrit.wikimedia.org/r/790632

Change 790746 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/extensions/EventStreamConfig@master] api: Support passing deep constraints

https://gerrit.wikimedia.org/r/790746

@Ottomata Look like this one is resolved. Should I change the status to resolved?

Code has not been merged or deployed, but looks ready to be. @phuedx ?

Thanks for the ping!

I added a few more test cases to each patch having discovered a minor bug in the first patch that I submitted. I've also rebased them both. They're certainly ready to be reviewed and (hopefully) merged!

Change 790632 merged by jenkins-bot:

[mediawiki/extensions/EventStreamConfig@master] Support matching nested constraints

https://gerrit.wikimedia.org/r/790632

Change 790746 merged by jenkins-bot:

[mediawiki/extensions/EventStreamConfig@master] api: Support passing deep constraints

https://gerrit.wikimedia.org/r/790746