Page MenuHomePhabricator

Refactor EventBus mediawiki configuration
Closed, ResolvedPublic

Description

After we switch all the events to eventgate, we will have multiple config properties related to eventbus:

  • wmgUseEventBus - decides whether to enable EventBus service for a particular wiki. Currently only disabled for wikitech (perhaps incorrectly, cause wikitech DB name is labswiki, not wikitech)
  • wgEnableEventBus - a mapping on which event types are supported on a particular wiki.
  • wgEventServiceStreamConfig - per-event configuration. Currently only used to support the destination of the event, which after switching to eventgate completely will be either eventgate-main or eventgate-analytics. Not completely supported, analytics events that are going via monolog do not respect this parameter.
  • wgEventStreams used by the new EventStreamConfig extension for dynamic configuration of streams.

I believe we do not need all of these variables to control eventbus behavior.
Proposal:

  1. wmgUseEventBus - deprecate and remove in favor of TYPE_NONE in wgEnableEventBus
  2. wgEnableEventBus - keep.
  3. wgEventServiceStreamConfig - no really sure. I would be inclined to remove this in favor of wgEventStreams.

See also T205319: Modern Event Platform: Stream Configuration

Event Timeline

Ya I had hoped to use wgEventServiceStreamConfig for Stream Config as you say.

Could we instead get rid of the EventBus::TYPE_* stuff altogether, and just use per-stream config? I'm not sure I fully understand the reasoning behind the different TYPEs.

We'll need to support regex stream names in wgEventServiceStreamConfig anyway, like we already do in stream config / eventbus-topics.yaml. E.g.

wgEventServiceStreamConfig' => [
    'default' => [
        '/^mediawiki\.job\..+/' => [
            'EventServiceName' => 'eventgate-main',
            'schema_title' => 'mediawiki/job'
        ],
        // ...
    ],
    'wikitech' => [
        '/^.+/' => [
            // Disable all events for wikitech
            'EventServiceName' => false,
        ]
    ],
    'private' => [
        '/(?!mediawiki\.job\..)+/' => [
            // Disable all non mediawiki.job events for wikitech
            'EventServiceName' => false,
        ]
    ],
    // ...
]

Oof, but to do that we'd have to merge stream config rules for streams that match multiple regexes so we could disable all non-job streams, but still provide a way to configure individual non-job streams. Hm.

Could we instead get rid of the EventBus::TYPE_* stuff altogether, and just use per-stream config? I'm not sure I fully understand the reasoning behind the different TYPEs.

Ye, that's possible too. The reasoning behing EventBus::TYPE_* stuff is that, for example for private wikis change-prop is not supported as well as the events from private wikis must not end up in the public stream via eventstreams, so we only accept TYPE_JOB there. I've been thinking about your proposal as well, but didn't quite know whether we wanna go into the direction of regexes..

Oof, but to do that we'd have to merge stream config rules for streams that match multiple regexes so we could disable all non-job streams, but still provide a way to configure individual non-job streams. Hm.

heh.. That might quickly get out of hand... How do you specify the order of merges/overrides if multiple regexes match the same stream name? For example, if you want to state "all events are going to eventgate-main but job events are disabled and this particular event has the following configuration"? In theory this means that the most specific regex has precendence over a less specific regex, thus you need to do something like Object.assign(default_config, job_config, individual_config) - but how would you specify this? Doesn't look possible.

A possible solution could be to have a set of rules on which of the matching regexes/names to apply and only apply one. Like this:

  • If there's a specific (string) stream name matching in the config - use it
  • If there's a regex rule name matching in the config - use it. If there are multiple regex rule names - die with a horrible error.
  • If none applies - use the default

But this makes the config extremely fragile to multiple regexes matching a stream name, and probably will make us list most of the events explicitly making the config very large.

Overall I like the idea of regexes in principle, but we need to be very careful with it.

Change 530446 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Refactor RCFeed configuration.

https://gerrit.wikimedia.org/r/530446

Change 530446 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Refactor RCFeed configuration.

https://gerrit.wikimedia.org/r/530446

Change 534236 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Remove EventBusRCFeedEngine eventServiceName.

https://gerrit.wikimedia.org/r/534236

Change 534236 merged by jenkins-bot:
[operations/mediawiki-config@master] Remove EventBusRCFeedEngine eventServiceName.

https://gerrit.wikimedia.org/r/534236

Mentioned in SAL (#wikimedia-operations) [2019-09-10T18:15:52Z] <jforrester@deploy1001> Synchronized wmf-config/CommonSettings.php: T229863 Remove EventBusRCFeedEngine eventServiceName (duration: 01m 05s)

I just ran into an issue that made me think about this again. It'd be really convenient if all stream config was centralized and available via the MW API. We are in the process of putting EventLogging legacy and new event instrumentation stream configs into $wgEventStreams, but the production stream config is currently locked away in static eventgate helmfiles. In addition, the EventBus extension needs to know what EventServiceName to send a event stream to...Or does it?

EventBus currently has a getInstanceForStream factory method that uses wgEventServiceStreamConfig to lookup the EventServiceName that a an event stream should be sent to. We used this in the past to do the migration from eventlogging-service-eventbus to eventgate-main. Currently it is not used; all streams use the default EventServiceName 'eventgate-main'. EventBus is used to produce monolog events to eventgate-analytics, but this EventBus instance is manually configured in wmf-config/logging.php.

I'm thinking about how to move wgEventServiceStreamConfig into wgEventStreams, but perhaps we don't have to? We could remove support for per stream EventServiceName config from EventBus, and then we don't need EventBus to use stream config at all. It might be nice in the future to be able to (e.g. if we had to do another migration), but right now supporting it is a bit awkward. It might be easier to remove that support and build it back in later using EventStreamConfig extension if/when we need it. We could keep wgEnableEventBus with the TYPE_* stuff for now and reconsider that later if we need to.

So, one thing that's touching the code around it is: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventBus/+/590471 - not very related, just fyi.

Hm.. So, you want to leave only EventBus::getInstance( $name = null ) and have 'eventgate-main' be the default name. Where would we configure the default name? We can hard-code 'main' vs 'analytics' in the code and configure the uris for them, but that's kinda duplication of event_type thingy.

We have the code to select appropriate section of the wgEventStreams config right? We can leave the option to put the service_name there. So, all the code in EventBus will still be calling EventBus::getInstanceForStream. Right now we will just have some wgEventBusDefaultServiceName = 'eventgate-main' configured, but if we need to, we would reuse the code from StreamConfig extension in EventBus to select the proper section and will be able to fetch a service_name from there?

So, you want to leave only EventBus::getInstance( $name = null ) and have 'eventgate-main' be the default name. Where would we configure the default name?

The default name in the code can remain 'eventbus', I would just make a single config to pick the name, instead of per-stream. E.g. $wgEventServiceName = 'eventgate-main' in mediawiki-config somewhere.

We have the code to select appropriate section of the wgEventStreams config right? We can leave the option to put the service_name there. So, all the code in EventBus will still be calling EventBus::getInstanceForStream.

Hm, so use EventStreamConfig now in getInstanceForStream? Hmmmmmm. Might work. Will give it a try. If I do I'll base it on the ServiceWiring patch.

Change 594505 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[mediawiki/extensions/EventBus@master] Remove wgEventServiceStreamConfig in favor of wgEventStreams using EventStreamConfig

https://gerrit.wikimedia.org/r/594505

Change 601381 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[integration/config@master] EventBus now depends on EventStreamConfig

https://gerrit.wikimedia.org/r/601381

Change 601381 merged by jenkins-bot:
[integration/config@master] EventBus now depends on EventStreamConfig

https://gerrit.wikimedia.org/r/601381

Mentioned in SAL (#wikimedia-releng) [2020-06-01T17:47:45Z] <James_F> Zuul: EventBus now depends on EventStreamConfig T229863

Change 601384 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[integration/config@master] EventBus should have EventStreamConfig in dependencies

https://gerrit.wikimedia.org/r/601384

Mentioned in SAL (#wikimedia-releng) [2020-06-01T20:31:58Z] <James_F> Zuul: EventBus now depends on EventStreamConfig for non-phan too T229863

Change 601384 merged by jenkins-bot:
[integration/config@master] EventBus should have EventStreamConfig in dependencies

https://gerrit.wikimedia.org/r/601384

Restricted Application edited projects, added Analytics; removed Analytics-Radar. · View Herald TranscriptJun 10 2020, 6:33 AM
Restricted Application edited projects, added Analytics; removed Analytics-Radar. · View Herald TranscriptJun 10 2020, 6:36 AM

Change 610160 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Add wgEventServiceDefault to refactor EventBus event stream config

https://gerrit.wikimedia.org/r/610160

Change 610160 merged by jenkins-bot:
[operations/mediawiki-config@master] Add wgEventServiceDefault to refactor EventBus event stream config

https://gerrit.wikimedia.org/r/610160

Mentioned in SAL (#wikimedia-operations) [2020-07-08T18:04:44Z] <ppchelko@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Add wgEventServiceDefault to refactor EventBus event stream config gerrit:610160 T229863 (duration: 01m 04s)

Mentioned in SAL (#wikimedia-operations) [2020-07-08T18:11:28Z] <ppchelko@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Add wgEventServiceDefault to refactor EventBus event stream config gerrit:610160 T229863, IS.php (duration: 01m 03s)

Change 594505 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Remove wgEventServiceStreamConfig in favor of wgEventStreams using EventStreamConfig

https://gerrit.wikimedia.org/r/594505

Change 616569 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[mediawiki/extensions/EventBus@master] Fix comment doc in EventBusRCFeedEngine.php about wgEventStreams

https://gerrit.wikimedia.org/r/616569

Change 616570 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Remove now unused wgEventServiceStreamConfig

https://gerrit.wikimedia.org/r/616570

Change 616569 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Fix comment doc in EventBusRCFeedEngine.php about wgEventStreams

https://gerrit.wikimedia.org/r/616569

Change 616570 merged by Ottomata:
[operations/mediawiki-config@master] Remove now unused wgEventServiceStreamConfig

https://gerrit.wikimedia.org/r/616570

Mentioned in SAL (#wikimedia-operations) [2020-08-04T15:18:57Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Remove now unused wgEventServiceStreamConfig - T229863 (duration: 00m 58s)