Refactor EventBus mediawiki configuration
Open, Needs Triage, Public

Description

After we switch all the events to eventgate, we will have multiple config properties related to eventbus:

  • wmgUseEventBus - decides whether to enable the EventBus service for a particular wiki. Currently only disabled for wikitech (perhaps incorrectly, because the wikitech DB name is labswiki, not wikitech).
  • wgEnableEventBus - a mapping of which event types are supported on a particular wiki.
  • wgEventServiceStreamConfig - per-event configuration. Currently only used to set the destination of the event, which will be either eventgate-main or eventgate-analytics once the switch to eventgate is complete. Not fully supported: analytics events sent via monolog do not respect this parameter.

I believe we do not need all of these variables to control eventbus behavior.
Proposal:

  1. wmgUseEventBus - deprecate and remove in favor of TYPE_NONE in wgEnableEventBus.
  2. wgEnableEventBus - keep.
  3. wgEventServiceStreamConfig - not really sure. I would be inclined to remove it, but perhaps it would be a good starting point for T205319.
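As a rough illustration of item 1, the per-wiki switch could collapse into wgEnableEventBus alone. This is only a sketch: the wiki-keyed array layout mirrors the example later in this task, TYPE_NONE and TYPE_JOB are the names used in this task, and 'TYPE_ALL' as well as the exact value format are assumptions rather than confirmed EventBus behavior.

```php
<?php
// Hypothetical wiki-keyed settings array (layout assumed, not taken from
// the real wmf-config files).
$settings = [
    'wgEnableEventBus' => [
        'default'  => 'TYPE_ALL',  // assumed name for "all event types enabled"
        'private'  => 'TYPE_JOB',  // private wikis accept job events only
        'labswiki' => 'TYPE_NONE', // wikitech, keyed by its DB name, replaces wmgUseEventBus = false
    ],
];
```

With this shape there is no separate on/off variable to keep in sync with the type mapping.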

Details

Related Gerrit Patches:
operations/mediawiki-config : master : Remove EventBusRCFeedEngine eventServiceName.
mediawiki/extensions/EventBus : master : Refactor RCFeed configuration.

Event Timeline

Pchelolo created this task. Aug 5 2019, 6:36 PM
Restricted Application added a subscriber: Aklapper. Aug 5 2019, 6:36 PM

Ya I had hoped to use wgEventServiceStreamConfig for Stream Config as you say.

Could we instead get rid of the EventBus::TYPE_* stuff altogether, and just use per-stream config? I'm not sure I fully understand the reasoning behind the different TYPEs.

We'll need to support regex stream names in wgEventServiceStreamConfig anyway, like we already do in stream config / eventbus-topics.yaml. E.g.

'wgEventServiceStreamConfig' => [
    'default' => [
        '/^mediawiki\.job\..+/' => [
            'EventServiceName' => 'eventgate-main',
            'schema_title' => 'mediawiki/job'
        ],
        // ...
    ],
    'wikitech' => [
        '/^.+/' => [
            // Disable all events for wikitech
            'EventServiceName' => false,
        ]
    ],
    'private' => [
        '/^(?!mediawiki\.job\.).+/' => [
            // Disable all non mediawiki.job events for private wikis
            'EventServiceName' => false,
        ]
    ],
    // ...
]

Oof, but to do that we'd have to merge stream config rules for streams that match multiple regexes so we could disable all non-job streams, but still provide a way to configure individual non-job streams. Hm.

Pchelolo added a comment. Edited Aug 5 2019, 7:21 PM

Could we instead get rid of the EventBus::TYPE_* stuff altogether, and just use per-stream config? I'm not sure I fully understand the reasoning behind the different TYPEs.

Ye, that's possible too. The reasoning behind the EventBus::TYPE_* stuff is that, for example, change-prop is not supported for private wikis, and events from private wikis must not end up in the public stream via eventstreams, so we only accept TYPE_JOB there. I've been thinking about your proposal as well, but wasn't sure whether we want to go in the direction of regexes.

Oof, but to do that we'd have to merge stream config rules for streams that match multiple regexes so we could disable all non-job streams, but still provide a way to configure individual non-job streams. Hm.

heh.. That might quickly get out of hand... How do you specify the order of merges/overrides if multiple regexes match the same stream name? For example, what if you want to state "all events are going to eventgate-main, but job events are disabled, and this particular event has the following configuration"? In theory this means that the most specific regex has precedence over a less specific one, so you need to do something like Object.assign(default_config, job_config, individual_config) - but how would you specify this order in the config? Doesn't look possible.
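To make the ordering problem concrete, here is a minimal PHP sketch of the Object.assign-style merge. It only works because the rules are given as an ordered list, least specific first, which is exactly the piece of information a regex-keyed map does not carry. The function name mergeStreamConfig, the 'pattern'/'config' keys, and the stream names are hypothetical.

```php
<?php
// Hypothetical ordered rule list: position encodes precedence, since
// specificity cannot be derived from the regexes themselves.
$rules = [
    ['pattern' => '/^.+/', 'config' => ['EventServiceName' => 'eventgate-main']],
    ['pattern' => '/^mediawiki\.job\..+/', 'config' => ['EventServiceName' => false]],
    ['pattern' => '/^mediawiki\.job\.special$/', 'config' => [
        'EventServiceName' => 'eventgate-main',
        'schema_title' => 'mediawiki/job',
    ]],
];

/**
 * Merge the config of every rule whose pattern matches the stream name.
 * Later rules override earlier ones key by key.
 */
function mergeStreamConfig(array $rules, string $stream): array {
    $merged = [];
    foreach ($rules as $rule) {
        if (preg_match($rule['pattern'], $stream) === 1) {
            // array_merge lets later string keys overwrite earlier ones.
            $merged = array_merge($merged, $rule['config']);
        }
    }
    return $merged;
}
```

Reordering the list changes the result for any stream that matches more than one pattern, so the order would have to be maintained by hand.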

A possible solution could be to have a set of rules on which of the matching regexes/names to apply and only apply one. Like this:

  • If there's a specific (string) stream name matching in the config - use it
  • If there's a regex rule name matching in the config - use it. If there are multiple regex rule names - die with a horrible error.
  • If none applies - use the default

But this makes the config extremely fragile when multiple regexes match a stream name, and will probably force us to list most of the events explicitly, making the config very large.
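The three rules above could be sketched roughly as follows. This is an illustration, not EventBus code: resolveStreamConfig is a hypothetical name, and treating any key that starts with "/" as a regex is an assumption of the sketch.

```php
<?php
/**
 * Hypothetical resolver that applies exactly one config entry per stream:
 * exact name first, then a single matching regex, then the default.
 */
function resolveStreamConfig(array $streamConfig, string $stream, array $default): array {
    // Rule 1: a specific (string) stream name wins outright.
    if (isset($streamConfig[$stream])) {
        return $streamConfig[$stream];
    }

    // Rule 2: collect every regex key that matches the stream name.
    $matches = [];
    foreach ($streamConfig as $key => $config) {
        if (is_string($key) && strncmp($key, '/', 1) === 0
            && preg_match($key, $stream) === 1
        ) {
            $matches[] = $key;
        }
    }
    if (count($matches) > 1) {
        // Ambiguous config: refuse to guess which rule should apply.
        throw new RuntimeException(
            "Stream '$stream' matches multiple regex rules: " . implode(', ', $matches)
        );
    }
    if (count($matches) === 1) {
        return $streamConfig[$matches[0]];
    }

    // Rule 3: nothing matched, fall back to the default.
    return $default;
}
```

Any stream that falls under two overlapping patterns fails until it gets an explicit entry of its own, which illustrates the fragility described above.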

Overall I like the idea of regexes in principle, but we need to be very careful with it.

elukey edited projects, added Analytics; removed Analytics-Kanban. Aug 14 2019, 10:17 AM

Change 530446 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Refactor RCFeed configuration.

https://gerrit.wikimedia.org/r/530446

Change 530446 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Refactor RCFeed configuration.

https://gerrit.wikimedia.org/r/530446

elukey moved this task from Incoming to Radar on the Analytics board. Aug 19 2019, 3:34 PM

Change 534236 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/mediawiki-config@master] Remove EventBusRCFeedEngine eventServiceName.

https://gerrit.wikimedia.org/r/534236

Ottomata moved this task from Backlog to Next Up on the Event-Platform board. Sep 5 2019, 5:03 PM

Change 534236 merged by jenkins-bot:
[operations/mediawiki-config@master] Remove EventBusRCFeedEngine eventServiceName.

https://gerrit.wikimedia.org/r/534236

Mentioned in SAL (#wikimedia-operations) [2019-09-10T18:15:52Z] <jforrester@deploy1001> Synchronized wmf-config/CommonSettings.php: T229863 Remove EventBusRCFeedEngine eventServiceName (duration: 01m 05s)