
EventBusRCFeedFormatter should clean up events from nulls
Open, NormalPublic

Description

In the new generation of MediaWiki event schemas we want to restrict the schemas to monomorphic types everywhere, that is, require that each property has exactly one possible type. The only place where this does not hold right now is the recentchange event, where we allow properties to be null. This is inconsistent with the rest of the schemas, where null properties are simply absent from the event. We need to make recentchange events conform to this by clearing out the null properties.
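For illustration only, here is a minimal Python sketch of the clean-up described above. It is not the EventBus implementation (the extension is PHP): the function name `strip_nulls`, the decision to recurse into nested objects, and the sample event fields are all assumptions made for the example.

```python
from typing import Any, Dict


def strip_nulls(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `event` without null-valued properties.

    Hypothetical helper: recursing into nested objects is an assumption,
    not something the task description specifies.
    """
    cleaned: Dict[str, Any] = {}
    for key, value in event.items():
        if value is None:
            # Null properties are dropped so they are absent from the event.
            continue
        if isinstance(value, dict):
            value = strip_nulls(value)
        cleaned[key] = value
    return cleaned


# Illustrative event; the field names are made up for this sketch.
example = {
    "type": "edit",
    "comment": None,
    "length": {"old": None, "new": 120},
    "minor": False,
}
cleaned_example = strip_nulls(example)
```

Note that falsy but non-null values such as `False`, `0`, or an empty string are kept; only actual nulls are removed.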

Event Timeline

Restricted Application added a project: Analytics. Feb 19 2019, 10:33 PM
Restricted Application added a subscriber: Aklapper.

Ah! Nice idea.

Milimetric triaged this task as High priority. Feb 21 2019, 5:35 PM
Milimetric lowered the priority of this task from High to Normal.
Milimetric moved this task from Incoming to Modern Event Platform on the Analytics board.

Change 492047 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Remove properties with null values from the recentchange event.

https://gerrit.wikimedia.org/r/492047

Change 492047 merged by Ppchelko:
[mediawiki/extensions/EventBus@master] Remove properties with null values from the recentchange event.

https://gerrit.wikimedia.org/r/492047

Ottomata moved this task from Backlog to Done on the EventBus board. Mar 4 2019, 10:30 PM

Change 495760 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Correctly delete nulls from recentchange event and add test.

https://gerrit.wikimedia.org/r/495760

Change 495760 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Correctly delete nulls from recentchange event and add test.

https://gerrit.wikimedia.org/r/495760

After the patch was deployed we no longer have nulls in the recentchange schema. However, we still cannot declare victory and get rid of all the polymorphic types in the schema: log_params can be either an object or an array and, judging by the code, it can actually be a non-empty array in rare cases. Not sure what to do about that.

Since we'll be able to support map types with the new schemas (once we get that stuff running in Hadoop), we might be able to just always convert log_params to a map in the EventBus extension. Can we somehow always make this a simple string-to-string map?
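One possible shape for that conversion, sketched in Python rather than the extension's PHP. Everything here is an assumption for discussion: the function name `log_params_to_string_map`, using the array index as the key for array-shaped params, JSON-encoding non-string values, and the sample keys.

```python
import json
from typing import Any, Dict


def log_params_to_string_map(log_params: Any) -> Dict[str, str]:
    """Normalize object- or array-shaped log_params to a string-to-string map.

    Hypothetical helper; the key and value encodings are assumptions.
    """
    if log_params is None:
        return {}
    if isinstance(log_params, dict):
        items = list(log_params.items())
    elif isinstance(log_params, (list, tuple)):
        # Array-shaped params: use the position as the key.
        items = list(enumerate(log_params))
    else:
        # A bare scalar is treated as a one-element array.
        items = [(0, log_params)]

    result: Dict[str, str] = {}
    for key, value in items:
        if value is None:
            # Keep the map free of nulls, matching the rest of the event.
            continue
        # Strings pass through; everything else is JSON-encoded.
        result[str(key)] = value if isinstance(value, str) else json.dumps(value, sort_keys=True)
    return result


from_empty_array = log_params_to_string_map([])
from_array = log_params_to_string_map(["a", 5])
from_object = log_params_to_string_map({"target": "Foo", "noredir": True, "unused": None})
```

With this approach an empty array and an empty object both become an empty map, so consumers always see one type. The trade-off is that nested or non-string values arrive as JSON strings and need to be decoded again downstream.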