
`rev_parent_id` and `rev_content_changed` are missing in event.mediawiki_revision_tags_change
Closed, Resolved (Public)

Description

Unlike the schema at https://github.com/wikimedia/mediawiki-event-schemas/blob/master/jsonschema/mediawiki/revision/tags-change/1.yaml, the event.mediawiki_revision_tags_change table is missing the rev_parent_id and rev_content_changed fields. Can we add these fields and backfill their values?

Event Timeline

chelsyx created this task. Mar 14 2019, 1:22 AM
chelsyx moved this task from Triage to Tracking on the Product-Analytics board.

@Tgr and @Pchelolo, can you look at why MediaWiki is not emitting those fields? Backfilling is too expensive, but you can join to event.mediawiki_revision_create, because that schema has both rev_parent_id and rev_content_changed for all revisions.
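
For anyone who needs the two fields in the meantime, the join suggested above might look roughly like the sketch below. It is plain Python over in-memory rows rather than a query against the real tables, and it assumes both event streams carry `database` and `rev_id` fields that can serve as the join key (that is an assumption, not something confirmed in this task). The function name and the sample rows are made up for illustration.

```python
def enrich_tags_change(tags_events, create_events):
    """Left-join tags-change rows to revision-create rows.

    Assumption: (database, rev_id) identifies a revision in both streams.
    """
    # Index revision-create rows by the assumed join key.
    lookup = {
        (row["database"], row["rev_id"]): row
        for row in create_events
    }

    enriched = []
    for event in tags_events:
        # Fall back to an empty dict when no revision-create row matches,
        # so the two fields come out as None rather than raising KeyError.
        match = lookup.get((event["database"], event["rev_id"]), {})
        enriched.append({
            **event,
            "rev_parent_id": match.get("rev_parent_id"),
            "rev_content_changed": match.get("rev_content_changed"),
        })
    return enriched


# Hypothetical sample rows, not real event data.
tags_rows = [
    {"database": "enwiki", "rev_id": 2, "tags": ["mobile edit"]},
    {"database": "enwiki", "rev_id": 99, "tags": []},
]
create_rows = [
    {"database": "enwiki", "rev_id": 2,
     "rev_parent_id": 1, "rev_content_changed": True},
]

for row in enrich_tags_change(tags_rows, create_rows):
    print(row)
```

Rows with no matching revision-create event keep `None` for both fields, which mirrors a left join and makes the gaps easy to count.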

Pchelolo added a comment. Edited Mar 18 2019, 3:40 PM

I think that the schema is incorrect here.

Changing the tags for the revision cannot modify the revision contents, so the existence of the rev_content_changed property in the schema is a mistake.

Regarding rev_parent_id - we just forgot to add it, will fix.

Milimetric moved this task from Incoming to Radar on the Analytics board. Mar 18 2019, 3:48 PM

Change 497327 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/event-schemas@master] Removed rev_content_changed property from revision-tags-change schema.

https://gerrit.wikimedia.org/r/497327

Change 497358 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Set rev_parent_id for all revision-related events.

https://gerrit.wikimedia.org/r/497358

Change 497358 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Set rev_parent_id for all revision-related events.

https://gerrit.wikimedia.org/r/497358

Change 497327 merged by Ppchelko:
[mediawiki/event-schemas@master] Removed rev_content_changed property from revision-tags-change schema.

https://gerrit.wikimedia.org/r/497327

Pchelolo closed this task as Resolved. Mar 19 2019, 2:25 PM
Pchelolo claimed this task.

The rev_content_changed property has been removed from the schema, and after the train we will ensure rev_parent_id is present in all the events. Resolving.

Thanks all!

Nuria added a subscriber: Nuria. Mar 19 2019, 7:41 PM

Super thanks @Pchelolo for looking at this so fast