
Page properties-change event is rejected if page was deleted
Closed, Resolved · Public

Description

The mediawiki.properties-change event schema requires the page_id property to be at least 1. However, the event is posted asynchronously from the JobQueue, so if the page is deleted before the event is actually posted, the page_id returned by the Title is 0 and the event gets rejected.
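To make the failure mode concrete, here is a minimal sketch of a minimum-value check rejecting a page_id of 0. The function name and the event dictionaries are hypothetical; the real schema and the EventBus validation code are not reproduced here, only the "minimum is 1" rule stated above.

```python
# Hypothetical, simplified stand-in for the schema check. Only the
# "page_id minimum is 1" rule comes from the task description.
PAGE_ID_MINIMUM = 1


def validate_page_id(event):
    """Return a list of validation error strings for the event's page_id."""
    errors = []
    page_id = event.get("page_id")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(page_id, int) or isinstance(page_id, bool):
        errors.append("page_id must be an integer")
    elif page_id < PAGE_ID_MINIMUM:
        errors.append(
            "page_id %d is below the minimum of %d" % (page_id, PAGE_ID_MINIMUM)
        )
    return errors


# An event for a page that still exists passes the check.
live_page_errors = validate_page_id({"page_id": 1234})

# An event built after the page was deleted carries page_id 0 and is rejected.
deleted_page_errors = validate_page_id({"page_id": 0})
```

With this rule in place, the only ways to get the second event through are to relax the minimum or to make sure the event never carries 0 in the first place.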

A question is whether we need to stretch the schema to allow 0, or should we just avoid sending the properties-change event altogether if the page doesn't exist any more by the time the event is about to be sent? What do you think @Ottomata?

Event Timeline

Pchelolo created this task. Feb 21 2017, 8:34 PM
Restricted Application added a project: Analytics. Feb 21 2017, 8:34 PM
Restricted Application added a subscriber: Aklapper.

Not strongly opinionated, but for posterity purposes it sounds better to capture the event rather than not emit it. So dumb that page_ids get set to 0 when a page is deleted, but ooohhhh well. I think we should modify the schema to allow 0. That will also be a simpler change.

Isn't the better question here: how does the event affect the page? I.e. if the page has already been deleted by the time the event fires, is it still taken into account? In the same vein, because the page is deleted by the time the event enters the EventBus pipeline, would any subscribers to the event actually be able to do something with it, given that MW would not respond to any requests for the given title? If the event itself does not affect anything, then I'd vote for keeping the requirement and keeping irrelevant events out of the pipeline.

Change 339024 had a related patch set uploaded (by Ppchelko):
Use LinksUpdate member for page_id

https://gerrit.wikimedia.org/r/339024

@mobrovac So, I've found a way to get the original page_id for the page (see the Gerrit patch). On one hand, I agree that firing an event for a deleted page might not be useful, but just discarding it seems a bit scary. We are not really sure what properties might've been changed and whether it's 100% safe to discard it. I'm on the fence here.
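The difference between resolving the page_id when the event is sent and reusing the one known when the update was created can be sketched as follows. This is an illustration only: the class, the toy page table and the function names are hypothetical and are not the EventBus extension's actual code.

```python
from dataclasses import dataclass


@dataclass
class PropertiesUpdate:
    """Hypothetical update object that remembers the page_id it was created with."""
    title: str
    page_id: int


# Toy page table mapping title to page_id (assumption for this sketch).
pages = {"Example": 42}


def lookup_page_id(title):
    """Resolve the id by title at call time; 0 means the page does not exist."""
    return pages.get(title, 0)


def build_event(update, use_captured_id):
    """Build the event either from the captured id or from a late lookup."""
    if use_captured_id:
        page_id = update.page_id
    else:
        page_id = lookup_page_id(update.title)
    return {"title": update.title, "page_id": page_id}


# The update is created while the page still exists...
update = PropertiesUpdate("Example", lookup_page_id("Example"))

# ...and the page is deleted before the asynchronous job runs.
del pages["Example"]

late_lookup_event = build_event(update, use_captured_id=False)  # page_id is 0
captured_id_event = build_event(update, use_captured_id=True)   # page_id is 42
```

Only the second event would satisfy a schema that requires page_id to be at least 1.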

What if the page is deleted, the properties are changed, and then later the page is restored?

> What if the page is deleted, the properties are changed, and then later the page is restored?

Yes, that is my question too: will the page props event still affect it in MW or not? We don't seem to know for sure, so let's investigate?

> @mobrovac So, I've found a way to get the original page_id for the page (see the Gerrit patch). On one hand, I agree that firing an event for a deleted page might not be useful, but just discarding it seems a bit scary. We are not really sure what properties might've been changed and whether it's 100% safe to discard it. I'm on the fence here.

Oh, cool @Pchelolo ! But still, I think we need to answer the above question before any action is taken.

>> What if the page is deleted, the properties are changed, and then later the page is restored?
>
> Yes, that is my question too: will the page props event still affect it in MW or not? We don't seem to know for sure, so let's investigate?

Hm, it seems the action will be taken: I don't see any code that prevents the DB writes from happening if the page was deleted, though it's not easy to test.

Milimetric moved this task from Incoming to Radar on the Analytics board. Feb 23 2017, 4:42 PM

> We are not really sure what properties might've been changed and whether it's 100% safe to discard it
>
> ...
>
> Hm, it seems the action will be taken

Even if it is safe to discard it, we probably shouldn't, especially if we can get the original page_id without much fuss. This may have implications for the MW History project if we ever decide to augment that dataset with live events. So yaaa, let's keep it! :)

Change 339024 merged by jenkins-bot:
[mediawiki/extensions/EventBus] Use LinksUpdate member for page_id

https://gerrit.wikimedia.org/r/339024

Patch merged. Moving to 'done' until it gets deployed.

Pchelolo closed this task as Resolved. Mar 30 2017, 2:35 PM

Deployed, verified, resolving.

Aklapper edited projects, added Analytics-Radar; removed Analytics. Jun 10 2020, 6:44 AM