
A few alterblocks events have event_timestamps from before 2001
Open, Medium, Public


mediawiki_history includes 70 events with timestamps before 2001, all of them alterblocks events. For a full list, see this spreadsheet. In addition to the incorrect timestamps, the events all have null event_user_id and event_user_text (even though this should be filled in with the details of the blocking user) and null event_comment (even though many of the blocks had comments).

All of the users concerned seem to have a weird block of one of two types:

Event Timeline

@matmarex mentioned that this is probably related to the fact that you can provide a string as the block length and PHP's date math does something weird with it

Some examples I was thinking about when I said that:

Garbage input to date parsing functions can result in blocks with expirations in the past (usually around 1970 ;) ) or millennia in the future.
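To illustrate how an expiration can land around 1970, here is a minimal Python sketch (not MediaWiki's PHP code). It assumes a hypothetical parser, `parse_expiry`, that signals failure with `None`, and a caller that silently treats that failure as Unix timestamp 0. Both functions are invented for illustration; the real code path may differ.

```python
from datetime import datetime, timezone


def parse_expiry(text):
    """Hypothetical parser: return a Unix timestamp, or None if text is unparseable."""
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return None
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def expiry_as_datetime(text):
    """Illustrates the failure mode: a failed parse is coerced to timestamp 0."""
    ts = parse_expiry(text)
    # `ts or 0` turns None into 0, which is the Unix epoch (1970-01-01 UTC).
    return datetime.fromtimestamp(ts or 0, tz=timezone.utc)


if __name__ == "__main__":
    print(expiry_as_datetime("2019-03-25"))    # a valid date round-trips
    print(expiry_as_datetime("a long while"))  # garbage input collapses to the epoch
```

A stricter caller would check for `None` explicitly and reject the input instead of coercing it.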

I don't know what kind of issues this is causing for you, but I would advise not worrying about it :)

(Your spreadsheet is not accessible btw)

Thank you for the extra context!

I don't know what kind of issues this is causing for you, but I would advise not worrying about it :)

Well, there's no issue with the expiration times being weird and long or even in the past (which is just reality); it's that those weird expiration times seem to cause this dataset to incorrectly state the time when the block was applied. And indeed, the number of affected rows is very small, but looking at it might illuminate other issues in the data generation process (not to mention save data users from having to constantly add where event_timestamp > "2001" to queries 🙂).
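In the meantime, the workaround filter can be sketched in Python for anyone working with exported rows. This is only a sketch under assumptions: rows are plain dicts, `event_timestamp` is a string that sorts chronologically (for example `YYYY-MM-DD HH:MM:SS`), and the sample rows in the usage block are invented for illustration, not taken from the dataset.

```python
def is_suspect(event, cutoff="2001"):
    """True if the event has a timestamp that sorts before the cutoff year."""
    ts = event.get("event_timestamp")
    # Rows without a timestamp are not flagged here; note that a SQL
    # `event_timestamp > "2001"` predicate would drop them instead.
    return ts is not None and ts < cutoff


def split_events(events, cutoff="2001"):
    """Partition events into (clean, suspect) lists by timestamp."""
    clean = [e for e in events if not is_suspect(e, cutoff)]
    suspect = [e for e in events if is_suspect(e, cutoff)]
    return clean, suspect


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    rows = [
        {"event_type": "alterblocks", "event_timestamp": "1970-01-01 00:00:00"},
        {"event_type": "alterblocks", "event_timestamp": "2019-03-25 15:59:00"},
    ]
    clean, suspect = split_events(rows)
    print(len(clean), "clean;", len(suspect), "suspect")
```

Keeping the suspect rows in a separate list, rather than discarding them, makes it easier to inspect them for the null user and comment fields described in the task.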

(Your spreadsheet is not accessible btw)

Fixed, sorry!

Thanks @Neil_P._Quinn_WMF and @matmarex :)
This error is indeed related to wrong filtering of weird block-expirations.
A patch fixing this and more should come in the relatively near future, as there are other issues with user-alterblocks events.

fdans triaged this task as Medium priority. Mar 25 2019, 3:59 PM
fdans moved this task from Incoming to Data Quality on the Analytics board.

I haven't had time to fix this as part of this batch of changes. Keeping it in the backlog of things to do for mediawiki_history.