Provide MediaWiki timestamps in Hive-refined EventLogging tables via UDF
Closed, Resolved · Public · 3 Estimated Story Points

Description

EventLogging data, as it has most commonly been used in recent years (i.e. in the form of MySQL/MariaDB tables), contains a human-readable timestamp field in the MediaWiki format (yyyymmddhhmmss). As discussed in T179540: Timestamp format in Hive-refined EventLogging tables is incompatible with MySQL version, the new Hive-refined EL data ended up using epoch timestamps instead. For backward compatibility and to facilitate joins, we still need to be able to use MediaWiki timestamps too.
In T179540#3742635, @Ottomata proposed that the Analytics Engineering team could provide a UDF for this, "something along the lines of SELECT MediawikiTimestamp(dt) ...."
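For illustration only, the conversion such a UDF would perform can be sketched in a few lines of Python. This is not the refinery-hive implementation: the function name `mediawiki_timestamp` is hypothetical, and the sketch assumes the input is an epoch value in whole seconds and that the output should be rendered in UTC.

```python
from datetime import datetime, timezone


def mediawiki_timestamp(epoch_seconds):
    """Format an epoch value as a MediaWiki timestamp (yyyymmddhhmmss).

    Assumptions (not confirmed by the task): the input is in seconds,
    and the result is expressed in UTC. None is passed through so that
    NULL column values stay NULL.
    """
    if epoch_seconds is None:
        return None
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y%m%d%H%M%S")


# Example: 1517851800 is Feb 5 2018, 17:30:00 UTC.
print(mediawiki_timestamp(1517851800))  # prints 20180205173000
```

Because the result is a fixed-width, zero-padded string, it sorts chronologically and can be compared directly against the timestamp field of the older MySQL/MariaDB tables when joining.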

Event Timeline

Restricted Application added a subscriber: Aklapper.
Nuria renamed this task from Provide MediaWiki timestamps in Hive-refined EventLogging tables to Provide MediaWiki timestamps in Hive-refined EventLogging tables via UDF. Feb 5 2018, 5:30 PM
Nuria assigned this task to fdans.
Nuria moved this task from Incoming to Wikistats on the Analytics board.
Nuria moved this task from Wikistats to Operational Excellence Future on the Analytics board.
Nuria edited projects, added Analytics-Kanban; removed Analytics.
Nuria subscribed.

Maybe @fdans can work on this one after the launch of the maps on wikistats?

Change 408567 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery/source@master] Add GetMediawikiTimestampUDF to refinery-hive

https://gerrit.wikimedia.org/r/408567

JAllemandou set the point value for this task to 3.
JAllemandou added a subscriber: fdans.

Change 408567 merged by jenkins-bot:
[analytics/refinery/source@master] Add GetMediawikiTimestampUDF to refinery-hive

https://gerrit.wikimedia.org/r/408567