
Monthly aggregate endpoint returns unexpected results and invalid timestamp
Closed, Resolved · Public

Description

Example per-article monthly data for Jan and Feb 2016:
https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/user/Barack_Obama/monthly/2016010100/2016030100

The response timestamps use the first day of the month in YYYYMMDDHH format, e.g. 2016010100 and 2016020100.

Example aggregate monthly data for Jan and Feb 2016:
https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/fr.wikipedia.org/all-access/user/monthly/2016010100/2016030100

The response returns invalid timestamps with zero as the day of the month: 2016010000 and 2016020000. The response timestamps should match what the request accepts as valid. For example, if I use 2016010000 and 2016020000 in the request, I get an error:
https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/fr.wikipedia.org/all-access/user/monthly/2016010000/2016020000
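
To make the validity point concrete, here is a minimal sketch of a check for YYYYMMDDHH timestamps. The function name `is_valid_timestamp` is hypothetical, and this is not the API's actual validation code. It only shows that a zero day of the month cannot be a real calendar date.

```python
from datetime import datetime


def is_valid_timestamp(ts: str) -> bool:
    """Return True if ts is a real calendar date and hour in YYYYMMDDHH form.

    Hypothetical helper for illustration; not taken from the API source.
    """
    if len(ts) != 10 or not ts.isdigit():
        return False
    try:
        # datetime() raises ValueError for impossible values such as day 0.
        datetime(int(ts[0:4]), int(ts[4:6]), int(ts[6:8]), int(ts[8:10]))
    except ValueError:
        return False
    return True
```

With this check, 2016010100 passes and 2016010000 does not, which matches the behaviour seen when 2016010000 is used in a request.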

Event Timeline

Restricted Application added a subscriber: Aklapper.
MusikAnimal renamed this task from Invalid timestamp returned from monthly aggregate endpoint to Monthly aggregate endpoint returns invalid timestamp. Jan 25 2017, 10:04 PM

And actually, it looks like the aggregate monthly endpoint is also returning the wrong months. I used January 1st through March 1st to get data for January and February. Here's that on per-article:
https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/user/Barack_Obama/monthly/2016010100/2016030100

But for aggregate I get February and March:
https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/fr.wikipedia.org/all-access/user/monthly/2016010100/2016030100
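
For anyone reproducing the comparison, the two request URLs can be built with a small helper. This is only a sketch: the function names and parameter names are hypothetical, and the path layout is copied from the example URLs in this task rather than from the API specification.

```python
from urllib.parse import quote

# Base path taken from the example URLs in this task.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews"


def per_article_url(project, access, agent, article, granularity, start, end):
    """Build a per-article request URL (hypothetical helper)."""
    title = quote(article, safe="")  # percent-encode the article title
    return f"{BASE}/per-article/{project}/{access}/{agent}/{title}/{granularity}/{start}/{end}"


def aggregate_url(project, access, agent, granularity, start, end):
    """Build an aggregate request URL (hypothetical helper)."""
    return f"{BASE}/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end}"


# Same date range on both endpoints, as in the comparison above.
print(per_article_url("en.wikipedia", "all-access", "user", "Barack_Obama",
                      "monthly", "2016010100", "2016030100"))
print(aggregate_url("fr.wikipedia.org", "all-access", "user",
                    "monthly", "2016010100", "2016030100"))
```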

MusikAnimal renamed this task from Monthly aggregate endpoint returns invalid timestamp to Monthly aggregate endpoint returns unexpected results and invalid timestamp. Jan 25 2017, 10:12 PM

Thanks for the ping. I will look into it, hopefully next week.

The first thing I'll do is load test data on beta so we can actually test the fix.

Let's translate: 201001000000 (not really a date) to 201001010000

  • create a new namespace
  • load the table in the new namespace
  • swap the namespace on AQS
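
The timestamp translation mentioned above could look roughly like the sketch below. The function name `pad_day` is hypothetical, and it assumes the day of the month sits at character positions 6 and 7 of the string. The actual fix was made in the loading jobs and AQS, not with this code.

```python
def pad_day(ts: str) -> str:
    """Replace a zero day-of-month with '01'; return other values unchanged.

    Hypothetical helper: assumes a YYYYMMDD... layout, so the day field
    occupies characters 6-7 regardless of how many digits follow.
    """
    if len(ts) >= 8 and ts[6:8] == "00":
        return ts[:6] + "01" + ts[8:]
    return ts


print(pad_day("201001000000"))  # the example from this comment
print(pad_day("2016010000"))    # the YYYYMMDDHH form returned by the API
```

Timestamps that already carry a real day of the month pass through untouched, so the function is safe to run over mixed data.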

Change 338898 had a related patch set uploaded (by Fdans):
Add secondary sys endpoint to populate Cassandra with correct timestamps

https://gerrit.wikimedia.org/r/338898

Change 338898 merged by Fdans:
Add secondary table endpoint to populate Cassandra with correct timestamps

https://gerrit.wikimedia.org/r/338898

Change 340093 had a related patch set uploaded (by Fdans):
Use v2 table in Cassandra, switch to padded day timestamp

https://gerrit.wikimedia.org/r/340093

Change 340093 merged by Joal:
[analytics/refinery] Use v2 table in Cassandra, switch to padded day timestamp

https://gerrit.wikimedia.org/r/340093

Change 342205 had a related patch set uploaded (by Fdans):
[analytics/refinery] Change keyspace name in per project cassandra oozie job

https://gerrit.wikimedia.org/r/342205

Change 342486 had a related patch set uploaded (by Fdans):
[analytics/aqs] Switch table pointer to v2 in per project endpoint

https://gerrit.wikimedia.org/r/342486

Change 342486 merged by Fdans:
[analytics/aqs] Switch table pointer to v2 in per project endpoint

https://gerrit.wikimedia.org/r/342486

Change 342876 had a related patch set uploaded (by Fdans):
[analytics/aqs/deploy] Change keyspace name to project_v2 in fake data script

https://gerrit.wikimedia.org/r/342876

Change 342876 merged by Joal:
[analytics/aqs/deploy] Change keyspace name to project_v2 in fake data script

https://gerrit.wikimedia.org/r/342876

Change 342205 merged by Joal:
[analytics/refinery] Change keyspace name in per project cassandra oozie job

https://gerrit.wikimedia.org/r/342205

@MusikAnimal: we are still working on the Hive code that loads data with the appropriate dates, but on your end the date issue should now be fixed. Please check. Note that for monthly data the dates need to be inclusive, so https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/fr.wikipedia.org/all-access/user/monthly/2016010100/2016030100 will not return data for March.