Massviews export data is incorrect (both CSV and JSON)
Closed, Resolved (Public)

Description

The Massviews export mechanism, in both CSV and JSON formats, produces incorrect tabular data. The data is "left aligned" within the table: empty cells are removed, and non-empty cells are exported in consecutive order, with no gaps marking the empty cells.

For example, let's assume we have two articles, A and B, where A was created on January 1st and B was created on January 3rd. The pageviews data for both articles might be something like:

Title  Jan 01  Jan 02  Jan 03  Jan 04  Jan 05  Jan 06  ...
Article A  05810126...
Article B  01274...

where "–" means "no data available", that is, an empty cell in the table, which is effectively "0". However, the exported CSV/JSON data is as follows:

Title  Jan 01  Jan 02  Jan 03  Jan 04  Jan 05  Jan 06  ...
Article A  05810126...
Article B  01274...

The problem is more severe when the pageviews API returns no data for a certain date. Instead of

Title  Jan 01  Jan 02  Jan 03  Jan 04  Jan 05  Jan 06  ...
Article A  058106...
Article B  01274...

we get the following table:

Title  Jan 01  Jan 02  Jan 03  Jan 04  Jan 05  Jan 06  ...
Article A  058106...
Article B  01274...
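
To make the expected behaviour concrete, here is a minimal Python sketch of an export that keeps every value in its own date column. It is not the Massviews code: the function names, the year, and the sample numbers are all made up for illustration. The idea is to look up each date in the range explicitly, so a missing value becomes an empty cell instead of being dropped.

```python
import csv
import io
from datetime import date, timedelta


def date_range(start, end):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def aligned_rows(views_by_title, start, end, missing=""):
    """Build a header row plus one row per title, one cell per date."""
    dates = list(date_range(start, end))
    rows = [["Title"] + [d.isoformat() for d in dates]]
    for title, views in views_by_title.items():
        # Look each date up explicitly so a gap stays in its own column
        # rather than letting later values slide to the left.
        rows.append([title] + [views.get(d, missing) for d in dates])
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data: Article B has no data before January 3rd.
views = {
    "Article A": {
        date(2017, 1, 1): 3,
        date(2017, 1, 2): 5,
        date(2017, 1, 3): 8,
        date(2017, 1, 4): 10,
    },
    "Article B": {
        date(2017, 1, 3): 1,
        date(2017, 1, 4): 2,
    },
}
rows = aligned_rows(views, date(2017, 1, 1), date(2017, 1, 4))
print(to_csv(rows), end="")
```

With this sample data the Article B line is `Article B,,,1,2`: the two leading empty cells are preserved, so the values 1 and 2 stay under January 3rd and 4th.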

Event Timeline

MusikAnimal claimed this task.
MusikAnimal moved this task from Backlog to Done on the Tool-Pageviews board.
MusikAnimal subscribed.

Thanks for the report! This should now be fixed, and also for Langviews, Redirect Views and Userviews. Let me know if you have any other issues.

I should clarify: for older dates, if the data is not available, a zero is now filled in, which is a safe assumption due to the way the API works. However, any blank values toward the end of the time series may be blank because the data is not available yet, so you may not necessarily want to treat them as 0. This issue can normally be avoided by querying for time ranges that end no later than, say, two days before the present date.
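
For anyone post-processing the export, a rough Python sketch of that rule might look like the following. It is only an illustration, not the tool's actual code: `fill_missing` and `safe_end_date` are hypothetical names, and treating "older" as "before the last day that has a value" is an assumption made for this example.

```python
from datetime import date, timedelta


def fill_missing(series):
    """Zero-fill gaps before the last known value; keep trailing gaps as None."""
    # Index of the last day that actually has data, or -1 if there is none.
    last_known = max(
        (i for i, value in enumerate(series) if value is not None),
        default=-1,
    )
    return [
        0 if value is None and index < last_known else value
        for index, value in enumerate(series)
    ]


def safe_end_date(today, lag_days=2):
    """Pick an end date far enough back that the data should already exist."""
    # The two-day default follows the suggestion in the comment above.
    return today - timedelta(days=lag_days)


filled = fill_missing([None, None, 4, None, 7, None, None])
print(filled)
```

Here the gaps before the 7 become zeros, while the two trailing blanks stay `None`, so the caller can decide whether they mean "no views" or "not published yet".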

Thank you @MusikAnimal for the fast fix!

BTW, I found the bug when I was trying to calculate weekly stats for a set of articles. Most articles have a weekly reading cycle, and I wanted to calculate a weekly trend line. It might be useful to provide weekly/monthly statistics in addition to the daily statistics.

I could add monthly stats. Weekly I'd have to compute client-side, which I guess I could also do... but then I should add it to the main Pageviews app too. I'll create some tickets for this. Thanks again!
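
Until something like that exists in the tools, weekly and monthly totals can be derived from the daily export. The Python sketch below is one possible way to do it, assuming every day in the input already has an integer count. The function names and sample numbers are hypothetical, and bucketing by ISO week (Monday to Sunday) is a choice made for this example.

```python
from collections import defaultdict
from datetime import date


def weekly_totals(daily):
    """Sum daily views into ISO weeks, keyed by (ISO year, ISO week)."""
    totals = defaultdict(int)
    for day, views in daily.items():
        iso_year, iso_week, _ = day.isocalendar()
        totals[(iso_year, iso_week)] += views
    return dict(totals)


def monthly_totals(daily):
    """Sum daily views into calendar months, keyed by (year, month)."""
    totals = defaultdict(int)
    for day, views in daily.items():
        totals[(day.year, day.month)] += views
    return dict(totals)


# Hypothetical daily counts for one article.
daily = {
    date(2017, 1, 2): 5,  # Monday, ISO week 1
    date(2017, 1, 8): 3,  # Sunday, still ISO week 1
    date(2017, 1, 9): 4,  # Monday, ISO week 2
    date(2017, 2, 1): 2,  # Wednesday, ISO week 5
}
weekly = weekly_totals(daily)
monthly = monthly_totals(daily)
print(weekly)
print(monthly)
```

Keying weeks by ISO year and week number keeps a week that spans New Year in a single bucket, which matters for a weekly trend line.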