I'm noticing discrepancies between our average daily pageviews figures and those from Pageview Analysis. The discrepancies appear in both the “Avg. daily views to files uploaded” and the “Avg. daily pageviews” metrics.
The errors seem to be caused by the system using the wrong divisors to get averages. There may, however, be multiple errors in play, as the somewhat contradictory examples below suggest.
Examples
Avg. daily pageviews, Q61506256
The Pageview Analysis graph gives a clue as to what may be happening. If you look at it, pageviews occurred on only 3 of the last 30 days. The grand total of 14 pvs / 3 = 4 (ish). So it appears possible that we are dividing not by the length of the entire period but only by the number of non-zero days.
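To make the two candidate divisors concrete, here is a small sketch using only the totals quoted above (14 pageviews, 3 non-zero days, a 30-day period). The variable names are mine, not anything from the codebase.

```python
# Totals taken from the Q61506256 example in this report.
total_views = 14
nonzero_days = 3
period_days = 30

# What we seem to be doing: divide by the days that had any views.
avg_over_nonzero_days = total_views / nonzero_days   # about 4.67

# What the metric should do: divide by the whole period.
avg_over_full_period = total_views / period_days     # about 0.47

print(round(avg_over_nonzero_days, 2), round(avg_over_full_period, 2))
```

The two results differ by a factor of ten, which is why the choice of divisor matters so much for low-traffic pages.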
Avg. daily views to files uploaded, for event Baldwin
"Baldwin" is a one-minute event with 1 uploaded file.
- 'Avg. daily views to files uploaded' in our Event Summary report is 114
The uploaded file is placed on two pages.
- List of schools in Patna has a Pageview Analysis daily avg of 113.
- Baldwin Academy has a Pageview Analysis average of 12 if you divide by the 4 days from page creation on 3/1 to the report. But if you count back 30 days, including the 26 days before the page's creation, then the daily avg. is 2.
113 + 2 = 115, which is pretty close to the Event Metrics figure of 114.
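The arithmetic behind that comparison can be checked with a short sketch. One assumption to flag: the Baldwin Academy total is back-computed from the rounded 4-day average (12 x 4), so it is approximate.

```python
# Figures quoted in this report.
patna_avg = 113          # "List of schools in Patna", Pageview Analysis daily avg
baldwin_avg_4_days = 12  # "Baldwin Academy", averaged over the 4 days it has existed
days_existed = 4
period_days = 30

# Assumption: total views back-computed from the rounded 4-day average.
baldwin_total = baldwin_avg_4_days * days_existed          # about 48

# Dividing the same total by the full 30 days instead.
baldwin_avg_30_days = round(baldwin_total / period_days)   # 48 / 30 = 1.6, rounds to 2

sum_with_30_day_divisor = patna_avg + baldwin_avg_30_days  # 115
sum_with_4_day_divisor = patna_avg + baldwin_avg_4_days    # 125

print(sum_with_30_day_divisor, sum_with_4_day_divisor)
```

Only the 30-day divisor lands near the reported 114, which is what suggests the page-creation date is being ignored.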
In conclusion...
As noted above, it looks like the errors are caused by using the wrong divisor to calculate averages. But the particulars of each case are quite different:
- In the "Avg. daily pageviews" case, the method appears to be dividing only by the number of non-zero days.
- In the "Avg. daily views to files uploaded" case, it looks like the problem is that the method doesn't recognize the page-creation date; it's dividing by the full 30 days, including many days of 0 pageviews before the page existed. On the surface that looks like the opposite of the example above. (Of course, these examples could be red herrings, and the cause could be something completely different.)
What should happen
What should happen, for the record, is that for each unique article:
- If the article has existed 30 days or more
  - Fetch daily pageviews from the present back 30 days.
  - Total those daily numbers.
  - Divide by 30 to get the daily average.
- If the article has existed < 30 days
  - Determine how many days the article has existed.
  - Fetch daily pageviews from the present back that many days.
  - Total those daily numbers.
  - Divide by the number of days the article has existed to get the daily average.
Then, depending on what metric we're reporting, we either total all the daily averages to get the Event Summary grand total, or report individual daily averages for individual Pages Created or Pages Improved.
- If the article no longer exists, I'm not sure what is possible or whether you can detect this.
  - If you can figure out the page is deleted, don't include that page in calculations at all. Or, for metrics that give individual article results, report the average as "n/a" for not applicable.
  - If you can't, then just calculate the page average as above and over time the number will diminish. That's fine.
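For reference, the rules in this section could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Event Metrics code: the function names, the `daily_views` mapping, and the `deleted` flag are all hypothetical, days missing from the data are treated as 0 views, and "days existed" is taken as the difference between the report date and the creation date.

```python
from datetime import date, timedelta
from typing import Iterable, Mapping, Optional

WINDOW_DAYS = 30  # maximum look-back period described above


def average_daily_pageviews(
    daily_views: Mapping[date, int],  # hypothetical input: views keyed by day
    created: date,
    today: date,
    deleted: bool = False,
) -> Optional[float]:
    """Return the daily average, or None ("n/a") for a page known to be deleted."""
    if deleted:
        return None
    days_existed = (today - created).days
    # Divide by the page's age, capped at 30 days; at least 1 to avoid dividing by 0.
    divisor = max(1, min(WINDOW_DAYS, days_existed))
    start = today - timedelta(days=divisor)
    # Days with no recorded views simply contribute 0 to the total.
    total = sum(v for day, v in daily_views.items() if start <= day < today)
    return total / divisor


def event_summary_total(averages: Iterable[Optional[float]]) -> float:
    """Grand total for the Event Summary: sum of per-page averages, skipping n/a."""
    return sum(a for a in averages if a is not None)
```

The important design point is that the divisor comes from the page's age and the window length, never from the number of days that happen to have views.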