Wikistats Bug differing view numbers
Open, Needs Triage, Public

Description

Monthly pageviews give strongly different results, depending on where I look.

  1. https://stats.wikimedia.org/#/sw.wikipedia.org/reading/total-page-views/normal|bar|2016-02-29~2021-11-08|~total|monthly shows a total of 6,265,709 for Oct 2021.
  2. https://pageviews.toolforge.org/siteviews/?platform=all-access&source=pageviews&agent=user&start=2021-10&end=2021-10&sites=sw.wikipedia.org shows sw.wikipedia.org · 2021-10-01 - 2021-10-31 · 1,848,591 pageviews.
  3. When I go via https://stats.wikimedia.org/#/sw.wikipedia.org/reading/page-views-by-country/normal|table|last-month|(access)~desktop*mobile-app*mobile-web|monthly and add up the numbers in the table, I get 1.9 million views, which corresponds to 2), although I am at the same address as in 1), which gives the much larger figure.

Is it a mistake or just a matter of better labeling?

Event Timeline

JAllemandou added a subscriber: JAllemandou.

Hi @Kipala, thanks for reporting :)
The numbers in 1) are different because they include user, spider and automated traffic, while the numbers in 2) only contain user traffic.
You can filter the numbers in the stats.wikimedia.org UI by using the 'Agent Type' filtering (see https://stats.wikimedia.org/#/sw.wikipedia.org/reading/total-page-views/normal|bar|2016-02-29~2021-11-08|agent~user|monthly).
Finally, the numbers in 3) are close to 2) because they contain user traffic only, but they differ because they are an approximation of the real value (rounded up to the nearest hundred). You can read some more here: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Geoeditors/Public#No_exact_counts.
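As a rough illustration of why summing a rounded table overshoots the exact total, here is a minimal sketch. The per-country values are made up for the example, and `round_up_to_hundred` is a hypothetical helper, not code from Wikistats.

```python
def round_up_to_hundred(n: int) -> int:
    """Round a non-negative count up to the next multiple of 100."""
    return -(-n // 100) * 100


# Hypothetical per-country view counts (not real data).
exact = [1204, 350, 97, 12]
rounded = [round_up_to_hundred(v) for v in exact]

print(rounded)                   # [1300, 400, 100, 100]
print(sum(exact), sum(rounded))  # 1663 1900
```

Every row is rounded up, so the sum of the table can only be equal to or larger than the exact total, and the gap grows with the number of rows.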
I'm gonna close this task as invalid, feel free to reopen as needed :)
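For anyone who wants to compare the two figures outside the UI, the agent-type distinction above can be sketched as request URLs against the public pageviews API. This is only a sketch: the endpoint path, the agent values and the `YYYYMMDDHH` timestamp format reflect my understanding of the AQS pageviews API and should be checked against its documentation. The helper names `aggregate_url` and `fetch_total_views` are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Assumed base path of the AQS "aggregate pageviews" endpoint.
AQS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate"
# Assumed agent values; "all-agents" includes user, spider and automated traffic.
AGENTS = ("all-agents", "user", "spider", "automated")


def aggregate_url(project, agent, start, end,
                  access="all-access", granularity="monthly"):
    """Build the request URL for one project and one agent type."""
    if agent not in AGENTS:
        raise ValueError(f"unknown agent type: {agent!r}")
    return "/".join([AQS_BASE, project, access, agent, granularity, start, end])


def fetch_total_views(url, user_agent="pageview-compare-sketch/0.1"):
    """Fetch the URL and sum the 'views' field of the returned items."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request) as response:
        payload = json.load(response)
    return sum(item["views"] for item in payload["items"])


# October 2021 for sw.wikipedia.org, once for all traffic and once for users only.
all_url = aggregate_url("sw.wikipedia.org", "all-agents", "2021100100", "2021103100")
user_url = aggregate_url("sw.wikipedia.org", "user", "2021100100", "2021103100")
print(all_url)
print(user_url)
```

Calling `fetch_total_views(all_url)` and `fetch_total_views(user_url)` (network access required) should then show the same kind of gap as between 1) and 2).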

Hi JAllemandou,

I suspected something like this. In that case it looks badly labelled to me. The page should say what you just told me ("include user, spider and automated traffic").

But the explanation on the page sounds like the opposite: "Page views on Wikimedia projects count the viewing of article content. In this data we try to separate bot traffic and focus on human user page views."

So this explanation line should definitely be corrected. Can you accept that?

Cheers

Ingo

On 08/11/2021 at 19:37, JAllemandou wrote:

View Task https://phabricator.wikimedia.org/T295298
JAllemandou closed this task as "Invalid".
JAllemandou added a comment.

Change 737430 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/wikistats2@master] Correct pageview data description

https://gerrit.wikimedia.org/r/737430

Thanks for pointing this out, @Kipala, I definitely agree with you the description was wrong. Would you mind taking a look at this proposal and letting us know if it's clear:

https://gerrit.wikimedia.org/r/c/analytics/wikistats2/+/737430/1/src/i18n/en.json

Looks good! Thanks.

I would appreciate it if each pageview page stated clearly what it shows. I stumbled over the fact that "total page views" showed a high number, but "page views by country" a lower one.

So what about "top viewed articles"?

Cheers

Ingo

On 08/11/2021 at 20:44, Milimetric wrote:

View Task https://phabricator.wikimedia.org/T295298
Milimetric added a comment.

Change 737430 merged by jenkins-bot:

[analytics/wikistats2@master] Correct pageview data description

https://gerrit.wikimedia.org/r/737430

Change 742764 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/wikistats2@master] Link to AQS documentation instead of Research page

https://gerrit.wikimedia.org/r/742764

Change 742764 merged by Razzi:

[analytics/wikistats2@master] Link to AQS documentation instead of Research page

https://gerrit.wikimedia.org/r/742764