
Wikistats Bug differing view numbers
Closed, Resolved (Public)

Description

Monthly pageviews give strongly different results, depending on where I look.

  1. https://stats.wikimedia.org/#/sw.wikipedia.org/reading/total-page-views/normal|bar|2016-02-29~2021-11-08|~total|monthly shows a total of 6.265.709 for Oct 2021
  2. https://pageviews.toolforge.org/siteviews/?platform=all-access&source=pageviews&agent=user&start=2021-10&end=2021-10&sites=sw.wikipedia.org shows sw.wikipedia.org · 2021-10-01 - 2021-10-31 · 1.848.591 pageviews
  3. When I go via https://stats.wikimedia.org/#/sw.wikipedia.org/reading/page-views-by-country/normal|table|last-month|(access)~desktop*mobile-app*mobile-web|monthly and add up the numbers in the table, I get 1.9 million views, which corresponds to 2.), although I am on the same site as in 1.), which gives the much larger figure.

Is it a mistake or just a matter of better labeling?

Event Timeline

JAllemandou subscribed.

Hi @Kipala, thanks for reporting :)
The numbers in 1) are higher because they include user, spider and automated traffic, while the numbers in 2) count user traffic only.
You can filter the numbers in the stats.wikimedia.org UI with the 'Agent Type' filter (see https://stats.wikimedia.org/#/sw.wikipedia.org/reading/total-page-views/normal|bar|2016-02-29~2021-11-08|agent~user|monthly).
Finally, the numbers in 3) are close to 2) because they also contain user traffic only, but they differ because they are approximations of the real values (each rounded up to the nearest hundred). You can read more here: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Geoeditors/Public#No_exact_counts.
I'm going to close this task as invalid; feel free to reopen as needed :)
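
To make the two effects described above concrete, here is a minimal Python sketch. It is not taken from Wikistats or the Pageviews tool. The two October 2021 totals are the figures quoted in the task description. The per-country counts are invented for illustration, and the helper names `round_up_to_hundred` and `non_user_share` are hypothetical.

```python
def round_up_to_hundred(count: int) -> int:
    """Round a non-negative count up to the next multiple of 100."""
    # Negating before and after floor division turns it into a ceiling division.
    return -(-count // 100) * 100


def non_user_share(all_agents: int, user_only: int) -> float:
    """Fraction of all-agent views that are not attributed to users."""
    return (all_agents - user_only) / all_agents


# Figures quoted in the task description for sw.wikipedia.org, October 2021.
ALL_AGENTS_OCT_2021 = 6_265_709  # item 1: user + spider + automated
USER_ONLY_OCT_2021 = 1_848_591   # item 2: agent=user

# Hypothetical per-country user counts, invented for illustration only.
example_country_counts = [1_234, 56_701, 300, 99]

exact_total = sum(example_country_counts)
rounded_total = sum(round_up_to_hundred(c) for c in example_country_counts)

if __name__ == "__main__":
    gap = ALL_AGENTS_OCT_2021 - USER_ONLY_OCT_2021
    share = non_user_share(ALL_AGENTS_OCT_2021, USER_ONLY_OCT_2021)
    print(f"non-user views: {gap} ({share:.1%} of all agents)")
    print(f"exact per-country sum:   {exact_total}")
    print(f"rounded per-country sum: {rounded_total}")
```

Because every row is rounded up, the sum of a rounded table can only match or exceed the exact user total, by fewer than 100 views per row. That is consistent with the table in 3) adding up to slightly more than the figure in 2).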

Hi JAllemandou,

I suspected something like this. In that case the page looks badly
labelled to me. It should say on the page what you just told me
("include user, spider and automated traffic").

But the explanation on the page sounds like the opposite: "Page views on
Wikimedia projects count the viewing of article content. In this data we
try to separate bot traffic and focus on human user page views."

So this explanation should definitely be corrected. Can you
accept that?

Cheers

Ingo

Change 737430 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/wikistats2@master] Correct pageview data description

https://gerrit.wikimedia.org/r/737430

Thanks for pointing this out, @Kipala, I definitely agree with you the description was wrong. Would you mind taking a look at this proposal and letting us know if it's clear:

https://gerrit.wikimedia.org/r/c/analytics/wikistats2/+/737430/1/src/i18n/en.json

Looks good! Thanks.

I would appreciate it if each pageview page could state clearly what it
shows. I stumbled over the fact that "total page views" showed a
high number, but "page views by country" a lower number,
so what about "top viewed articles"?

Cheers

Ingo

Change 737430 merged by jenkins-bot:

[analytics/wikistats2@master] Correct pageview data description

https://gerrit.wikimedia.org/r/737430

Change 742764 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/wikistats2@master] Link to AQS documentation instead of Research page

https://gerrit.wikimedia.org/r/742764

Change 742764 merged by Razzi:

[analytics/wikistats2@master] Link to AQS documentation instead of Research page

https://gerrit.wikimedia.org/r/742764

Verified that the changes are now live. @Kipala, the approach we decided to go for was to link directly to the metric definitions. Previously, the pageview detail pages were linking to the general Pageview definition page. With this change now live, they link to the relevant sections of the metric documentation page. We did it this way so as not to force lots of re-translation of the descriptions. When we have some more time to put into this, we'll draft much better descriptions and send those for translation. Let me know what you think, and if there's anything else that needs clearing up. The description of the total pageviews metric was changed though, since that was super confusing.

Thanks for that. Now it takes a bit more work to get an idea, but it is
not misleading any more.

Thanks!!

Kipala

Milimetric moved this task from Ready to Deploy to Done on the
Data-Engineering-Kanban board.