
Manipulation of pageview statistics
Open, Needs Triage · Public

Description

For the past month we have seen unusual results in the pageview statistics of the German Wikipedia. A moderately well-known musician and two of his projects rank high in the Top 10, which is not really plausible: [[de:Tobias Sammet]], [[de:Avantasia]], [[de:Edguy]]

This is likely bot-based manipulation. Because the statistics are displayed in the mobile apps, inflating them could be used to promote a topic.

The articles could be excluded from the statistics; perhaps there are also countermeasures that directly target the bots. Alternatively, the display of the most frequently viewed articles could be removed from the app in order to combat this manipulation.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Sep 16 2019, 11:43 AM

We also had this issue before; see [[de:Anthocyane]] and [[de:Diskussion:Anthocyane#Aufrufzahlen]].

Aklapper removed Superbass as the assignee of this task. · Sep 16 2019, 12:33 PM

Hi @Superbass, thanks for taking the time to report this and welcome to Wikimedia Phabricator!

Removing wikipedia.de as this does not seem to be about wikipedia.de but de.wikipedia.org (or MobileFrontend code) and resetting assignee as per https://www.mediawiki.org/wiki/How_to_report_a_bug

Please also provide a link to the "pageview statistics of the German Wikipedia" to make sure everybody is talking about the same thing. Thanks.

Superbass added a comment. · Edited · Sep 16 2019, 8:08 PM

The pageview statistics are here, with the mentioned articles at ranks 2, 3, and 4: https://tools.wmflabs.org/topviews/?project=de.wikipedia.org&platform=all-access&date=last-month&excludes=

MusikAnimal added a subscriber: MusikAnimal. · Edited · Sep 17 2019, 12:13 AM

I removed [[Formelsammlung Trigonometrie]] from Topviews as an obvious false positive, though I realize that wasn't reported in this task.

I am not sure about the three musician pages. The mobile/desktop ratio seems normal, so it will require further investigation of private data to confirm it is fake traffic. This is very tedious, but I can try to look into it when I have the time.

I think MobileFrontend was tagged because these pages are showing up in the app (which I think would be Wikipedia-Android-App-Backlog and/or Wikipedia-iOS-App-Backlog), but this isn't really the app's fault. The core issue is T123442: Pageview API: Better filtering of bot traffic on top enpoints. Topviews and the app are merely pulling data from there.
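As a rough illustration of the mobile/desktop check mentioned above, the share of mobile views for a page can be compared against a site-wide baseline. This is only a sketch under stated assumptions: the function names, the 0.15 tolerance, and the idea of a single baseline share are hypothetical and are not taken from Topviews or the Pageview API. The three inputs are meant to correspond to per-platform view counts (desktop, mobile web, mobile app).

```python
def mobile_share(desktop, mobile_web, mobile_app):
    """Return the fraction of all views that came from mobile platforms."""
    total = desktop + mobile_web + mobile_app
    if total == 0:
        raise ValueError("page has no views in this period")
    return (mobile_web + mobile_app) / total


def share_is_unusual(page_share, baseline_share, tolerance=0.15):
    """Flag a page whose mobile share deviates from the baseline.

    `tolerance` is an arbitrary, hypothetical cutoff chosen for illustration.
    """
    return abs(page_share - baseline_share) > tolerance
```

A normal-looking ratio, as in this case, does not rule out automated traffic; it only means this particular heuristic does not flag it.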

Thank you for your intervention and comment. I would recommend removing the three articles about musicians/bands as well. The pageviews of the three articles have been synchronized for weeks, and they are far too high given that these subjects have no corresponding presence in the news or charts.

https://tools.wmflabs.org/pageviews/?project=de.wikipedia.org&platform=all-access&agent=user&start=2019-06-17&end=2019-09-15&pages=Tobias_Sammet|Avantasia|Edguy
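The "synchronized" observation could be made more concrete by correlating the daily view counts of the pages against each other. The following is a minimal sketch, assuming the daily counts have already been exported as equal-length lists; the function names and the 0.95 threshold are hypothetical choices for illustration, not part of any Wikimedia tool.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def synchronized_pairs(series, threshold=0.95):
    """Return (page_a, page_b, r) for every pair with r >= threshold.

    `series` maps a page title to its list of daily view counts.
    `threshold` is an arbitrary cutoff chosen for illustration.
    """
    pairs = []
    for (a, xs), (b, ys) in combinations(sorted(series.items()), 2):
        r = pearson(xs, ys)
        if r >= threshold:
            pairs.append((a, b, r))
    return pairs
```

High correlation alone is not proof of manipulation, since related articles (a musician and his bands) can legitimately rise together after a release or tour announcement; it is one signal to weigh alongside the absence of news or chart presence.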

About the mobile frontend: I thought it would be an option to temporarily remove the "Most read on Wikipedia" section from the app, if it is such an easy target for manipulation. That was just a suggestion; I don't know if that's a good idea. Topviews, on the other hand, does not play a major role in the web interface.