
One-time pageview peaks
Closed, Declined (Public)

Description

On ruwiki we sometimes analyze various data about articles, and we have noticed several anomalous peaks in pageviews:

https://ru.wikipedia.org/wiki/Википедия:Форум/Архив/Новости/2019/01#Статистика_пиков_просмотров_статей_в_2018_году

For example, we cannot explain short peaks of 500k+ views in otherwise unpopular articles:

https://tools.wmflabs.org/pageviews/?project=ru.wikipedia.org&platform=all-access&agent=user&start=2018-08-07&end=2018-08-09&pages=Городское_поселение_Одинцово
https://tools.wmflabs.org/pageviews/?project=ru.wikipedia.org&platform=all-access&agent=user&start=2018-03-31&end=2018-04-02&pages=М_(фильм,_1931)

Is it possible to investigate what caused these peaks? Possible explanations include: a computer failure (could this be a wiki bug, e.g. views generated by some other article but counted for this one?); a bot attack (could this be an unusual use of Wikipedia services? did it happen within a short time, on the order of hours? did it come from infected devices?); interest from only one country (it is unlikely that these views came from a link on another site); or something else. Is this normal or not?

In previous years, such peaks were also seen in other articles.
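To make "one-time peak" concrete, here is a minimal sketch of how such days could be flagged in a daily view series. The function name `find_peaks`, the `factor` and `min_views` thresholds, and the sample numbers are all assumptions chosen for illustration; they are not taken from any Wikimedia tool or from the real data behind the links above.

```python
from statistics import median


def find_peaks(daily_views, factor=50, min_views=100_000):
    """Return (day, views) pairs that look like isolated peaks.

    A day is flagged when its views reach both `min_views` and
    `factor` times the median of the series. Both thresholds are
    arbitrary assumptions, not an established definition.
    """
    baseline = median(daily_views.values())
    # Guard against a zero median so the relative threshold stays meaningful.
    threshold = max(min_views, factor * max(baseline, 1))
    return [(day, views) for day, views in sorted(daily_views.items())
            if views >= threshold]


# Illustrative numbers only, not real pageview data.
sample = {"2018-08-07": 40, "2018-08-08": 520_000, "2018-08-09": 35}
peaks = find_peaks(sample)
```

With the sample above, only the middle day is flagged, since it is far above both the absolute floor and the median-based threshold.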


Another pattern is a sharp jump to ~300k+ views that then continues for a long period, in an article that had existed for a long time with a low number of views:

https://tools.wmflabs.org/pageviews/?project=ru.wikipedia.org&platform=all-access&agent=user&start=2018-01-01&end=2018-12-29&pages=Borderlands:_The_Pre-Sequel! <-- link with ! at the end

https://tools.wmflabs.org/pageviews/?project=ru.wikipedia.org&platform=all-access&agent=user&start=2018-01-01&end=2018-12-29&pages=Borderlands:_The_Pre-Sequel%21
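The second form of the link can be produced by percent-encoding the page title, so that a trailing "!" is not dropped when the URL is turned into a link. A small sketch using Python's standard `urllib.parse.quote`; the helper name `encode_title` is hypothetical, and keeping ":" unescaped is only a choice made here to match the link above:

```python
from urllib.parse import quote


def encode_title(title):
    """Percent-encode a page title for use in the `pages=` parameter.

    ":" is left as-is (safe=":") to match the link in this task;
    "!" is encoded as %21.
    """
    return quote(title, safe=":")


encoded = encode_title("Borderlands:_The_Pre-Sequel!")
```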

Event Timeline

Restricted Application added a subscriber: Aklapper.
Tbayer added subscribers: MusikAnimal, Tbayer.

Just so this isn't left without a reply any longer: WMF analysts usually don't have the capacity to prioritize thorough investigations of such isolated incidents, in particular if it looks unlikely that they indicate a more widespread underlying technical problem. Also, the two peaks mentioned in the task (from August 2018 and March/April 2018, respectively) are outside the range for which we would still have more detailed webrequest data to investigate. On the other hand, the public pageview data already makes it almost certain that this was artificial traffic (e.g. in the first case above, note that the peak was concentrated on mobile web only).
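The per-platform check mentioned above can be sketched roughly as follows. The URL layout follows the public Wikimedia Pageviews REST API (per-article endpoint); the helper names `per_article_url` and `access_shares` are hypothetical, and the counts in the example are made up for illustration rather than taken from the actual data. No request is sent here; the sketch only builds the URLs and computes shares from counts supplied by the caller.

```python
from urllib.parse import quote

API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, access, agent, title, start, end):
    """Build a daily per-article pageviews URL.

    `access` is one of all-access, desktop, mobile-app, mobile-web;
    `start` and `end` are YYYYMMDD strings.
    """
    # Titles use underscores for spaces and must be fully percent-encoded.
    article = quote(title.replace(" ", "_"), safe="")
    return f"{API_ROOT}/{project}/{access}/{agent}/{article}/daily/{start}/{end}"


def access_shares(views_by_access):
    """Return each access method's fraction of the total views."""
    total = sum(views_by_access.values())
    if total == 0:
        return {name: 0.0 for name in views_by_access}
    return {name: views / total for name, views in views_by_access.items()}


url = per_article_url("ru.wikipedia.org", "mobile-web", "user",
                      "Borderlands: The Pre-Sequel!", "20180101", "20181229")

# Made-up counts for a single peak day, for illustration only.
shares = access_shares({"desktop": 1_000, "mobile-app": 500, "mobile-web": 498_500})
```

A share close to 1.0 for a single access method on the peak day, as in the made-up numbers here, is the kind of concentration that points toward artificial traffic rather than organic interest.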

That said, the impact of such peaks on (say) the understanding of the popularity of particular article topics can be huge, and there is some value in tracking such occurrences and helping users to determine whether they are likely to be artificial. @MusikAnimal might be working to extend his existing notes into a FAQ, and we asked for the creation of a Pageviews-Anomaly tag on Phabricator to collect tickets on this matter, which should facilitate coordination.

kzimmerman subscribed.

It looks like this should be documented either under Research:Page View, along with other anomalies, or in another centralized spot. Moving to unprioritized/icebox for now because we won't be able to tackle it anytime soon.

mpopov subscribed.

We're not able to work on this request.