Provide a pageview API which pre-filters transient spikes from a few days or so
Closed, Duplicate · Public

Description

[Taken from https://en.wikipedia.org/wiki/Wikipedia talk:Web statistics tool; filed on behalf of EllenCT.]

Filtering transient spike anomalies

The WP:POPULARLOWQUALITY list suffers from the "known anomalies" problem of transient spikes.

There is a general solution to a closely related problem in the R code on pp. 191-2 of //Xu et al.// (2014), but I think it would be much easier to use some measure of whether a spike less than four days long rises more than, say, 6 standard deviations above the mean, or one of the algorithms in http://stats.stackexchange.com/a/56744.
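
To make the standard-deviation idea concrete, here is a rough Python sketch. It is not the Xu et al. method or any of the Stack Exchange algorithms. It assumes a plain list of daily view counts for one page, and it swaps in the median and the scaled median absolute deviation (MAD) as a robust stand-in for the mean and standard deviation, so that the spike itself does not inflate the estimate. The function name `filter_transient_spikes` and the parameters `threshold` and `max_run` are hypothetical, chosen only for illustration.

```python
from statistics import median


def filter_transient_spikes(views, threshold=6.0, max_run=3):
    """Return a copy of `views` with short high-outlier runs replaced.

    views     -- daily view counts for one page (assumed input format)
    threshold -- robust z-score above which a day counts as a spike
    max_run   -- longest run, in days, still treated as transient
                 (3 corresponds to "less than four days long")
    """
    center = median(views)
    # MAD scaled by 1.4826 approximates the standard deviation
    # for normally distributed data (assumption: robust estimate).
    mad = median(abs(v - center) for v in views)
    scale = 1.4826 * mad
    if scale == 0:
        # No spread to measure against; leave the series untouched.
        return list(views)

    is_high = [(v - center) / scale > threshold for v in views]
    filtered = list(views)
    i = 0
    while i < len(views):
        if not is_high[i]:
            i += 1
            continue
        # Find the end of this run of consecutive outlying days.
        j = i
        while j < len(views) and is_high[j]:
            j += 1
        # Only short runs are treated as transient spikes;
        # longer runs are kept as genuine sustained interest.
        if j - i <= max_run:
            for k in range(i, j):
                filtered[k] = center
        i = j
    return filtered
```

A filtered top-1000 could then be ranked on the sum of the filtered series instead of the raw one. Replacing spike days with the median is just one option; dropping those days or capping them at the threshold would work too.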

Can we get a separate API resource for a filtered top-1000, please?

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper.

@Jdforrester-WMF
Could you please provide more context? If you're talking about spikes that are due to bots, there is already a Phabricator task for that.
However, if you are referring to other causes, we will not be able to work on this for some time, because we're focusing on edit data right now.
Thanks!