Event Timeline
The offset spikes and the underestimate decreases by the same amount (so the total stays the same). Best seen in wikidata mobile: https://goo.gl/o8oAzj
Summing up the IRC conversation between @Nuria and @ema:
Starting on the 2nd of November we see a shift in the per-domain Unique Devices data. The Unique Devices totals are mostly unaffected, but the computation is made of two parts: 1) (underestimate) users for whom the cookie is set, plus 2) (offset) users that have no cookies.
From about November 2nd to February 6th (2017), the proportion of devices in the offset is much bigger than it was before, and the proportion of users in the underestimate is smaller. This suggests that cookies are expiring sooner than they should.
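To make the two-part computation concrete, here is a minimal sketch of how the total and the offset share relate. The class name, field names, and numbers are all made up for illustration; they are not taken from the actual Unique Devices jobs or data.

```python
from dataclasses import dataclass


@dataclass
class DailyUniques:
    """Hypothetical container for one domain's daily counts."""
    underestimate: int  # devices that arrived with the cookie set
    offset: int         # devices that arrived with no cookies

    @property
    def total(self) -> int:
        # Unique devices = underestimate + offset
        return self.underestimate + self.offset

    @property
    def offset_share(self) -> float:
        # Fraction of the total that comes from the offset part
        return self.offset / self.total if self.total else 0.0


# Illustrative numbers only: the total is unchanged, the split is not.
before = DailyUniques(underestimate=8000, offset=2000)
during = DailyUniques(underestimate=5000, offset=5000)

print(before.total, during.total)
print(before.offset_share, during.offset_share)
```

A rising offset share with a flat total is the pattern described above: devices move from one part of the sum to the other without changing the sum.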
This matches up with the progressive varnish4 rollout: https://gerrit.wikimedia.org/r/#/q/topic:varnish4-upgrade+(status:open+OR+status:merged)
This is the offset data for wikidata mobile, which represents devices coming in without a last access cookie.
@ema: has the way we compute the nocookies flag on X-Analytics changed? It should take into account "all" cookies, not just last access. From the code on GitHub [1] I think nothing has changed, but I'm asking just in case.
This is probably unrelated, but does this way of setting cookies (geoIP) [2] make them visible on the http.cookies object in varnish?
[2]
https://github.com/wikimedia/puppet/blob/production/modules/varnish/templates/geoip.inc.vcl.erb#L179
Is this something we still need answers for, or have we just moved past it into a new normal?
Three years have passed since this incident and, as a result, there's no data left from that time to examine. It's not happening anymore (https://stats.wikimedia.org/#/wikidata.org/reading/unique-devices/normal|line|3-month|~total|daily) and we have lots of new means to detect bot traffic these days. I'm closing this as declined. Feel free to reopen.