Wed, Jul 18
Of interest: https://caniuse.com/#feat=referrer-policy
Tue, Jul 17
Closing sounds good
That error rate is, ahem, nothing; I bet it is higher in any other schema. Let's close the ticket.
Mon, Jul 16
Per conversation with @JKatzWMF
Please let me know if you guys think this is something that can happen this quarter.
ping @JAllemandou this issue might be significant for "total article count"
Bug in turnilo, looks like the workaround is to change your tz ....
@Krinkle sqooping is different per wiki size, thus it requires a whitelist to manage it. See similar addition: https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/445445/
@phuedx developers with access to stats machines (I think most (all?) of your team) can also pull errors, you can see those in real time
Yes, please. I listed the issue on the dataset page: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Unique_Devices#Changes_and_Known_Problems_with_Dataset
We do not yet have annotations in wikistats (we will at the end of quarter) but when we do this is a good one to list. Moving ticket to bot work.
Sat, Jul 14
The bot did not accept cookies and its user agent was changing slightly: in 1000 records from when this event is happening, 995 are part of the event, and of those about 200 are unique user agents. Still, the IP is the same and the volume of requests is so high that I am wondering how these requests did not get throttled. Will look at throttling limits.
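For anyone wanting to repeat this kind of check, here is a rough sketch of the counting described above: group requests by IP and compare the number of requests with the number of distinct user agents. The record layout, the function name `summarize_by_ip` and the sample data are all made up for illustration (the IPs are documentation addresses); the real analysis was done on the actual request data, not with this code.

```python
from collections import defaultdict


def summarize_by_ip(records):
    """Return {ip: (request_count, distinct_user_agent_count)}.

    `records` is an iterable of (ip, user_agent) pairs. This layout is an
    assumption for the sketch, not the schema of the real request logs.
    """
    requests = defaultdict(int)
    agents = defaultdict(set)
    for ip, user_agent in records:
        requests[ip] += 1
        agents[ip].add(user_agent)
    return {ip: (requests[ip], len(agents[ip])) for ip in requests}


# Synthetic sample mirroring the proportions above: 995 requests from one IP
# cycling through 200 slightly different user agents, plus 5 unrelated requests.
sample = [("203.0.113.7", f"Mozilla/5.0 variant-{i % 200}") for i in range(995)]
sample += [(f"198.51.100.{i}", "Mozilla/5.0 other") for i in range(5)]

summary = summarize_by_ip(sample)

# The IP with the most requests is the candidate bot.
top_ip, (count, distinct_agents) = max(summary.items(), key=lambda kv: kv[1][0])
print(top_ip, count, distinct_agents)
```

A single IP with hundreds of requests and many near-identical user agents is the pattern that made this look like one bot rather than many users.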
Fri, Jul 13
It coincides with a spike of pageviews from Thailand that seems like a bot accessing the desktop site. Will investigate a bit as to whether this bot was accepting cookies.
Nuria to look into the differences between "unknown" and "none"
Please reopen if needed
Browser versions cannot be cleanly converted to numbers because the data has '-'. To be clear, it is possible to filter those in turnilo, but filtering out data with no entry for, say, minor version seems like a bad idea. So, I have added numerical dimensions to response_size and time firstbyte but left browser versions as they were. If you think that having "-" on versions that we were unable to parse is not a good idea, please let us know.
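A small sketch of why the '-' placeholder gets in the way, under the assumption that versions arrive as strings. The helper `parse_version` and the sample rows are hypothetical and are not what turnilo or the refinery code does.

```python
def parse_version(raw):
    """Return the version as an int, or None when the value is not numeric.

    A "-" placeholder (or a missing value) cannot be converted, so it maps
    to None instead of a made-up number.
    """
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None


# Hypothetical rows: (browser family, major version, minor version).
rows = [("Chrome", "67", "0"), ("Firefox", "61", "-"), ("-", "-", "-")]

parsed = [
    (family, parse_version(major), parse_version(minor))
    for family, major, minor in rows
]

# Keeping only rows with a numeric minor version silently drops the others.
with_minor = [row for row in parsed if row[2] is not None]
print(len(parsed), len(with_minor))
```

In this toy sample two of the three rows disappear once you require a numeric minor version, which is the reason for leaving browser versions as strings.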
The code that does this: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/UAParser.java#L34
indeed this works, see pretty slider in turnilo
Thu, Jul 12
I think I figured out how to do this via changing config in turnilo. Will report. See: https://github.com/allegro/turnilo/blob/master/docs/configuration.md#custom-transformations
Let's see, the graph you attached shows the "number of pageviews per referral class per day for 2 years for wikipedias". So on a day like, for example, June 25th 2017, you have 44 million pageviews tagged with the "external -search engine" referrer, 36 million tagged with "none", 28 million tagged with "internal" and 2.4 million tagged with "external".
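To put those numbers side by side, a quick back-of-the-envelope sketch of the share of each class, using only the four figures quoted above. Any class not mentioned in this comment is left out, so the percentages are relative to these four classes only.

```python
# Pageviews in millions for June 25th 2017, as quoted above.
pageviews_millions = {
    "external -search engine": 44,
    "none": 36,
    "internal": 28,
    "external": 2.4,
}

total = sum(pageviews_millions.values())

# Share of each class as a percentage of the four classes listed here.
shares = {
    label: round(100 * value / total, 1)
    for label, value in pageviews_millions.items()
}

for label, share in shares.items():
    print(f"{label}: {share}%")
```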
Should we remove https://edit-analysis.wmflabs.org/editor-engagement/?
ping @Seeris we hope to have stats for you by mid July
Very sorry we dropped the ball on this. I will add ajtwiki to our whitelist and you should have data (since the very beginning) by early July, when our next process to scoop data runs. We reconstruct history from the very beginning and thus processes take a few days to run; that is why we run them monthly.
According to sitematrix:
Declining as there is no possible fix we can do.