Mar 3 2017
Apr 28 2016
Apr 15 2016
Yeah, we actually have finalised it; I've just been overwhelmed with
the first week of $newjob and haven't written them down yet.
Mar 19 2016
So to summarise, it is now impossible to conduct this research in the way specified?
Resolving since I'm not seeing a "this hasn't been fixed as evidenced by..."
To be perfectly honest it's pretty pointless; the definition doesn't care if a HideBanners request has got a shortlink or not, it excludes it either way.
Mar 17 2016
Now unblocked, should wrap up tomorrow (Friday)
Mar 16 2016
Okay, I quit Friday. Could we please save discussion about long-term solutions for, you know, the long-term? Because not having things break in 2 days is more my priority right now.
Mar 15 2016
Sure, then we'll look at that when we're no longer on a timer.
And the files live where? (Otto is suggesting /a/)
Well, not just cron jobs but also have somewhere to store the scripts the jobs are triggering.
Okay, then this was a fundamental misunderstanding from the get-go.
But analytics-search apparently doesn't have a /home/ directory so is not actually what we were looking for.
Sooo how exactly does one *use* it? su analytics-search-user complains about passwords. Otto maintains there isn't one.
- We should note that you can install from gerrit with the GitHub mirrors
- I don't see much value in the expansive visualisation guidelines but that's probably just because I only use fivethirtynine!
And can the client-side JS collect a hashed IP too?
Mar 14 2016
I feel like we're talking at cross-purposes here.
Okay, then how does the geolocation work, and could whatever does the geolocation also pass through a hashed and salted IP?
Okay, so is everyone okay with this if we drop connection-type data and, by extension, drop the need for the IP address? And just return a hashed and salted IP and the country-code, instead of the unhashed IP.
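For anyone following along, here is a rough sketch of what "a hashed and salted IP and the country-code" could look like. It is only an illustration under assumptions: the function names (`hash_ip`, `sanitise_event`), the field names, the choice of HMAC-SHA-256 and the example address are all hypothetical and are not taken from the actual EventLogging or geolocation code.

```python
import hashlib
import hmac
import secrets


def hash_ip(ip: str, salt: bytes) -> str:
    """Return a salted (keyed) SHA-256 digest of the IP as a hex string.

    Hypothetical helper: HMAC is used so that the salt acts as a secret key.
    """
    return hmac.new(salt, ip.encode("utf-8"), hashlib.sha256).hexdigest()


def sanitise_event(ip: str, country_code: str, salt: bytes) -> dict:
    """Build the record that gets stored: no raw IP, only hash and country.

    The field names here are assumptions made for this sketch.
    """
    return {"ip_hash": hash_ip(ip, salt), "country": country_code}


# Assumption: the salt is generated server-side and never stored with the data.
salt = secrets.token_bytes(32)

# 192.0.2.17 is a documentation-only address, used purely as an example.
event = sanitise_event("192.0.2.17", "FR", salt)
```

With the same salt, the same IP always maps to the same hash, so requests can still be grouped; the unhashed IP never appears in the stored record.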
Mar 9 2016
Signed! Okay, that gives us a 4-day window to switch everything over.
Mar 8 2016
Done, waiting on response.
Now FINALLY up at https://meta.wikimedia.org/wiki/Discovery/Data_access_guidelines
As always, you save my bacon. Thanks ;).
Any chance the EL geolocation can get connection type as well, then? If so a hash would do fine, but I was operating under the understanding it couldn't (hence the request for a raw IP)
Now expanded! Please let me know if there are specific questions it does not answer you would like clarified.
Just for transparency I'm rewriting/expanding the documentation on meta as we speak to include the data sanitisation and privacy concerns, the sampling approach, and the checkins we've done with other teams on utility and appropriateness.
Indeed, I'll expand the documentation. Legal has given their clearance as part of our conversation with them but we should move the useful bits of that thread on to metawiki.
I wasn't aware anyone involved in the research had asked for any analytics engineers' support? Absent "does anyone know where I'd look for examples of how to implement X"
Waiting on https://phabricator.wikimedia.org/T129260
We have emailed Otto as an extra ping.
Essentially the same.
Mar 7 2016
Mar 4 2016
Mar 3 2016
Well I've forgotten how to log into labs so I'll be no help there. @EBernhardson knows how this works and can describe it better than I.
I dispute the idea that there are any circumstances where one-line conditionals are acceptable! *stamps foot*
That is, explain to you what to do when I leave? You know the security/access groups? Add the new staffer to the groups for the shiny machines ;p
Huh. The pageID approach won't work? :(
Mar 2 2016
Uh. We should refactor the code before we have the dependency refactored?
I have literally no idea, it was in 2014.
Mar 1 2016
Needs a password I don't have :(
@mpopov what was the result of the -analytics chat yesterday? (I don't recall)
That's fair. Julien, what's your idea?
Feb 29 2016
Draft created and dispatched to the lawyers.
Feb 26 2016
Well, if both will download both this won't actually help :/. What we'd do on literally any other part of Wikimedia's infrastructure is rely on the fact that the platform detection in our varnish caches redirects people to *.m.* as appropriate, which it doesn't do for the portals. I was seeking to distinguish the files so we could also look for file download, but if both files are going to be grabbed anyway that won't actually do much :(. Ah well.
I don't understand your example. If you'd rather read the French Wikipedia, then your preferred language is...French. For the purposes of browsing Wikipedia, at least.
Feb 25 2016
We're only collecting for Goal 1 at the moment since Goal 2 has privacy implications.
Note that Michelle and Dario have both now given permission in email for this to go ahead.
Feb 24 2016
Sounds good to me!