
Access to search logs for Jan Dittrich
Closed, Resolved · Public

Description

Story: For my work on advanced search (German Community Wishlist), I would like to run queries to find out which Elasticsearch parameters are used most frequently.

Problem: I have signed the NDA and have access to stat1003, but not to the search data.

Could I get access to the search data?

Event Timeline

Restricted Application added a project: Operations. May 21 2017, 8:31 AM
Restricted Application added a subscriber: Aklapper.

We need manager approval for this please.

I report to @Abraham – Abraham, for the technical wishlist, I need to analyze search queries, and for that the approval of the person I report to is needed (cc @Lea_WMDE, PM technical wishlist)

RobH added subscribers: RStallman-legalteam, RobH. Edited May 22 2017, 4:35 PM

It seems that Jan Dittrich is already a shell user, and all shell users have a confirmed NDA on file. (It is now a requirement for shell access.)

So this needs to detail exactly what groups to add the user to, as well as manager approval.

Edit change: I had listed off that NDA confirmation is required, but it was already confirmed for existing shell users, like this one.

Confirming that Jan Dittrich has an NDA on file for shell access.

@elukey @Ottomata, I am not sure what this entails, care to help? I am looking at https://wikitech.wikimedia.org/wiki/Analytics/Data_access and I am not sure which of the 6 groups would grant the above access. analytics-wmde sounds like it fits, but for the wrong reason (the requester being part of WMDE, not because of what data they request access to). analytics-privatedata-users sounds correct too, but from the looks of the request, it can probably be fulfilled without access to private data as well, so analytics-users sounds fine too. Any tips on how best to resolve this? Also, who should approve this from WMF?

I'm also not totally sure what data Jan is looking for, but if I had to guess, it would be webrequest logs, which would mean that analytics-privatedata-users group is the one we want.

@EBernhardson might be able to confirm. Can Jan find elasticsearch parameters in webrequest logs, perhaps in uri_path or uri_query?

I'm also not totally sure what data Jan is looking for

We would like to find out which parameters (like AND, OR, intitle: …) and namespaces (like help:…) users use when they use the on-wiki search (the one in the top-right corner).
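As a rough illustration of the kind of analysis described here, the following minimal Python sketch counts how many queries use each operator or prefix. It assumes the query strings are already available as a plain list; the sample queries, the regular expressions, and the function name `count_features` are hypothetical and not taken from any Wikimedia tooling. Note that this simple pattern cannot tell keyword prefixes such as `intitle:` apart from namespace prefixes such as `help:`.

```python
import re
from collections import Counter

# Hypothetical sample queries; real input would come from the search logs.
SAMPLE_QUERIES = [
    "cats AND dogs",
    "intitle:Berlin history",
    "help:editing",
    "apples OR oranges",
    "intitle:Paris AND museum",
    "plain search",
]

# Boolean operators are matched as whole upper-case words.
BOOLEAN_RE = re.compile(r"\b(AND|OR)\b")
# A prefix is a word at the start of a token that is directly followed
# by a colon, e.g. "intitle:" or "help:".
PREFIX_RE = re.compile(r"(?:^|\s)([A-Za-z_]+):")


def count_features(queries):
    """Count, per feature, the number of queries that use it at least once."""
    counts = Counter()
    for query in queries:
        # Use a set so each feature is counted once per query.
        features = set(BOOLEAN_RE.findall(query))
        features.update(p.lower() + ":" for p in PREFIX_RE.findall(query))
        counts.update(features)
    return counts


if __name__ == "__main__":
    for feature, n in count_features(SAMPLE_QUERIES).most_common():
        print(feature, n)
```

Counting each feature once per query gives "share of queries using X" rather than raw occurrence counts, which seems closer to the question of which parameters users rely on most.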

Ah, ok. I betcha this is in the cirrussearchrequestset Hive table maintained by the Discovery folks.

This data is currently accessible by the analytics-users group. analytics-privatedata-users would also provide access, but if you don't need access to webrequest or other private data, then let's go with analytics-users.

Change 355599 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Add jdittrich to analytics-users

https://gerrit.wikimedia.org/r/355599

RobH added a comment. May 30 2017, 4:15 PM

The correct group for this request has now been identified, and the 3-day waiting period has passed. As no objections were noted, I'm merging Alex's patch set live.

Change 355599 merged by RobH:
[operations/puppet@production] Add jdittrich to analytics-users

https://gerrit.wikimedia.org/r/355599

RobH closed this task as Resolved. May 30 2017, 4:18 PM
RobH claimed this task.