
Add aklapper to analytics-privatedata-users
Closed, Resolved · Public

Description

Requestor provided information and prerequisites

This section is to be completed by the individual requesting access.

  • Wikitech username: Aklapper
  • Preferred shell username: aklapper (existing account)
  • Email address: (existing account)
  • SSH public key (must be a dedicated key for WMF production): (existing account)
  • Requested group membership: analytics-privatedata
  • Reason for access:

The gadget that got deployed in T195119#5121390 which gathers feedback on technical docs on Wikimedia pages uses EventLogging to save data.
However, in the WMF Developer-Advocacy team, only @srishakatux can currently access this data since T213780.
Having one more person able to access data would be nice.

  • Name of approving party (hiring manager for WMF staff): @Bmueller
  • Requestor -- Please Acknowledge that you have read and signed the L3 Wikimedia Server Access Responsibilities document:
  • Requestor -- Please coordinate obtaining a comment of approval on this task from the approving party.

SRE Clinic Duty Confirmation Checklist for Access Requests

This checklist should be used on all access requests to ensure that all steps are covered, including expansion to existing access. Please double check the step has been completed before checking it off.

This section is to be confirmed and completed by a member of the SRE team.

  • - User has signed the L3 Acknowledgement of Wikimedia Server Access Responsibilities Document.
  • - User has a valid NDA on file with WMF legal. (This can be checked by Operations via the NDA tracking sheet & is included in all WMF Staff/Contractor hiring.)
  • - User has provided the following: wikitech username, preferred shell username, email address, and full reasoning for access (including what commands and/or tasks they expect to perform)
  • - User has provided a public SSH key. This SSH key pair should only be used for WMF cluster access, and not shared with any other service (this includes not sharing with WMCS access, no shared keys.)
  • - access request (or expansion) has sign off of WMF sponsor/manager (sponsor for volunteers, manager for WMF staff)
  • - non-sudo requests: 3 business day wait must pass with no objections being noted on the task
  • - Patchset for access request

For additional details regarding access request requirements, please see https://wikitech.wikimedia.org/wiki/Requesting_shell_access

Event Timeline

Aklapper created this task. Mar 30 2020, 6:50 PM
Restricted Application added a project: Operations. Mar 30 2020, 6:50 PM

Change 584676 had a related patch set uploaded (by Aklapper; owner: Aklapper):
[operations/puppet@production] aklapper: access to analytics-privatedata-users

https://gerrit.wikimedia.org/r/584676

OK from my side!

(Operations: On a related note, I now see that https://phabricator.wikimedia.org/maniphest/task/edit/form/8/ is often used but is not linked from https://phabricator.wikimedia.org/project/profile/956/ so I did not find it. Could you maybe edit the tag description and say "You must use this form to create a request: ..."? Plus the first URL is a redirect.)

jcrespo updated the task description. Mar 31 2020, 8:17 AM

I am guessing it is not linked there so that people are forced to read the documentation, which is linked and contains the request form link. I can add it, but may revert if people start sending requests who clearly haven't read the docs :-D.

jcrespo added a subscriber: Nuria. Mar 31 2020, 8:21 AM

@Nuria as project lead for analytics, I am requesting your ok for the above access. Thank you!

jcrespo assigned this task to Nuria. Mar 31 2020, 8:22 AM
jcrespo triaged this task as Medium priority. Mar 31 2020, 8:30 AM
Nuria added a comment. Apr 2 2020, 7:16 PM

Approved on my end.

Change 584676 merged by Jcrespo:
[operations/puppet@production] aklapper: access to analytics-privatedata-users

https://gerrit.wikimedia.org/r/584676

jcrespo added a subscriber: elukey. Apr 3 2020, 7:34 AM

@Aklapper server access has been deployed; in about 30 minutes you should have access to the stats machines.

It is unclear whether Kerberos access is additionally needed for EventLogging, as per the request. Could @elukey clarify this (I know it is needed for Hive)?

elukey added a comment. Apr 3 2020, 7:41 AM

@Aklapper I checked T213780 and I see that the user in question doesn't have a Kerberos account, how are you guys accessing Eventlogging data?

@elukey: To help SREs on Clinic Duty figure out whether adding someone to a group also needs a Kerberos account, let's annotate the headers in data.yaml?

Change 585692 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] admin: clarify kerberos account creation for analytics-privatedata

https://gerrit.wikimedia.org/r/585692

Change 585692 merged by Elukey:
[operations/puppet@production] admin: clarify kerberos account creation for analytics-privatedata

https://gerrit.wikimedia.org/r/585692

I checked T213780 and I see that the user in question doesn't have a Kerberos account, how are you guys accessing Eventlogging data?

I don't think that we have looked into data recently, and I myself have never tried before - that's why I'm here. :P

@elukey: You seem to be right. I tried a few minutes ago: I successfully ran ssh stat1007, then I entered hive. I got:

Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "stat1007/10.64.21.118"; destination host is: "an-master1002.eqiad.wmnet":8020;
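
For anyone hitting the same exception: the message says that no Kerberos ticket-granting ticket was found for the session. Below is a minimal Python sketch, not something used in this task, that spots this failure in captured output and checks for a ticket before starting hive. The function names are hypothetical, and it assumes a `klist` from MIT or Heimdal Kerberos is installed, whose `-s` flag reports ticket validity through the exit status.

```python
import shutil
import subprocess

# Substring taken from the exception quoted above.
MISSING_TGT_MARKER = "Failed to find any Kerberos tgt"


def is_missing_kerberos_ticket(error_text: str) -> bool:
    """Return True if the text looks like the 'no Kerberos ticket' failure."""
    return MISSING_TGT_MARKER in error_text


def has_valid_ticket() -> bool:
    """Best-effort check: `klist -s` exits 0 when a valid ticket is cached.

    Returns False when no klist binary is found on PATH (assumption: that
    means Kerberos client tools are not installed on this machine).
    """
    klist = shutil.which("klist")
    if klist is None:
        return False
    result = subprocess.run(
        [klist, "-s"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

If `has_valid_ticket()` returns False, running `kinit` first is the usual way to obtain a ticket, provided a Kerberos account exists for the user.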
jcrespo reassigned this task from Nuria to Aklapper. Apr 6 2020, 6:14 AM

So what is the status of this? Which rights are needed and which are required?

Change 586208 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] admin: add kerberos flag for aklapper

https://gerrit.wikimedia.org/r/586208

elukey added a comment. Apr 6 2020, 6:57 AM

@Aklapper have you tried https://turnilo.wikimedia.org/ or http://superset.wikimedia.org/ ? The latter now allows you to explore Hive data via Presto (see SQLLab), and I am wondering if it is sufficient for your use case. If so, it wouldn't require an explicit Kerberos account (since you wouldn't access Hadoop directly). Let me know :)

Nuria added a comment. Apr 6 2020, 6:00 PM

I think @Aklapper is going to need Kerberos because the work @srishakatux was doing requires access to Hadoop.

^See last comment.

@Aklapper I have just created your Kerberos account. You will have received a mail to your wikimedia.org address with instructions on how to log in and change the initial/temporary password.

Change 586208 merged by Muehlenhoff:
[operations/puppet@production] admin: add kerberos flag for aklapper

https://gerrit.wikimedia.org/r/586208

Aklapper added a comment. Edited Apr 7 2020, 10:50 AM

Thanks everyone!

Superset needs an internal user to be created. This is handled by the Analytics team, so I am reassigning to Luca for further setup.

elukey added a comment. Apr 7 2020, 2:00 PM

Thanks everyone!

You are a user of an experimental service, welcome! :) Jokes aside, I'd be happy if you could help test the new service; it should be quicker for your use case to query data from Superset rather than Hive. If you don't have time, feel free to skip!

Completely right, we still need to create some docs about Presto + Superset.

I cannot really repro, this is weird, and I didn't find traces of your error in the logs either. Maybe we could debug this together on IRC when you have a moment? Might be quicker, sorry :(

  • The site constantly asks me to re-login after every action (using Chromium 80 in a private window).

This could be an occurrence of https://phabricator.wikimedia.org/T224159, maybe due to Chromium. Can you try with another browser, if possible, and report back?

Nuria added a comment. Apr 7 2020, 3:27 PM

@Aklapper let's see:

Sorry if this was confusing. While you can create dashboards with Presto and Superset, *I think* your goal is to make those available externally, and Superset is not the right tool for that.

I propose to resolve this task as I can access hive via SSH (and that was my goal). :)

@elukey: Regarding Superset only:

  • The site constantly asks me to re-login after every action (using Chromium 80 in a private window).

This could be an occurrence of https://phabricator.wikimedia.org/T224159, maybe due to Chromium. Can you try with another browser, if possible, and report back?

Same on another machine with Chromium 80: I click "Run Query" and it asks me to sign in, plus the widget below the query text field only shows "Offline".
In Firefox 75, the first thing I get after login is "Error while fetching database list" at the bottom. Hence I cannot try anything else. Separate ticket? :)

elukey closed this task as Resolved. Apr 9 2020, 12:26 PM

+1 for the new task!