
Requesting access to stats machines for Lucas Werkmeister
Closed, Declined · Public · Request

Description

Username: lucaswerkmeister-wmde
Full name: Lucas Werkmeister
Public key: P6884 (freshly generated, private key file protected by a strong password using 100 KDF rounds)
Reason: I want to run long-running queries, e. g. to analyze usage of the WikibaseQualityConstraints extension (how many API requests are from our own gadget and how many from other users – cf. T180780) and to determine which logging table patrolling entries refer to autopatrols and can be deleted (see T189594#4071518).

Ops Clinic Duty Checklist for Access Requests

Most requirements are outlined on https://wikitech.wikimedia.org/wiki/Requesting_shell_access

This checklist should be used on all access requests to ensure that all steps are covered. This includes expansion to access. Please do not check off items on the list below unless you are in Ops and have confirmed the step.

  • - User has signed the L3 Acknowledgement of Wikimedia Server Access Responsibilities Document.
  • - User has a valid NDA on file with WMF legal. (This can be checked by Operations via the NDA tracking sheet & is included in all WMF Staff/Contractor hiring.)
  • - User has provided the following: wikitech username, preferred shell username, email address, and full reasoning for access (including what commands and/or tasks they expect to perform).
  • - User has provided a public SSH key. This SSH key pair should only be used for WMF cluster access, and not shared with any other service (this includes not sharing with WMCS access; no shared keys).
  • - access request (or expansion) has sign-off of WMF sponsor/manager (sponsor for volunteers, manager for WMF staff)
  • - non-sudo requests: 3 business day wait must pass with no objections being noted on the task - analytics-privatedata-users group
  • - sudo requests: all sudo requests require explicit approval during the weekly operations team meeting. No sudo requests will be approved outside of those meetings without the direct override of the Director of Operations.
  • - Patchset for access request - user is in ldap section, will need to be moved up to shell users.

Event Timeline

Restricted Application added a subscriber: Aklapper.
RobH triaged this task as Medium priority. · Edited · Mar 22 2018, 8:36 PM
RobH updated the task description.
RobH subscribed.

@Lucas_Werkmeister_WMDE:

Can you review https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_Groups as it outlines what the access groups are for the stat machines, and let us know which of these groups would let you accomplish what you need done.

There is also an additional group not on that page:

analytics-wmde-users:
  description: Group of WMDE analytics users
  gid: 784
  members: [addshore, goransm]
  privileges: ['ALL = (analytics-wmde) NOPASSWD: ALL']

If that is the group you need, it will also require an SRE meeting review (every Monday) for approval, as it includes sudo rights. However, no actions can be taken until we clarify this with you, so please advise.

Also please confirm that your wikitech account is 'lucaswerkmeister' and that you'd like us to use the email address tied to that account. Please note this address will become public, since it will be in our git repo.

Most shell requests also have the sponsorship of someone on the WMF staff. Do you have a contact person you work with @WMF who would be ideal for this sponsorship? If so, please have them comment on this task.

I think analytics-privatedata-users is the group I need. (As far as I can tell from the config file, that doesn’t have any sudo rights, correct?)

Also please confirm that your wikitech account is 'lucaswerkmeister' and that you'd like us to use the email address tied to that account. Please note this address will become public, since it will be in our git repo.

Sorry, I’m not quite sure what you mean here – so far I’ve been trying to keep some 'WMDE' in my work account names, so if a 'lucaswerkmeister' account exists, that would probably be my private account (which shouldn’t have access). But I can confirm that I would like to use the email address tied to wikitech:Lucas Werkmeister (WMDE). (That is, lucas.werkmeister@wikimedia.de.)

Also, I’d misspelled my shell username in the task description earlier, sorry – it’s lucaswerkmeister-wmde with a dash, not an underscore.

@Lucas_Werkmeister_WMDE: I'll go ahead and prepare the patchsets, however we're still lacking a WMF staff sponsorship on this request. Is there a particular staff person you work with regularly we can ping to sign off on this request?

Can you be a bit more explicit on your request?

I want to run long-running queries, e. g. to analyze usage of the WikibaseQualityConstraints extension

You should not need access to the cluster to analyze usage of an extension, correct? Can this not be inferred from wiki databases data?

which logging table patrolling entries refer to autopatrols and can be deleted

Again, is this data on wiki databases?

Can this not be inferred from wiki databases data?

I don’t quite understand what you mean, sorry… in this case, I would distinguish between API requests from our own gadget and truly external API requests via an extra HTTP header. I thought that information on individual requests, like the HTTP headers, was best processed on the stats machines? (At least, I assume it shouldn’t be available on the Labs mirrors, since they contain sensitive data and all that.)
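To illustrate the kind of split described above, here is a minimal sketch that counts requests by whether a marker header is present. It is an assumption-laden illustration, not the actual analysis: the header name `x-example-gadget` and the sample records are hypothetical, and the thread does not say which header the gadget sends or how request data is stored on the stats machines.

```python
from collections import Counter

# Hypothetical header name: the thread does not name the header the gadget sends.
GADGET_HEADER = "x-example-gadget"


def classify_request(headers):
    """Return 'gadget' if the marker header is present, otherwise 'external'."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    names = {name.lower() for name in headers}
    return "gadget" if GADGET_HEADER in names else "external"


def count_by_origin(requests):
    """Count requests per origin; each request is a dict of header name -> value."""
    return Counter(classify_request(headers) for headers in requests)


# Made-up sample records, for illustration only.
sample = [
    {"User-Agent": "ExampleBrowser/1.0", "X-Example-Gadget": "1"},
    {"User-Agent": "ExampleBot/0.1"},
    {"User-Agent": "ExampleBrowser/1.0", "x-example-gadget": "1"},
]

counts = count_by_origin(sample)
print(counts["gadget"], counts["external"])
```

The point of the sketch is only that the split depends on per-request header data, which is why the question of where that data lives matters for this request.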

Again, is this data on wiki databases?

In this case, it’s less about the data itself and more about doing the heavy processing outside of the production servers. The intended final outcome of this project is to drop a large number of rows from the logging table in production, but since figuring out which rows to drop is fairly expensive, we thought it would be good to perform that part somewhere else, and generate the list of row IDs to drop on the stats machines instead.

@RobH – I’m afraid I don’t know a lot of foundation folks… even the non-German people I work with turn out to be other WMDE people :/

@Lucas_Werkmeister_WMDE I think we need to understand a bit more about what you are doing before we can recommend anything. Can you set up a meeting with someone from the analytics team? @JAllemandou and @mforns are in your timezone.

Dzahn changed the task status from Open to Stalled. · Apr 17 2018, 5:28 PM
Dzahn subscribed.

Setting to Stalled, unless that meeting has already happened. In that case, please let us know the status here on the ticket.

Sorry, forgot to update – the meeting happened, I’ve created T192452: Find out how many external uses the wbcheckconstraints API action has for the outcome. Depending on how we proceed with that task, stats machines access may or may not be necessary. Comments over there are welcome, I don’t understand all the parts involved :)

@Lucas_Werkmeister_WMDE Thank you for the update! I will say the status "stalled" is still correct in that case until we find out whether it's needed.

But that was very helpful to know the background. I'll just keep this open and make it a subtask.

RobH removed Lucas_Werkmeister_WMDE as the assignee of this task.

Since this has been stalled for nearly two weeks, I'm going to go ahead and close it as declined. If the conversation and outcome on T192452 lead back to this needing to be re-opened, please feel free to do so.

I'd just rather it not sit in ops-access-requests so long that it becomes something we simply pass over during dashboard review. It seems better to decline it and clear it off the board, then re-open it later if needed, so that it isn't passed over accidentally.