
Access to restbase servers (including sudo) for Imarlier
Closed, Resolved (Public)

Description

Hi there --

I've been asked to help Core Platform look into an issue with restbase that appears to be related to hardware/IO. In order to do this I need access to these hosts, including sudo. Can someone please hook me up?

@Fjalapeno is the Director in charge of that team and can approve.

Thanks!

-- Ian

SRE Clinic Duty Checklist for Access Requests

Most requirements are outlined on https://wikitech.wikimedia.org/wiki/Requesting_shell_access

This checklist should be used on all access requests to ensure that all steps are covered. This includes expansion to access. Please do not check off items on the list below unless you are in Ops and have confirmed the step.

  • - User has signed the L3 Acknowledgement of Wikimedia Server Access Responsibilities Document.
  • - User has a valid NDA on file with WMF legal. (This can be checked by Operations via the NDA tracking sheet & is included in all WMF Staff/Contractor hiring.)
  • - User has provided the following: wikitech username, preferred shell username, email address, and full reasoning for access (including what commands and/or tasks they expect to perform).
  • - groups to include user into have been determined
  • - User has provided a public SSH key. This SSH key pair should be used only for WMF cluster access and must not be shared with any other service (this includes WMCS access; no shared keys).
  • - access request (or expansion) has sign-off from the WMF sponsor/manager (sponsor for volunteers, manager for WMF staff)
  • - non-sudo requests: 3 business day wait must pass with no objections being noted on the task
  • - sudo requests: all sudo requests require explicit approval during the weekly operations team meeting. No sudo requests will be approved outside of those meetings without the direct override of the Director of Operations.
  • - Patchset for access request

Event Timeline

Aklapper renamed this task from Access to restbase servers (including sudo) to Access to restbase servers (including sudo) for Imarlier. Aug 22 2018, 7:10 PM

@Imarlier There is restbase-roots and restbase-admins. Both have sudo privileges but different levels of it. A restbase-admin can control the restbase and cassandra services and run any command as the cassandra and restbase user. A restbase-root can run any command as any user.

RobH triaged this task as Medium priority.
RobH updated the task description. (Show Details)
RobH subscribed.

@Imarlier,

We'll need the following from you to make this happen:

  • - determination on which group you need. Please select the most restrictive group you can use to get the job done.

@Imarlier There is restbase-roots and restbase-admins. Both have sudo privileges but different levels of it. A restbase-admin can control the restbase and cassandra services and run any command as the cassandra and restbase user. A restbase-root can run any command as any user.

  • - signoff of your manager for the increase in rights

After that, as this is sudo, it will require approval in the weekly SRE team meeting. The meeting will review the request and grant it if they deem it in scope and acceptable. Please keep in mind that the more support you can provide for the access, the more likely it is to be granted.

@Imarlier: We've not gotten any feedback on which of the two groups you need, or if it is both? As such, I'm not sure what to present in our SRE meeting for approval.

As this meeting is taking place shortly, this will not be reviewed this week. Please provide feedback (requested on 2018-08-22) on which group you need.

This comment was removed by Dzahn.

@RobH Sorry about that, I missed your followup.

I _think_ that restbase-roots is most appropriate, but I could be misevaluating.

The questions that have been raised are about the actual IO capacity of these systems, and whether there are underlying hardware or tuning issues that cause performance issues in some instances -- for example, restbase1014 consistently shows iowait values in the 5-15% range, restbase1015 shows user CPU of 10-15%, and restbase1016 shows lower values for both -- despite the fact that all three machines should be serving approximately the same workload.
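For reference, the iowait and user CPU percentages quoted above are the kind of figures that can be derived from two samples of the aggregate `cpu` line in `/proc/stat` on Linux. The following is a minimal sketch of that calculation, not the tooling actually used on these hosts; the function names `parse_cpu_line`, `cpu_percentages`, and `sample_local` are hypothetical, and it assumes the standard field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat (guest time is
# already counted inside "user", so the trailing guest fields are skipped).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Convert two counter samples into percentages of elapsed CPU time."""
    deltas = {key: after[key] - before[key] for key in before}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {key: 100.0 * delta / total for key, delta in deltas.items()}


def sample_local(interval=5.0):
    """Sample this machine's /proc/stat twice, `interval` seconds apart."""
    with open("/proc/stat") as handle:
        before = parse_cpu_line(handle.read())
    time.sleep(interval)
    with open("/proc/stat") as handle:
        after = parse_cpu_line(handle.read())
    return cpu_percentages(before, after)
```

Calling `sample_local()["iowait"]` on each host over the same window would give directly comparable numbers. Tools such as iostat report the same quantity with more detail per device.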

I'm not 100% sure where the investigation will go (having not investigated :-), but I can imagine needing iostat, sysstat, atop/htop, smartmon, and potentially strace.

Hopefully that helps to clarify?

Agreed, restbase-roots looks like the right group for you. We now need manager approval, and then this can go to the SRE meeting next week for review.

This is waiting for the next SRE meeting for review.

This was approved in the SRE meeting on Monday.

Change 460394 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Add Ian to restbase-roots

https://gerrit.wikimedia.org/r/460394

Change 460394 merged by Muehlenhoff:
[operations/puppet@production] Add Ian to restbase-roots

https://gerrit.wikimedia.org/r/460394

I've added Ian to restbase-roots.