Grant root access for Bryan Davis on labstore* and admin for maintain scripts for labsdb*
Closed, Resolved · Public

Description

@bd808 should be able to do admin things on labstore* and labsdb* - looking at failed service logs, applying puppet patches, debugging labsdb user credential issues, to list a few things. We can grant him access to these servers through the wmcs-roots group that already exists.

Event Timeline

Restricted Application added a subscriber: Aklapper.

Giving root to labsdbs would be equivalent to giving root to all mysql servers for many reasons. No problem with that, but he should be added to the paging system (if he is not already there) and respond to the database alerts.

bd808 renamed this task from Grant sudo access for bdavis for labstore* and labsdb* to Grant sudo access for Bryan Davis for labstore* and labsdb*. May 25 2017, 3:08 PM

No problem with that, but he should be added to the paging system (if he is not already there) and respond to the database alerts.

I do not have a Foundation provided phone or phone contract, so the idea of getting all production pages as SMS messages is not completely appealing. I am willing to receive pages for servers that the cloud-services-team is primarily responsible for maintaining. If that costs me money it will probably motivate me to work with the team to decrease false positive alerts and increase general stability as part of our planned programmatic work.

Change 355463 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] Labs: Add wmcs-roots admin group to NFS servers

https://gerrit.wikimedia.org/r/355463

I do not have a Foundation provided phone or phone contract

I do not have either.

Maybe the cloud team could create a dedicated contact group and take over all labsdb maintenance? I would be quite cool with that.

In fact, I seem to recall that the dba group used to receive db-only pages, but I am not sure whether it was soft-disabled because it was incompatible with IRC logging.

I'll be on clinic duty next week, so I am just jumping in here to clarify what is pending on this:

  • This request is sudo related, so it will require review in the weekly operations meeting. Next week's meeting has been rescheduled from Monday (due to US holiday) to Wednesday. This request will be listed for review there.
  • There seems to be an ongoing discussion about whether having root on something means you have to be paged. Typically, we page roots for outages and service incidents on servers where they have root. I'm uncertain whether this has always held true, since we have had developers with root (Brion, Tim, Roan, Ori). The point that applying puppet patches (included in the reasoning for this request) should also entail paging seems valid, but I'm not sure we've ever had it as a requirement. (It seems like this will be a topic of discussion in the ops meeting as well.)

This won't give me +2 in ops/puppet.git so "applying puppet patches" probably isn't actually possible. I would be able to force puppet runs and/or disable puppet when needed.

So, I was chatting with Roan about the access he had, when he had root. His root access back then did not immediately require that he be paged, and he had it to fix permissions issues, and later to gather information on various systems/services he was working with. Since he wasn't also +2 in operations/puppet at that time, he didn't really have the full rights to do cluster wide automated changes. He was later added to paging (for all outages) when he wanted to be paged for parsoid, but that wasn't directly tied to his granting of root.

We don't really have a set policy on this, as our use of paging for root hasn't always been applied as a hard requirement.

The next ops meeting is scheduled for this Wednesday, June 7th. This will be listed for review.

If we want to hold off on the labsdb root inclusion, I am going to propose in the opsen meeting that this task become:

  • root on labstore* things
  • wmcs-admin group creation with ability to run only maintain-views and maintain-meta_p on labsdb* hosts

This revised proposal was approved in the opsen meeting on 6/7/17 by general consensus.

chasemp renamed this task from Grant sudo access for Bryan Davis for labstore* and labsdb* to Grant root access for Bryan Davis on labstore* and admin for maintain scripts for labsdb*. Jun 7 2017, 4:58 PM

The proposal by chasemp right above was approved in the ops meeting today.

Change 355463 merged by Dzahn:
[operations/puppet@production] WMCS: add access for bryan davis

https://gerrit.wikimedia.org/r/355463

I confirmed that the user and the wmcs-roots group have been created on labstore1003 (for the root on labstore* part), and that the user and the wmcs-admin group have been created on labsdb1009 (for the second part).

Dzahn closed this task as Resolved. Jun 7 2017, 6:17 PM
Dzahn claimed this task.

@bd808 ^ this should be resolved now. All other hosts should pick up the change within 30 minutes at most, once puppet has run on them. Let us know if you see any issues.

[labsdb1009:~] $ id bd808
uid=3518(bd808) gid=500(wikidev) groups=500(wikidev),793(wmcs-admin)

[labsdb1009:~] $ sudo cat /etc/sudoers.d/wmcs-admin 
# This file is managed by Puppet!

%wmcs-admin ALL = (ALL) NOPASSWD: /usr/local/sbin/maintain-views
%wmcs-admin ALL = (ALL) NOPASSWD: /usr/local/sbin/maintain-meta_p
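
For anyone double-checking output like the above on other hosts, here is a minimal Python sketch that reads the two pieces of text shown (the `id` line and the sudoers file) and reports which groups the user is in and which commands a group may run without a password. The function names `allowed_commands` and `groups_from_id_output` are made up for illustration, and the regular expression only handles the simple `%group HOSTS = (RUNAS) NOPASSWD: /path` shape seen here; it is not a full sudoers parser and is not part of any tooling mentioned in this task.

```python
import re

# Sudoers rules copied from the listing above.
SUDOERS_TEXT = """\
# This file is managed by Puppet!

%wmcs-admin ALL = (ALL) NOPASSWD: /usr/local/sbin/maintain-views
%wmcs-admin ALL = (ALL) NOPASSWD: /usr/local/sbin/maintain-meta_p
"""

# `id` output copied from the listing above.
ID_OUTPUT = "uid=3518(bd808) gid=500(wikidev) groups=500(wikidev),793(wmcs-admin)"

# Assumption: one rule per line in the exact shape shown above.
# Aliases, multiple commands per line, and other tags are not handled.
RULE_RE = re.compile(
    r"^%(?P<group>\S+)\s+(?P<hosts>\S+)\s*=\s*\((?P<runas>[^)]*)\)\s*"
    r"NOPASSWD:\s*(?P<command>\S+)\s*$"
)


def allowed_commands(sudoers_text, group):
    """Return the commands the given group may run without a password."""
    commands = []
    for line in sudoers_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = RULE_RE.match(line)
        if match and match.group("group") == group:
            commands.append(match.group("command"))
    return commands


def groups_from_id_output(id_output):
    """Extract the group names from one line of `id` output."""
    groups_field = id_output.split("groups=", 1)[1]
    return re.findall(r"\(([^)]+)\)", groups_field)


if __name__ == "__main__":
    for group in groups_from_id_output(ID_OUTPUT):
        print(group, allowed_commands(SUDOERS_TEXT, group))
```

With the text above, the user's groups come out as wikidev and wmcs-admin, and only wmcs-admin maps to any commands (the two maintain-* scripts), which matches the restricted proposal approved in the ops meeting.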

Change 359323 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] WMCS: add wmcs-admin to labsdb100[13]

https://gerrit.wikimedia.org/r/359323

Change 359323 merged by Dzahn:
[operations/puppet@production] WMCS: add wmcs-admin to labsdb100[13]

https://gerrit.wikimedia.org/r/359323

The user has also been created on labsdb1001 and labsdb1003.