Request additional access for Dcops group
Open, Medium, Public

Description

Opening this task to request access to additional commands so the Dcops group can perform its day-to-day duties:

perccli64

Context:
IF has historically been working on reducing the number of people with global root-level access; see [T244840] and [T289779]. Additional considerations for added security controls for SRE edge cases are covered in [T299989].

Event Timeline

Jclark-ctr triaged this task as Medium priority. Jun 3 2025, 5:23 PM

Looks good, we already have megacli and hpssacli in the existing rules. While we're at it, shall we also add storcli?

Change #1161382 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] admin: allow dcops to use perccli and storcli via sudo

https://gerrit.wikimedia.org/r/1161382

Change #1161382 merged by Elukey:

[operations/puppet@production] admin: allow dcops to use perccli and storcli via sudo

https://gerrit.wikimedia.org/r/1161382

Deployed! @Jclark-ctr please test and report back if anything is missing :) Puppet is currently rolling out the change, so give it one hour to propagate properly!

@elukey I finally have servers to test this on. Do we not have storcli as part of the install? I was hoping to use it rather than create 288 VDs manually for my install ticket.

an-worker1230:~$ storcli /c0 add vd type=raid0 drives=all wb ra
-bash: storcli: command not found

@Jclark-ctr o/ that server is a Dell, so you'll have to use /usr/bin/perccli64 (it may differ a bit from storcli's syntax but it should do what you need). Lemme know!

What about megacli as well? There are still quite a few older servers that use this.

@elukey Thanks for confirming. I did try that command first, but I was getting failures since it won’t show any controllers without sudo. Here’s my output compared to @BTullis’ output.

jclark@an-worker1230:~$ perccli64 /c0 show
CLI Version = 007.1910.0000.0000 Oct 08, 2021
Operating system = Linux 5.10.0-35-amd64
Controller = 0
Status = Failure
Description = Controller 0 not found
btullis@an-worker1230:~$ sudo perccli64 /c0 show
Generating detailed summary of the adapter, it may take a while to complete.

CLI Version = 007.1910.0000.0000 Oct 08, 2021
Operating system = Linux 5.10.0-35-amd64
Controller = 0
Status = Success
Description = None

Product Name = PERC H755 Adapter
Serial Number = 56N00JN
SAS Address =  5f4ee080808df800
PCI Address = 00:17:00:00
System Time = 09/26/2025 10:41:59
Mfg. Date = 06/25/25
Controller Time = 09/26/2025 10:41:58
FW Package Build = 52.30.0-6115
BIOS Version = 7.30.00.0_0x071E0001
FW Version = 5.300.02-4155
Driver Name = megaraid_sas
Driver Version = 07.714.04.00-rc1
Current Personality = RAID-Mode 
Vendor Id = 0x1000
Device Id = 0x10E2
SubVendor Id = 0x1028
SubDevice Id = 0x1AE0
Host Interface = PCI-E
Device Interface = SAS-12G
Bus Number = 23
Device Number = 0
Function Number = 0
Domain ID = 0
Security Protocol = None
Drive Groups = 1

TOPOLOGY :
========

-----------------------------------------------------------------------------
DG Arr Row EID:Slot DID Type  State BT       Size PDC  PI SED DS3  FSpace TR 
-----------------------------------------------------------------------------
 0 -   -   -        -   RAID1 Optl  N  446.625 GB enbl N  N   dflt N      N  
 0 0   -   -        -   RAID1 Optl  N  446.625 GB enbl N  N   dflt N      N  
 0 0   0   251:0    4   DRIVE Onln  N  446.625 GB enbl N  N   dflt -      N  
 0 0   1   251:1    6   DRIVE Onln  N  446.625 GB enbl N  N   dflt -      N  
-----------------------------------------------------------------------------

DG=Disk Group Index|Arr=Array Index|Row=Row Index|EID=Enclosure Device ID
DID=Device ID|Type=Drive Type|Onln=Online|Rbld=Rebuild|Optl=Optimal|Dgrd=Degraded
Pdgd=Partially degraded|Offln=Offline|BT=Background Task Active
PDC=PD Cache|PI=Protection Info|SED=Self Encrypting Drive|Frgn=Foreign
DS3=Dimmer Switch 3|dflt=Default|Msng=Missing|FSpace=Free Space Present
TR=Transport Ready

Virtual Drives = 1

VD LIST :
=======

----------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC       Size Name  
----------------------------------------------------------------
0/239 RAID1 Optl  RW     Yes     RWBD  -   OFF 446.625 GB VD_R1 
----------------------------------------------------------------

VD=Virtual Drive| DG=Drive Group|Rec=Recovery
Cac=CacheCade|OfLn=OffLine|Pdgd=Partially Degraded|Dgrd=Degraded
Optl=Optimal|dflt=Default|RO=Read Only|RW=Read Write|HD=Hidden|TRANS=TransportReady
B=Blocked|Consist=Consistent|R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack
AWB=Always WriteBack|WT=WriteThrough|C=Cached IO|D=Direct IO|sCC=Scheduled
Check Consistency

Physical Drives = 14

PD LIST :
=======

-------------------------------------------------------------------------------------
EID:Slt DID State DG       Size Intf Med SED PI SeSz Model                   Sp Type 
-------------------------------------------------------------------------------------
251:0     4 Onln   0 446.625 GB SATA SSD N   N  512B MTFDDAK480TGA-1BC1ZABDA U  -    
251:1     6 Onln   0 446.625 GB SATA SSD N   N  512B MTFDDAK480TGA-1BC1ZABDA U  -    
252:0     7 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:1     9 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:2     8 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:3    11 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:4    10 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:5    12 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:6    13 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:7     1 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:8     0 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:9     3 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:10    2 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
252:11    5 UGood  -   7.276 TB SATA HDD N   N  512B ST8000NM023B-2TJ133     U  -    
-------------------------------------------------------------------------------------

EID=Enclosure Device ID|Slt=Slot No|DID=Device ID|DG=DriveGroup
DHS=Dedicated Hot Spare|UGood=Unconfigured Good|GHS=Global Hotspare
UBad=Unconfigured Bad|Sntze=Sanitize|Onln=Online|Offln=Offline|Intf=Interface
Med=Media Type|SED=Self Encryptive Drive|PI=Protection Info
SeSz=Sector Size|Sp=Spun|U=Up|D=Down|T=Transition|F=Foreign
UGUnsp=UGood Unsupported|UGShld=UGood shielded|HSPShld=Hotspare shielded
CFShld=Configured shielded|Cpybck=CopyBack|CBShld=Copyback Shielded
UBUnsp=UBad Unsupported|Rbld=Rebuild

Enclosures = 2

Enclosure LIST :
==============

--------------------------------------------------------------------
EID State Slots PD PS Fans TSs Alms SIM Port# ProdID VendorSpecific 
--------------------------------------------------------------------
251 OK        2  2  0    0   0    0   0 -     BP_PSV                
252 OK       12 12  0    0   0    0   0 -     BP_PSV                
--------------------------------------------------------------------

EID=Enclosure Device ID | PD=Physical drive count | PS=Power Supply count
TSs=Temperature sensor count | Alms=Alarm count | SIM=SIM Count | ProdID=Product ID


BBU_Info :
========

----------------------------------------------
Model State   RetentionTime Temp Mode MfgDate 
----------------------------------------------
BBU   Optimal 0 hour(s)     32C  -    0/00/00 
----------------------------------------------
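For anyone scripting this check, here is a minimal Python sketch (not part of the puppet change) that reads the `Key = Value` header lines of the output above and reports whether the controller was visible. The function names and the two sample strings are illustrative assumptions; the samples are trimmed copies of the output pasted in this task.

```python
# Hypothetical helper: parse the "Key = Value" lines printed by perccli64
# and decide whether the controller was visible to the calling user.

FAILED_SAMPLE = """\
CLI Version = 007.1910.0000.0000 Oct 08, 2021
Operating system = Linux 5.10.0-35-amd64
Controller = 0
Status = Failure
Description = Controller 0 not found
"""

SUCCESS_SAMPLE = """\
CLI Version = 007.1910.0000.0000 Oct 08, 2021
Operating system = Linux 5.10.0-35-amd64
Controller = 0
Status = Success
Description = None

Product Name = PERC H755 Adapter
"""


def parse_fields(output: str) -> dict[str, str]:
    """Return the 'Key = Value' pairs, keeping the first value seen per key."""
    fields: dict[str, str] = {}
    for line in output.splitlines():
        # Legend lines such as "DG=Disk Group Index|..." have no " = " and are skipped.
        key, sep, value = line.partition(" = ")
        if sep:
            fields.setdefault(key.strip(), value.strip())
    return fields


def controller_visible(output: str) -> bool:
    """True when the report's Status field is 'Success'."""
    return parse_fields(output).get("Status") == "Success"
```

Called on the unprivileged output, `controller_visible` returns `False` and `parse_fields(...)["Description"]` carries the "Controller 0 not found" message, which in this task meant the command was run without working sudo access, not that the hardware was missing.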

What about megacli as well? There are still quite a few older servers that use this.

megacli is already covered in the existing sudo rules.

Because no controller is found, I am unable to create RAIDs using perccli64.

jclark@an-worker1230:~$ perccli64 /c0 add vd each r0 wb ra
CLI Version = 007.1910.0000.0000 Oct 08, 2021
Operating system = Linux 5.10.0-35-amd64
Controller = 0
Status = Failure
Description = Controller 0 not found
jclark@an-worker1230:~$ sudo perccli64 /c0 show

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

    #1) Respect the privacy of others.
    #2) Think before you type.
    #3) With great power comes great responsibility.

[sudo] password for jclark:

@elukey We now need an additional command, smartctl, to pull drive information for Supermicro repairs. Because the servers use software RAID, the drives are not visible in the Supermicro GUI.

Change #1211653 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Allow smartctl for datacenter-ops

https://gerrit.wikimedia.org/r/1211653

Change #1211653 merged by Muehlenhoff:

[operations/puppet@production] Allow smartctl for datacenter-ops

https://gerrit.wikimedia.org/r/1211653

@Jclark-ctr: You should now be able to run smartctl; let me know if you run into any issues.

@MoritzMuehlenhoff It's not working for me; sudo still prompts for a password:

jclark@ganeti1039:~$ sudo smartctl -H /dev/sda
[sudo] password for jclark:
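A password prompt like the one above can also be detected without an interactive terminal. The sketch below is an assumption-laden illustration, not the way the rules are verified in puppet: it runs the command through `sudo -n` (non-interactive mode), where sudo prints "a password is required" instead of prompting. The function names are hypothetical. Note that `probe` really executes the command when it is permitted, so it should only be given read-only invocations, and that it cannot tell a missing rule apart from a rule that requires a password.

```python
import subprocess

# Text sudo writes to stderr in -n mode when it would otherwise prompt.
PASSWORD_REQUIRED = "a password is required"


def classify(returncode: int, stderr: str) -> str:
    """Map a `sudo -n` result to 'password-required', 'allowed' or 'failed'."""
    if PASSWORD_REQUIRED in stderr:
        return "password-required"
    if returncode == 0:
        return "allowed"
    return "failed"


def probe(argv: list[str]) -> str:
    """Run argv via `sudo -n` and classify the outcome.

    Only pass read-only commands: if sudo permits the call, it is executed.
    """
    result = subprocess.run(
        ["sudo", "-n", *argv],
        capture_output=True,
        text=True,
        check=False,
    )
    return classify(result.returncode, result.stderr)
```

For example, `probe(["smartctl", "-H", "/dev/sda"])` would be expected to return `"password-required"` for a session that behaves like the one pasted above.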

@MoritzMuehlenhoff What we may need to do is move all the disk/partition/RAID/etc. commands from datacenter-ops to ops-limited. What do you think?