
(Need By: TBD) rack/setup/install cloudceph200[123]-dev
Closed, ResolvedPublic

Description

This task will track the racking, setup, and OS installation of cloudceph2001-dev, cloudceph2002-dev, and cloudceph2003-dev.

Hostname / Racking / Installation Details

Hostnames: cloudceph2001-dev, cloudceph2002-dev, cloudceph2003-dev
Racking Proposal: These should be in row B, each in separate racks if possible.
Networking/Subnet/VLAN/IP: 2 x 1G ports per server (3 x 2 = 6 ports). Each host should have one 1G Ethernet connection to the public subnet (wikimedia.org) and one to the private, internal subnet (codfw.wmnet).
Partitioning/RAID: The 2 OS disks should be in a software RAID 1; the 2 data disks are JBOD for use as Ceph OSDs.
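
The task title uses the bracket shorthand cloudceph200[123]-dev, and later patch titles use cloudceph200[1-3]-dev, for the three hostnames listed above. As a minimal sketch of how that shorthand expands (the helper name `expand_hosts` is hypothetical and not part of any tooling referenced in this task):

```python
import re
from itertools import product


def expand_hosts(pattern):
    """Expand bracket sets such as [123] or [1-3] into a list of hostnames."""
    # re.split with a capturing group keeps the bracket contents at the odd
    # indices of the result and the literal text at the even indices.
    parts = re.split(r"\[([^\]]+)\]", pattern)
    choices = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            choices.append([part])  # literal segment, single choice
            continue
        chars = []
        # Accept both ranges (1-3) and plain character sets (123).
        for m in re.finditer(r"(\w)-(\w)|(\w)", part):
            if m.group(3):
                chars.append(m.group(3))
            else:
                start, end = ord(m.group(1)), ord(m.group(2))
                chars.extend(chr(c) for c in range(start, end + 1))
        choices.append(chars)
    return ["".join(combo) for combo in product(*choices)]


print(expand_hosts("cloudceph200[123]-dev"))
```

Both spellings of the shorthand yield the same three hosts with this helper.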

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

cloudceph2001-dev: rack B1U3, nic1 = ge-1/0/4, nic2 = ge-1/0/5

  • - receive in system on procurement task T242136
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

cloudceph2002-dev: rack B5U9, nic1 = ge-5/0/8, nic2 = ge-5/0/10

  • - receive in system on procurement task T242136
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

cloudceph2003-dev: rack B8U7, nic1 = ge-8/0/10, nic2 = ge-8/0/11

  • - receive in system on procurement task T242136
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.

Related Objects

Status: Resolved
Assigned: Papaul

Event Timeline

RobH created this task. Apr 21 2020, 6:29 PM
Restricted Application added a project: Operations. Apr 21 2020, 6:29 PM
Restricted Application added a subscriber: Aklapper.
RobH added a parent task: Unknown Object (Task). Apr 21 2020, 6:29 PM
Papaul moved this task from Backlog to Racking Tasks on the ops-codfw board. Apr 27 2020, 2:58 PM
Papaul updated the task description. Apr 30 2020, 4:04 PM
Papaul added a comment. May 1 2020, 8:29 PM

The task says "The 2 OS disks should be a software RAID 1." Can anyone please specify which software RAID 1 configuration to use?

Thanks

RobH removed a subscriber: RobH. May 4 2020, 2:38 PM
Papaul updated the task description. May 4 2020, 4:34 PM
Papaul updated the task description. May 5 2020, 8:37 PM
Papaul added a project: Cloud-Services.

You can use this partman config: echo partman/standard.cfg partman/raid1-2dev.cfg

This is the same as we have in production. 2 x OS drives using software RAID1, the remaining disks are JBOD with no RAID protection.
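
The "echo" in the suggested config line reads like one arm of a hostname-to-recipe lookup. As a rough sketch of that kind of lookup, assuming glob-style hostname patterns (the table `RECIPES` and the function `partman_recipe` are hypothetical illustrations, not the actual install_server code; only the recipe string itself comes from this task):

```python
from fnmatch import fnmatchcase

# Hypothetical pattern-to-recipe table. Only this one entry reflects the
# recipe named in the comment above; a real table would hold many more.
RECIPES = [
    ("cloudceph200[1-3]-dev", "partman/standard.cfg partman/raid1-2dev.cfg"),
]


def partman_recipe(hostname):
    """Return the recipe string for the first matching pattern, else None."""
    for pattern, recipe in RECIPES:
        if fnmatchcase(hostname, pattern):
            return recipe
    return None


for host in ("cloudceph2001-dev", "cloudceph2002-dev", "cloudceph2003-dev"):
    print(host, "->", partman_recipe(host))
```

A host that matches no pattern gets `None` here, which makes a missing entry easy to spot before an install is attempted.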

Change 595212 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Add mgmt and public DNS entries for cloudceph200[1-3]-dev

https://gerrit.wikimedia.org/r/595212

These servers should mimic the network configuration we have in production:

eth0 (ens2f0np0) on public1-b-codfw network
eth1 (ens2f1np1) on private1-b-codfw network

Papaul added a comment. Edited May 11 2020, 3:40 PM
[edit interfaces interface-range vlan-public1-b-codfw]
     member ge-1/0/13 { ... }
+    member ge-1/0/4;
[edit interfaces interface-range vlan-private1-b-codfw]
     member xe-2/0/3 { ... }
+    member ge-1/0/5;
[edit interfaces interface-range disabled]
-    member ge-1/0/4;
-    member ge-1/0/5;
[edit interfaces]
+   ge-1/0/4 {
+       description cloudceph2001-dev:eth0;
+   }
+   ge-1/0/5 {
+       description cloudceph2001-dev:eth1;
+   }
[edit interfaces interface-range vlan-public1-b-codfw]
     member ge-1/0/4 { ... }
+    member ge-5/0/8;
[edit interfaces interface-range vlan-private1-b-codfw]
     member ge-1/0/5 { ... }
+    member ge-5/0/10;
[edit interfaces interface-range disabled]
-    member ge-5/0/10;
-    member ge-5/0/8;
[edit interfaces]
+   ge-5/0/8 {
+       description cloudceph2002-dev:eth0;
+   }
+   ge-5/0/10 {
+       description cloudceph2002-dev:eth1;
+   }
[edit interfaces interface-range vlan-public1-b-codfw]
     member ge-5/0/8 { ... }
+    member ge-8/0/10;
[edit interfaces interface-range vlan-private1-b-codfw]
     member ge-5/0/10 { ... }
+    member ge-8/0/11;
[edit interfaces interface-range disabled]
-    member ge-8/0/10;
-    member ge-8/0/11;
[edit interfaces]
+   ge-8/0/10 {
+       description cloudceph2003-dev:eth0;
+   }
+   ge-8/0/11 {
+       description cloudceph2003-dev:eth1;
+   }
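
The three diffs above follow one pattern per host: remove both ports from the disabled range, add the first port to the public VLAN range and the second to the private one, and set a description of the form host:ethN. A minimal sketch that derives equivalent Junos configuration-mode commands from the port assignments in the task description (the `set`/`delete` lines are a rendering of the diffs, not commands copied from this task, and `junos_commands` is a hypothetical helper; review any output before applying it):

```python
# Port assignments as listed in the per-host checklist headings.
PORTS = {
    "cloudceph2001-dev": ("ge-1/0/4", "ge-1/0/5"),
    "cloudceph2002-dev": ("ge-5/0/8", "ge-5/0/10"),
    "cloudceph2003-dev": ("ge-8/0/10", "ge-8/0/11"),
}

# First NIC goes to the public VLAN range, second to the private one.
VLAN_RANGES = ("vlan-public1-b-codfw", "vlan-private1-b-codfw")


def junos_commands(host, ports):
    """Build the command list for one host's two switch ports."""
    cmds = []
    for idx, (port, vlan_range) in enumerate(zip(ports, VLAN_RANGES)):
        cmds.append(f"delete interfaces interface-range disabled member {port}")
        cmds.append(f"set interfaces interface-range {vlan_range} member {port}")
        cmds.append(f'set interfaces {port} description "{host}:eth{idx}"')
    return cmds


for host, ports in PORTS.items():
    print("\n".join(junos_commands(host, ports)))
```

Keeping the port table in one place makes it easier to check the switch changes against the racking details recorded for each host.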
Papaul updated the task description. May 11 2020, 3:40 PM
Papaul updated the task description. May 11 2020, 4:01 PM

Change 595212 merged by Papaul:
[operations/dns@master] DNS: Add mgmt and public DNS entries for cloudceph200[1-3]-dev

https://gerrit.wikimedia.org/r/595212

Papaul updated the task description. May 11 2020, 4:02 PM

Change 595652 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] Add cloudceph200[1-3]-dev MAC address, partman with role insetup

https://gerrit.wikimedia.org/r/595652

Change 595652 merged by Papaul:
[operations/puppet@production] Add cloudceph200[1-3]-dev MAC address with role insetup

https://gerrit.wikimedia.org/r/595652

Change 595714 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] Partman: Add cloudceph200[1-3]-dev

https://gerrit.wikimedia.org/r/595714

Change 595714 merged by Papaul:
[operations/puppet@production] Partman: Add cloudceph200[1-3]-dev

https://gerrit.wikimedia.org/r/595714

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cloudceph2001-dev.wikimedia.org

The log can be found in /var/log/wmf-auto-reimage/202005120017_pt1979_947_cloudceph2001-dev_wikimedia_org.log.

Completed auto-reimage of hosts:

['cloudceph2001-dev.wikimedia.org']

Of which those FAILED:

['cloudceph2001-dev.wikimedia.org']
Papaul updated the task description. May 12 2020, 12:41 AM

Change 595738 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] Change doamin from codfw.wmnet to wikimedia.org for cloudceph200[1-3]-dev

https://gerrit.wikimedia.org/r/595738

Change 595738 merged by Papaul:
[operations/puppet@production] Change doamin from codfw.wmnet to wikimedia.org for cloudceph200[1-3]-dev

https://gerrit.wikimedia.org/r/595738

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cloudceph2001-dev.wikimedia.org

The log can be found in /var/log/wmf-auto-reimage/202005120056_pt1979_4885_cloudceph2001-dev_wikimedia_org.log.

Completed auto-reimage of hosts:

['cloudceph2001-dev.wikimedia.org']

and were ALL successful.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cloudceph2002-dev.wikimedia.org

The log can be found in /var/log/wmf-auto-reimage/202005120125_pt1979_9517_cloudceph2002-dev_wikimedia_org.log.

Completed auto-reimage of hosts:

['cloudceph2002-dev.wikimedia.org']

and were ALL successful.

Papaul updated the task description. May 12 2020, 1:53 AM

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cloudceph2003-dev.wikimedia.org

The log can be found in /var/log/wmf-auto-reimage/202005120154_pt1979_13738_cloudceph2003-dev_wikimedia_org.log.

Completed auto-reimage of hosts:

['cloudceph2003-dev.wikimedia.org']

and were ALL successful.
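
The reimage log names above appear to encode a start timestamp, the launching user, a numeric run identifier, and the target host with dots replaced by underscores. That reading is inferred only from the four file names in this task and is not documented behaviour of wmf-auto-reimage. Under that assumption, a small sketch for pulling the fields apart (`parse_log_name` is a hypothetical helper):

```python
import re
from datetime import datetime

# Assumed layout: <YYYYMMDDHHMM>_<user>_<number>_<host with dots as "_">.log
LOG_RE = re.compile(
    r"(?P<stamp>\d{12})_(?P<user>[^_]+)_(?P<run_id>\d+)_(?P<host>.+)\.log$"
)


def parse_log_name(path):
    """Split a reimage log path into its assumed components."""
    m = LOG_RE.search(path)
    if m is None:
        raise ValueError(f"unrecognised log name: {path}")
    return {
        "started": datetime.strptime(m.group("stamp"), "%Y%m%d%H%M"),
        "user": m.group("user"),
        "run_id": int(m.group("run_id")),
        # Hostnames here contain hyphens but no underscores, so turning
        # underscores back into dots recovers the FQDN.
        "host": m.group("host").replace("_", "."),
    }


info = parse_log_name(
    "/var/log/wmf-auto-reimage/"
    "202005120017_pt1979_947_cloudceph2001-dev_wikimedia_org.log"
)
print(info["started"], info["user"], info["host"])
```

Sorting parsed entries by `started` gives the order of the attempts, which helps when a host such as cloudceph2001-dev was reimaged more than once.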

Papaul updated the task description. May 12 2020, 2:26 AM
Papaul closed this task as Resolved. May 12 2020, 2:29 AM

@JHedden this is complete. You just need to add the DNS for the private interface.

Thanks.