
codfw: kubernetes200[1-4] racking and onsite setup task
Closed, Resolved (Public)

Description

This task outlines where the new kubernetes systems (ordered on T161723) should be racked and set up.

Racking location
kubernetes2001 Row A Rack A5
kubernetes2002 Row A Rack B5
kubernetes2003 Row A Rack C5
kubernetes2004 Row A Rack D5
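
For anyone scripting against this task, here is a minimal sketch of how the `kubernetes200[1-4]` shorthand from the title maps onto the racking list above. The helper name `expand_host_range` and the `LOCATIONS` list are hypothetical and only for illustration; the row and rack values are copied verbatim from this task, not read from racktables.

```python
import re


def expand_host_range(pattern: str) -> list[str]:
    """Expand a shorthand such as 'kubernetes200[1-4]' into individual hostnames."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        # No bracketed range present: treat the input as a single hostname.
        return [pattern]
    prefix, start, end, suffix = match.groups()
    width = len(start)  # keep any zero padding used in the range
    return [
        f"{prefix}{n:0{width}d}{suffix}"
        for n in range(int(start), int(end) + 1)
    ]


# Locations as listed in this task, in host order (assumed one host per rack).
LOCATIONS = ["Row A Rack A5", "Row A Rack B5", "Row A Rack C5", "Row A Rack D5"]

hosts = expand_host_range("kubernetes200[1-4]")
racking = dict(zip(hosts, LOCATIONS))

for host, location in racking.items():
    print(f"{host} {location}")
```

The same expansion can feed the later checklist steps (DNS entries, DHCP/partman entries) so that the host list is typed only once.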

  • receive in and attach packing slip to parent task T161723
  • rack systems, update racktables
  • create mgmt dns entries (both asset tag and hostname)
  • create production dns entries (internal vlan)
  • update/create sub task with network port info for all new hosts
  • install_server module update (mac address and partitioning info, partition = docker-host.cfg)
  • install os
  • puppet/salt accept
  • hand off to @akosiaris for service implementation.

@akosiaris please review and see if there is anything I am missing. Thanks.

Event Timeline

Papaul triaged this task as Medium priority. May 9 2017, 5:27 PM
Papaul updated the task description.

Change 352869 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Add mgmt and production DNS entries for kubernetes200[1-4]

https://gerrit.wikimedia.org/r/352869

Change 352869 merged by Dzahn:
[operations/dns@master] DNS: Add mgmt and production DNS entries for kubernetes200[1-4]

https://gerrit.wikimedia.org/r/352869

Change 353098 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] DHCP/partman: Add dhcp and partman entries for kubernetes200[1-4]

https://gerrit.wikimedia.org/r/353098

Change 353098 merged by Dzahn:
[operations/puppet@production] DHCP/partman: Add dhcp and partman entries for kubernetes200[1-4]

https://gerrit.wikimedia.org/r/353098

@akosiaris I am getting this error while trying to install the systems:

┌────────────────────┤ [!!] Partition disks ├──────────────────┐
│                                                              │
│              Failed to partition the selected disk           │
│ This happened because the selected recipe does not contain an│
│ partition that can be created on LVM volumes.                │
│                                                              │
│     <Go Back>                                    <Continue>  │
│                                                              │
└──────────────────────────────────────────────────────────────┘
RobH added a parent task: Unknown Object (Task). May 15 2017, 3:05 PM

@Papaul the mistake was in the partman recipe. It is fixed in https://gerrit.wikimedia.org/r/#/c/354209/, and the boxes have since been installed fine and are up and running.

Change 354230 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Assign roles to kubernetes200X hosts

https://gerrit.wikimedia.org/r/354230

Change 354230 merged by Alexandros Kosiaris:
[operations/puppet@production] Assign roles to kubernetes200X hosts

https://gerrit.wikimedia.org/r/354230

The hosts are now fully up and running, so I will resolve this. Thanks @Papaul

akosiaris updated the task description.