rack/setup/install restbase201[3-8].codfw.wmnet
Closed, Resolved (Public)

Description

This task will track the racking, setup, and installation of restbase201[3-8].codfw.wmnet.

Please note this should be treated with high priority, as these are replacing leased hosts restbase200[1-6].codfw.wmnet, which are due back to Farnam in December 2018.
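For scripting against these hosts, the bracket notation used above can be expanded into individual FQDNs. This is a minimal sketch, assuming a single one-digit `[a-b]` range per pattern; the helper name `expand_hosts` is hypothetical and is not part of any existing tooling mentioned in this task.

```python
import re
from typing import List


def expand_hosts(pattern: str) -> List[str]:
    """Expand one single-digit [a-b] range in a hostname pattern."""
    match = re.search(r"\[(\d)-(\d)\]", pattern)
    if match is None:
        # No range present: the pattern is already a plain hostname.
        return [pattern]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [f"{prefix}{digit}{suffix}" for digit in range(start, end + 1)]


# The two ranges named in the description.
new_hosts = expand_hosts("restbase201[3-8].codfw.wmnet")
old_hosts = expand_hosts("restbase200[1-6].codfw.wmnet")

for host in new_hosts:
    print(host)
```

Both ranges expand to six hosts, which matches the one-for-one replacement of the leased systems described above.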

Racking Proposal: Rack in rows B, C, and D, placing 2 of these 6 new systems in each row. Try to avoid sharing 1G racks with more than 2 restbase hosts (not counting restbase2001-2006, as they are being decommissioned for lease return after these come online).

restbase2001 : B5
restbase2002 : B8
restbase2003 : C1
restbase2004 : C5
restbase2005 : D1
restbase2006 : D5
restbase2007 : B1
restbase2008 : C1
restbase2009 : D1
restbase2010 : B8
restbase2011 : C1
restbase2012 : D1
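
The racking constraints above can be checked mechanically. The following is a rough sketch, not existing tooling: the function `check_proposal` and both example placements are hypothetical, and because the task does not say which racks are 1G, the sketch treats every rack as subject to the 2-host limit.

```python
from collections import Counter

# Rack locations from the list above, excluding restbase2001-2006,
# which are being decommissioned and do not count toward the limit.
remaining = {
    "restbase2007": "B1", "restbase2008": "C1", "restbase2009": "D1",
    "restbase2010": "B8", "restbase2011": "C1", "restbase2012": "D1",
}

MAX_PER_RACK = 2   # assumption: applied to every rack, not only 1G racks
NEW_PER_ROW = 2    # 2 of the 6 new systems in each of rows B, C, D


def check_proposal(existing, proposal):
    """Return a list of human-readable constraint violations."""
    problems = []
    # The first character of a rack name is its row (e.g. "B5" -> "B").
    per_row = Counter(rack[0] for rack in proposal.values())
    for row in "BCD":
        if per_row[row] != NEW_PER_ROW:
            problems.append(
                f"row {row}: {per_row[row]} new hosts, expected {NEW_PER_ROW}")
    per_rack = Counter(existing.values()) + Counter(proposal.values())
    for rack, count in sorted(per_rack.items()):
        if count > MAX_PER_RACK:
            problems.append(f"rack {rack}: {count} restbase hosts")
    return problems


# Hypothetical placements, for illustration only.
good = {
    "restbase2013": "B5", "restbase2014": "B8", "restbase2015": "C5",
    "restbase2016": "C7", "restbase2017": "D5", "restbase2018": "D7",
}
bad = dict(good, restbase2015="C1")  # C1 already holds 2008 and 2011

print(check_proposal(remaining, good))
print(check_proposal(remaining, bad))
```

The second placement is flagged because C1 would end up with three restbase hosts.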

restbase2013:

  • - receive in system on procurement task T205092
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private1 vlan for each row)
    • end on-site specific steps
  • - production dns entries added (private1-vlan for each row)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

restbase2014:

  • - receive in system on procurement task T205092
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private1 vlan for each row)
    • end on-site specific steps
  • - production dns entries added (private1-vlan for each row)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

restbase2015:

  • - receive in system on procurement task T205092
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private1 vlan for each row)
    • end on-site specific steps
  • - production dns entries added (private1-vlan for each row)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

restbase2016:

  • - receive in system on procurement task T205092
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private1 vlan for each row)
    • end on-site specific steps
  • - production dns entries added (private1-vlan for each row)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

restbase2017:

  • - receive in system on procurement task T205092
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private1 vlan for each row)
    • end on-site specific steps
  • - production dns entries added (private1-vlan for each row)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

restbase2018:

  • - receive in system on procurement task T205092
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private1 vlan for each row)
    • end on-site specific steps
  • - production dns entries added (private1-vlan for each row)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

Event Timeline

RobH triaged this task as High priority. (Nov 15 2018, 6:03 PM)
RobH created this task.

@Papaul,

As discussed in IRC, this is replacing leased hardware and should be treated with the highest priority. Any task for non-lease hardware that can afford to wait should wait until after these are ready for handoff, since these need to be fully in service before we can decommission and remove/ship back restbase200[1-6].codfw.wmnet.

Once these are ready for installation, you can either do the OS install, or hand off to me for completion (as you may have on-site tasks). I leave that to your judgement, but please let me know ASAP when this is complete/ready for me as I need to coordinate with the Services team regarding these systems.

Thanks!

Change 475939 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Add mgmt and production DNS entries for restbase201[3-8]

https://gerrit.wikimedia.org/r/475939

Change 476038 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] DHCP: Add MAC address entries for restbase201[3-8]

https://gerrit.wikimedia.org/r/476038

Change 476040 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] PARTMAN: Add restbase201[3-8]

https://gerrit.wikimedia.org/r/476040

papaul@asw-b-codfw> show interfaces ge-5/0/5 descriptions
Interface       Admin Link Description
ge-5/0/5        up    up   restbase2013

papaul@asw-b-codfw# run show interfaces ge-8/0/18 descriptions
Interface       Admin Link Description
ge-8/0/18       up    down restbase2014

papaul@asw-c-codfw# run show interfaces ge-1/0/18 descriptions
Interface       Admin Link Description
ge-1/0/18       up    up   restbase2015

papaul@asw-c-codfw# run show interfaces ge-5/0/28 descriptions
Interface       Admin Link Description
ge-5/0/28       up    up   restbase2016

papaul@asw-d-codfw# run show interfaces ge-1/0/15 descriptions
Interface       Admin Link Description
ge-1/0/15       up    up   restbase2017

papaul@asw-d-codfw# run show interfaces ge-5/0/25 descriptions
Interface       Admin Link Description
ge-5/0/25       up    up   restbase2018
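
When checking many ports at once, output like the above can be scanned for links that are not up. This is a minimal sketch, assuming the four-column layout shown (interface, admin state, link state, single-word description); the function `parse_descriptions` is hypothetical, and the sample text is the pasted switch output with the prompts removed.

```python
SAMPLE = """\
Interface Admin Link Description
ge-5/0/5 up up restbase2013
Interface Admin Link Description
ge-8/0/18 up down restbase2014
Interface Admin Link Description
ge-1/0/18 up up restbase2015
Interface Admin Link Description
ge-5/0/28 up up restbase2016
Interface Admin Link Description
ge-1/0/15 up up restbase2017
Interface Admin Link Description
ge-5/0/25 up up restbase2018
"""


def parse_descriptions(output):
    """Parse 'show interfaces ... descriptions' rows into dicts."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip header lines, prompts, and anything not shaped like a port row.
        if len(fields) == 4 and fields[0].startswith("ge-"):
            interface, admin, link, description = fields
            rows.append({
                "interface": interface,
                "admin": admin,
                "link": link,
                "description": description,
            })
    return rows


ports = parse_descriptions(SAMPLE)
down = [p["description"] for p in ports if p["link"] != "up"]
print(down)
```

In the pasted output only the restbase2014 port (ge-8/0/18) reports its link as down.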

Change 475939 merged by Dzahn:
[operations/dns@master] DNS: Add mgmt and production DNS entries for restbase201[3-8]

https://gerrit.wikimedia.org/r/475939

Change 476040 merged by Dzahn:
[operations/puppet@production] PARTMAN: Add restbase201[3-8]

https://gerrit.wikimedia.org/r/476040

Change 476038 merged by Dzahn:
[operations/puppet@production] DHCP: Add MAC address entries for restbase201[3-8]

https://gerrit.wikimedia.org/r/476038

Change 476819 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/dns@master] Add restbase201[3-8] cassandra instances

https://gerrit.wikimedia.org/r/476819

Change 476819 merged by Filippo Giunchedi:
[operations/dns@master] Add restbase201[3-8] cassandra instances

https://gerrit.wikimedia.org/r/476819

Change 476825 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] restbase: add restbase2013

https://gerrit.wikimedia.org/r/476825

Change 476825 merged by Filippo Giunchedi:
[operations/puppet@production] restbase: add restbase2013

https://gerrit.wikimedia.org/r/476825

Change 476827 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: add rack for restbase2013

https://gerrit.wikimedia.org/r/476827

Change 476827 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: add rack for restbase2013

https://gerrit.wikimedia.org/r/476827

Change 476849 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: add restbase20[4-8] racks

https://gerrit.wikimedia.org/r/476849

Change 476849 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: add restbase20[4-8] racks

https://gerrit.wikimedia.org/r/476849

fgiunchedi added a subscriber: fgiunchedi.

I'll be preparing these hosts for cassandra to be bootstrapped there

Change 476868 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] cassandra: create hints directory

https://gerrit.wikimedia.org/r/476868

Change 476868 merged by Filippo Giunchedi:
[operations/puppet@production] cassandra: create hints directory

https://gerrit.wikimedia.org/r/476868

Change 476888 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: add cassandra jbod device for new restbase hosts

https://gerrit.wikimedia.org/r/476888

Change 476888 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: add cassandra jbod device for new restbase hosts

https://gerrit.wikimedia.org/r/476888

Change 476892 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: add sde4 to restbase2013

https://gerrit.wikimedia.org/r/476892

Change 476892 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: add sde4 to restbase2013

https://gerrit.wikimedia.org/r/476892

Change 476895 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: fix sde data directory path

https://gerrit.wikimedia.org/r/476895

Change 476895 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: fix sde data directory path

https://gerrit.wikimedia.org/r/476895

Mentioned in SAL (#wikimedia-operations) [2018-12-03T08:44:16Z] <godog> bootstrap cassandra-a on restbase2013 - T209615

Change 477216 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: add instances for restbase201[4-8]

https://gerrit.wikimedia.org/r/477216

Mentioned in SAL (#wikimedia-operations) [2018-12-03T12:23:23Z] <godog> bootstrap cassandra-b on restbase2013 - T209615

Change 477288 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] site: add new restbase codfw hardware

https://gerrit.wikimedia.org/r/477288

Change 477288 merged by Filippo Giunchedi:
[operations/puppet@production] site: add new restbase codfw hardware

https://gerrit.wikimedia.org/r/477288

Change 477216 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: add instances for restbase201[4-8]

https://gerrit.wikimedia.org/r/477216

All hosts had their first puppet run done, and restbase2013 is bootstrapping cassandra instances. On the remaining hosts I had to chmod a-x /usr/sbin/cassandra due to T211027 (puppet (systemd::service) attempts to start manually masked units). We'll need to restore the executable bit one host at a time when bootstrapping time comes.
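
The workaround amounts to clearing the execute bits on the binary and putting the original mode back later. As an illustration only, here is a sketch of that clear-and-restore pattern using Python's standard library on a temporary file; the helper names are hypothetical, the 0755 starting mode is an assumption, and on the real hosts this was done by hand with chmod.

```python
import os
import stat
import tempfile

# Equivalent of "a-x": execute bits for user, group, and other.
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def clear_exec(path):
    """Remove all execute bits and return the previous permission mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~EXEC_BITS)
    return mode


def restore_mode(path, mode):
    """Put back a mode previously returned by clear_exec()."""
    os.chmod(path, mode)


# Demonstrate on a throwaway file rather than a real binary.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
os.chmod(demo_path, 0o755)  # assumed starting mode for the demo

saved = clear_exec(demo_path)
mode_after_clear = stat.S_IMODE(os.stat(demo_path).st_mode)

restore_mode(demo_path, saved)
mode_after_restore = stat.S_IMODE(os.stat(demo_path).st_mode)

os.unlink(demo_path)
print(oct(mode_after_clear), oct(mode_after_restore))
```

Saving the previous mode, rather than assuming one at restore time, avoids guessing what the package originally installed.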

Mentioned in SAL (#wikimedia-operations) [2018-12-03T18:06:19Z] <godog> bootstrap cassandra-c on restbase2013 - T209615

@fgiunchedi: Can you advise if these are fully online, and if so, can we start to proceed on the decommission-hardware of the older restbase systems via T211070?

Not yet fully online; we'll assign T211070 to you once the time for decommission comes. I'm expecting mid next week as an ETA for the onsite decom part to be able to start, modulo complications.

Mentioned in SAL (#wikimedia-operations) [2018-12-04T07:49:21Z] <godog> bootstrap cassandra-c on restbase2014 - T209615