
(Need by: TBD) rack/setup/install LVS200[7-10]
Closed, Resolved (Public)

Description

This task will track the receiving, racking, setup, and installation of LVS200[7-10].

Racking proposal

servers  Rack  NIC1    NIC2    NIC3    NIC4
LVS2007  A2    asw-a2  asw-b7  asw-c7  asw-d7
LVS2008  B2    asw-b2  asw-a7  asw-c2  asw-d2
LVS2009  C2    asw-c2  asw-a2  asw-b2  asw-d2
LVS2010  D2    asw-d2  asw-a2  asw-b2  asw-c2
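
A quick sanity check of the proposal can be sketched in Python. This is only an illustration: the two rules it checks (each host reaches one switch in each of rows a-d, and NIC1 goes to the switch in the host's own rack) are patterns read off the table, not requirements stated in this task, and `switch_row` and `check_plan` are hypothetical helper names.

```python
# Racking proposal transcribed from the racking table (hostnames lowercased).
RACKING = {
    "lvs2007": ("A2", ["asw-a2", "asw-b7", "asw-c7", "asw-d7"]),
    "lvs2008": ("B2", ["asw-b2", "asw-a7", "asw-c2", "asw-d2"]),
    "lvs2009": ("C2", ["asw-c2", "asw-a2", "asw-b2", "asw-d2"]),
    "lvs2010": ("D2", ["asw-d2", "asw-a2", "asw-b2", "asw-c2"]),
}


def switch_row(switch: str) -> str:
    """Return the row letter of a switch name such as 'asw-b7' (-> 'b')."""
    return switch.split("-", 1)[1][0]


def check_plan(plan: dict) -> list:
    """Return a list of human-readable problems; empty if the plan passes."""
    problems = []
    for host, (rack, switches) in plan.items():
        rows = sorted(switch_row(s) for s in switches)
        # Assumed rule 1: one NIC per row, covering rows a, b, c and d.
        if rows != ["a", "b", "c", "d"]:
            problems.append(f"{host}: rows covered are {rows}")
        # Assumed rule 2: NIC1 is cabled to the switch in the host's own rack.
        if switches[0] != f"asw-{rack.lower()}":
            problems.append(f"{host}: NIC1 is not on the switch in rack {rack}")
    return problems


print(check_plan(RACKING))  # prints [] for the proposal as written
```

If a row were cabled twice by mistake, `check_plan` would report the host and the rows it actually covers.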

@BBlack since lvs2007 will be racked in the same rack as lvs200[1-3], and lvs2008 in the same rack as lvs200[4-6], I can reuse the cables/fibers from one of the old LVS hosts if we decommission one of them. For lvs2009 and lvs2010, I can pull new cables/fibers. Let me know what you think.

LVS2007:

  • receive in system on procurement task T193820
  • rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • bios/drac/serial setup/testing
  • mgmt dns entries added for both asset tag and hostname
  • network port setup (description, enable, vlan) - end on-site specific steps
  • production dns entries added
  • operations/puppet update (install_server at minimum, other files if possible)
  • OS installation
  • puppet accept/initial run - set to staged in netbox
  • handoff for service implementation - set to active when performing its service

LVS2008:

  • receive in system on procurement task T193820
  • rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • bios/drac/serial setup/testing
  • mgmt dns entries added for both asset tag and hostname
  • network port setup (description, enable, vlan) - end on-site specific steps
  • production dns entries added
  • operations/puppet update (install_server at minimum, other files if possible)
  • OS installation
  • puppet accept/initial run - set to staged in netbox
  • handoff for service implementation - set to active when performing its service

LVS2009:

  • receive in system on procurement task T193820
  • rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • bios/drac/serial setup/testing
  • mgmt dns entries added for both asset tag and hostname
  • network port setup (description, enable, vlan) - end on-site specific steps
  • production dns entries added
  • operations/puppet update (install_server at minimum, other files if possible)
  • OS installation
  • puppet accept/initial run
  • handoff for service implementation

LVS2010:

  • receive in system on procurement task T193820
  • rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • bios/drac/serial setup/testing
  • mgmt dns entries added for both asset tag and hostname
  • network port setup (description, enable, vlan) - end on-site specific steps
  • production dns entries added
  • operations/puppet update (install_server at minimum, other files if possible)
  • OS installation
  • puppet accept/initial run
  • handoff for service implementation

Wiring progress

server   NIC1      NIC2      NIC3      NIC4
lvs2007  complete  complete  complete  complete
lvs2008  complete  complete  complete  complete
lvs2009  complete  complete  complete  complete
lvs2010  complete  complete  complete  complete

Details

Related Gerrit Patches:
[operations/puppet@production] lvs: Test BGP in lvs2007
[operations/puppet@production] lvs: Add lvs2007 as a high-traffic1 load balancer
[operations/puppet@production] lvs: Test BGP in lvs2008
[operations/puppet@production] lvs: Add missing mapping of lvs2008 as high-traffic2
[operations/puppet@production] lvs: Add lvs2008 as a high-traffic2 load balancer
[operations/puppet@production] DHCP: Add MAC address for lvs200[7-8]
[operations/puppet@production] lvs: Replace lvs2003 with lvs2009
[operations/puppet@production] lvs: Replace lvs2006 with lvs2010
[operations/puppet@production] lvs: Enable BGP in lvs2009
[operations/puppet@production] lvs: Set up lvs2009 as a low-traffic LVS
[operations/puppet@production] lvs: Increase BGP MED on lvs2010 to 102
[operations/puppet@production] lvs: Set BGP peers for lvs2010
[operations/puppet@production] lvs: Set lvs2010 as a secondary LVS
[operations/puppet@production] lvs: Set txqlen for the proper ifaces on lvs2010
[operations/puppet@production] lvs: Rename ifaces for lvs2010
[operations/puppet@production] install_server: Reimage lvs2010 as buster
[operations/puppet@production] Partman: Add lvs2007 and lvs2008 to netboot.cfg
[operations/dns@master] lvs2007-lvs2010 production DNS entries, all vlans
[operations/puppet@production] DHCP: Add MAC address and netboot entries for lvs2009 and lvs2010
[operations/dns@master] DNS: Add mgmt & production DNS entries for lvs200[7-10]

Event Timeline

There are a very large number of changes, so older changes are hidden.

@Papaul you can proceed at will with lvs2009 and lvs2010 because they are not handling production traffic at the moment

@Vgutierrez firmware upgrade complete on both servers

Change 574659 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] install_server: Reimage lvs2010 as buster

https://gerrit.wikimedia.org/r/574659

Change 574659 merged by Vgutierrez:
[operations/puppet@production] install_server: Reimage lvs2010 as buster

https://gerrit.wikimedia.org/r/574659

Change 574714 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Rename ifaces for lvs2010

https://gerrit.wikimedia.org/r/574714

Change 574714 merged by Vgutierrez:
[operations/puppet@production] lvs: Rename ifaces for lvs2010

https://gerrit.wikimedia.org/r/574714

Change 574715 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Set txqlen for the proper ifaces on lvs2010

https://gerrit.wikimedia.org/r/574715

Change 574715 merged by Vgutierrez:
[operations/puppet@production] lvs: Set txqlen for the proper ifaces on lvs2010

https://gerrit.wikimedia.org/r/574715

Change 574737 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Set lvs2010 as a secondary LVS

https://gerrit.wikimedia.org/r/574737

Change 574737 merged by Vgutierrez:
[operations/puppet@production] lvs: Set lvs2010 as a secondary LVS

https://gerrit.wikimedia.org/r/574737

Change 574753 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Set BGP peers for lvs2010

https://gerrit.wikimedia.org/r/574753

Change 574753 merged by Vgutierrez:
[operations/puppet@production] lvs: Set BGP peers for lvs2010

https://gerrit.wikimedia.org/r/574753

Mentioned in SAL (#wikimedia-operations) [2020-02-25T14:30:02Z] <vgutierrez> restart pybal with BGP enabled on lvs2010 - T245984 T196560

Change 574771 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Increase BGP MED on lvs2010 to 102

https://gerrit.wikimedia.org/r/574771

Change 574772 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Set up lvs2009 as a low-traffic LVS

https://gerrit.wikimedia.org/r/574772

Change 574771 merged by Vgutierrez:
[operations/puppet@production] lvs: Increase BGP MED on lvs2010 to 102

https://gerrit.wikimedia.org/r/574771
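
For context on the MED change: when BGP routes are otherwise equal, the route with the lower MED is preferred, so a higher MED suits a secondary announcer. The sketch below is a simplified illustration, not the actual pybal or router configuration. Only the value 102 for lvs2010 comes from this task; the `primary-lvs` host and its MED of 100 are hypothetical, and real BGP compares MED only after local preference, AS path length, and origin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    host: str
    med: int


def preferred(announcements):
    """Pick the announcement with the lowest MED.

    Ties are broken by host name only to keep this sketch deterministic;
    that is not how a router breaks ties.
    """
    return min(announcements, key=lambda a: (a.med, a.host))


routes = [
    Announcement("lvs2010", 102),      # MED from the patch in this task
    Announcement("primary-lvs", 100),  # hypothetical primary and MED value
]

print(preferred(routes).host)       # primary-lvs
print(preferred(routes[:1]).host)   # lvs2010, once the primary withdraws
```

With the primary's announcement gone, the secondary becomes the only candidate, which matches the behaviour of stopping pybal on one host so another takes the traffic.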

Change 574772 merged by Vgutierrez:
[operations/puppet@production] lvs: Set up lvs2009 as a low-traffic LVS

https://gerrit.wikimedia.org/r/574772

Change 574783 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Enable BGP in lvs2009

https://gerrit.wikimedia.org/r/574783

RobH removed a subscriber: RobH. Feb 25 2020, 4:11 PM

Change 574783 merged by Vgutierrez:
[operations/puppet@production] lvs: Enable BGP in lvs2009

https://gerrit.wikimedia.org/r/574783

Mentioned in SAL (#wikimedia-operations) [2020-02-25T16:25:06Z] <vgutierrez> enable BGP in lvs2009 - T196560 T245984

@BBlack @Papaul lvs2009 and lvs2010 are now online as secondary load balancers and ready to take over lvs2003 and lvs2006 respectively.

This should be enough to unblock this task and be able to continue with lvs2007 and lvs2008

@Vgutierrez thanks for the update. I will wait until @BBlack is done decommissioning lvs2003 and lvs2006 before proceeding.

wiki_willy renamed this task from rack/setup/install LVS200[7-10] to (Need by: TBD) rack/setup/install LVS200[7-10]. Feb 26 2020, 1:42 AM

@Papaul I think I can handle that as well. I'll let you know; the first one will be lvs2006.

Change 575203 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Replace lvs2006 with lvs2010

https://gerrit.wikimedia.org/r/575203

Change 575203 merged by Vgutierrez:
[operations/puppet@production] lvs: Replace lvs2006 with lvs2010

https://gerrit.wikimedia.org/r/575203

Mentioned in SAL (#wikimedia-operations) [2020-02-27T10:54:30Z] <vgutierrez> replacing lvs2006 with lvs2010 - T196560 T245984

Mentioned in SAL (#wikimedia-operations) [2020-02-27T10:58:45Z] <vgutierrez> stop pybal on lvs2003 to let lvs2010 take the traffic for a little bit - T196560 T245984

Mentioned in SAL (#wikimedia-operations) [2020-02-27T11:02:56Z] <vgutierrez> start pybal on lvs2003 - T196560 T245984

@Papaul lvs2006 is all yours, I've filed T246329

Change 575224 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Replace lvs2003 with lvs2009

https://gerrit.wikimedia.org/r/575224

Change 575224 merged by Vgutierrez:
[operations/puppet@production] lvs: Replace lvs2003 with lvs2009

https://gerrit.wikimedia.org/r/575224

Mentioned in SAL (#wikimedia-operations) [2020-02-27T12:14:48Z] <vgutierrez> replace lvs2003 with lvs2009 - T196560 T245984 T246334

@Papaul same for lvs2003: T246334

Regarding lvs2007 and lvs2008, please update the NICs FW to the same versions as you did for lvs2009 and lvs2010. Thanks!

Papaul updated the task description. Feb 27 2020, 5:52 PM
Papaul claimed this task. Feb 28 2020, 3:52 AM

Firmware upgrade on lvs2008
Before
BIOS Version 1.3.7
iDRAC Firmware Version 3.15.17.15

After
BIOS Version 2.4.8
iDRAC Firmware Version 4.00.00

Change 575565 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] DHCP: Add MAC address for lvs200[7-8]

https://gerrit.wikimedia.org/r/575565

Papaul updated the task description. Feb 28 2020, 4:42 PM

Change 575565 merged by Papaul:
[operations/puppet@production] DHCP: Add MAC address for lvs200[7-8]

https://gerrit.wikimedia.org/r/575565

@Vgutierrez lvs2008 is ready for service; I will work on lvs2007 on Monday.

Change 575826 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Add lvs2008 as a high-traffic2 load balancer

https://gerrit.wikimedia.org/r/575826

Change 575826 merged by Vgutierrez:
[operations/puppet@production] lvs: Add lvs2008 as a high-traffic2 load balancer

https://gerrit.wikimedia.org/r/575826

Script wmf-auto-reimage was launched by vgutierrez on cumin1001.eqiad.wmnet for hosts:

lvs2008.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003020649_vgutierrez_222100_lvs2008_codfw_wmnet.log.

Completed auto-reimage of hosts:

['lvs2008.codfw.wmnet']

and were ALL successful.

Mentioned in SAL (#wikimedia-operations) [2020-03-02T07:22:57Z] <vgutierrez> upgrading NICs FW on lvs2008 - T196560 T203194

@Papaul I had to upgrade the NIC FW on lvs2008

before
vgutierrez@lvs2008:~$ sudo -i ethtool -i ens2f0np0
driver: bnxt_en
version: 1.9.2
firmware-version: 20.6.151.0/pkg 20.06.05.11
after
vgutierrez@lvs2008:~$ sudo -i ethtool -i ens2f0np0
driver: bnxt_en
version: 1.9.2
firmware-version: 214.0.166.0/pkg 21.40.16.60

besides that, everything looks good on lvs2008.
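
The before/after comparison can be automated with a small parser for `ethtool -i` output. This is a minimal sketch: `parse_ethtool_info` and `firmware_tuple` are hypothetical helpers, and comparing the part before `/pkg` as a tuple of integers assumes the vendor's version numbers increase monotonically, which is not confirmed here.

```python
def parse_ethtool_info(text: str) -> dict:
    """Turn 'key: value' lines from `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            info[key.strip()] = value.strip()
    return info


def firmware_tuple(info: dict) -> tuple:
    """'214.0.166.0/pkg 21.40.16.60' -> (214, 0, 166, 0)."""
    version = info["firmware-version"].split("/", 1)[0]
    return tuple(int(part) for part in version.split("."))


# Output copied from the lvs2008 comment.
BEFORE = """driver: bnxt_en
version: 1.9.2
firmware-version: 20.6.151.0/pkg 20.06.05.11"""

AFTER = """driver: bnxt_en
version: 1.9.2
firmware-version: 214.0.166.0/pkg 21.40.16.60"""

before = firmware_tuple(parse_ethtool_info(BEFORE))
after = firmware_tuple(parse_ethtool_info(AFTER))
print(before, after, after > before)
# (20, 6, 151, 0) (214, 0, 166, 0) True
```

The same check could be run against every interface on a host to confirm they all report the same firmware version.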

Change 576043 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Add missing mapping of lvs2008 as high-traffic2

https://gerrit.wikimedia.org/r/576043

Change 576043 merged by Vgutierrez:
[operations/puppet@production] lvs: Add missing mapping of lvs2008 as high-traffic2

https://gerrit.wikimedia.org/r/576043

Papaul updated the task description. Mar 2 2020, 3:31 PM
Papaul updated the task description.
Papaul updated the task description.
Papaul updated the task description. Mar 2 2020, 4:11 PM
Papaul closed this task as Resolved. Mar 2 2020, 6:08 PM
Papaul updated the task description.

@Vgutierrez lvs2007 is ready for service.

Change 576287 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Test BGP in lvs2008

https://gerrit.wikimedia.org/r/576287

Change 576287 merged by Vgutierrez:
[operations/puppet@production] lvs: Test BGP in lvs2008

https://gerrit.wikimedia.org/r/576287

Mentioned in SAL (#wikimedia-traffic) [2020-03-03T10:44:45Z] <vgutierrez> replace lvs2002 with lvs2008 - T196560

Change 576328 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Add lvs2007 as a high-traffic1 load balancer

https://gerrit.wikimedia.org/r/576328

Change 576328 merged by Vgutierrez:
[operations/puppet@production] lvs: Add lvs2007 as a high-traffic1 load balancer

https://gerrit.wikimedia.org/r/576328

Script wmf-auto-reimage was launched by vgutierrez on cumin1001.eqiad.wmnet for hosts:

lvs2007.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003031331_vgutierrez_52584_lvs2007_codfw_wmnet.log.

Completed auto-reimage of hosts:

['lvs2007.codfw.wmnet']

and were ALL successful.

Change 576342 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] lvs: Test BGP in lvs2007

https://gerrit.wikimedia.org/r/576342

Change 576342 merged by Vgutierrez:
[operations/puppet@production] lvs: Test BGP in lvs2007

https://gerrit.wikimedia.org/r/576342

Mentioned in SAL (#wikimedia-operations) [2020-03-03T14:42:28Z] <vgutierrez> replace lvs2001 with lvs2007 - T196560
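
The SAL entries in this timeline follow a regular shape (channel, UTC timestamp, user, message with task IDs), so the replacement sequence can be extracted mechanically. The sketch below infers that shape from the lines in this task; it is not an official SAL format, and `parse_sal` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the SAL lines quoted in this task.
SAL_RE = re.compile(
    r"Mentioned in SAL \((?P<channel>#[\w-]+)\) "
    r"\[(?P<ts>[^\]]+)\] <(?P<user>[^>]+)> (?P<message>.*)"
)


def parse_sal(line: str):
    """Return a dict for a SAL mention line, or None if it does not match."""
    match = SAL_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["ts"] = datetime.strptime(
        entry["ts"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    # Phabricator task IDs look like T196560.
    entry["tasks"] = re.findall(r"\bT\d+\b", entry["message"])
    return entry


line = (
    "Mentioned in SAL (#wikimedia-operations) [2020-03-03T14:42:28Z] "
    "<vgutierrez> replace lvs2001 with lvs2007 - T196560"
)
entry = parse_sal(line)
print(entry["channel"], entry["user"], entry["tasks"])
# #wikimedia-operations vgutierrez ['T196560']
```

Sorting the parsed entries by `ts` gives the order in which the old load balancers were replaced.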