(Need By: 2020-06-12) rack/setup/install WMCS 10G switches
Closed, Resolved, Public

Description

This task will track the racking, setup, and OS installation of two additional 10G switches ordered via T250495 and their cables ordered via T251381.

Racking / Installation Details

Netops will need to provide the racking and implementation details for these switches, including their cable mapping.

Hostnames: TBD, suggestions welcome. cloudsw? csw?
Racking Proposal: One will be racked in C8, the other in D5, at DCops' convenience. As the two racks will be dedicated to WMCS, the current asw ToRs will be decommissioned as soon as all non-WMCS hosts have been vacated.
Networking/Subnet/VLAN/IP:
On day 1:
C8 needs 1x10G SMF link to asw2-b2-eqiad
C8 needs 1x10G SMF link to asw2-b7-eqiad
Later on:
C8 needs 2x40G SMF to D5
C8 needs 1x10G SMF to cr1-eqiad
D5 needs 1x10G SMF to cr2-eqiad
Optics and cabling procurement task to come later.
OS Distro: Junos.
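
As a planning aid for the optics and cabling procurement, the link list above can be written down as data and tallied. This is only a sketch: the `Link` class, the `phase` labels, and the assumption that every SMF link needs one optic at each end are illustrative and not taken from the procurement task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    a_end: str       # rack holding the new switch
    z_end: str       # remote device or rack
    count: int       # number of parallel links
    speed_gbps: int  # per-link speed
    phase: str       # "day1" or "later" (labels are an assumption)


# Links as listed in the racking details above.
PLAN = [
    Link("C8", "asw2-b2-eqiad", 1, 10, "day1"),
    Link("C8", "asw2-b7-eqiad", 1, 10, "day1"),
    Link("C8", "D5", 2, 40, "later"),
    Link("C8", "cr1-eqiad", 1, 10, "later"),
    Link("D5", "cr2-eqiad", 1, 10, "later"),
]


def optics_needed(plan, phase=None):
    """Count optics per speed, assuming one optic at each end of every link."""
    totals = {}
    for link in plan:
        if phase is not None and link.phase != phase:
            continue
        totals[link.speed_gbps] = totals.get(link.speed_gbps, 0) + 2 * link.count
    return totals


if __name__ == "__main__":
    print("day 1:", optics_needed(PLAN, "day1"))
    print("total:", optics_needed(PLAN))
```

Optics already on hand at the far ends would reduce these counts, so the totals are an upper bound under the stated assumption.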

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

C8 switch:

  • - receive in system on procurement task T250495
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - networking setup completed (see racking details for cable mapping)
  • - handoff to netops

D5 switch:

  • - receive in system on procurement task T250495
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - networking setup completed (see racking details for cable mapping)
  • - handoff to netops

Once the system(s) above have had all checkbox steps completed, this task can be resolved.

Event Timeline

RobH added parent tasks: Unknown Object (Task), Unknown Object (Task). May 1 2020, 6:28 PM
RobH removed a subscriber: RobH.

Cabling diagram, let me know if something is missing or unclear:


As there is a risk of looping asw2-b-eqiad please leave all interfaces disabled. But it's fine to connect them.

Edit: diagram updated.

Yep, see diagram (minus the typo).
cloudsw1-c8-eqiad
cloudsw1-d5-eqiad

Jclark-ctr updated the task description.
Jclark-ctr added a subscriber: Jclark-ctr.

Switches racked in C8 and D5 and added to Netbox.

wiki_willy renamed this task from (Need By: TBD) rack/setup/install WMCS 10G switches to (Need By: 2020-06-12) rack/setup/install WMCS 10G switches. Jun 8 2020, 8:23 PM

SFPs arrived today; will have fibers finished today.

All fibers run, console connected. @RobH serial cables need to be run; what pinout should we use? Chris mentioned possibly using the existing serial from the switch in the rack?

@Cmjohnson hosts are cabled and need to be configured.

@ayounsi switches are cabled and powered, waiting on configuration.

Change 607978 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/dns@master] Add mgmt for cloudsw

https://gerrit.wikimedia.org/r/607978

Change 607978 merged by Ayounsi:
[operations/dns@master] Add mgmt for cloudsw

https://gerrit.wikimedia.org/r/607978

@Jclark-ctr or @Cmjohnson - can one of you doublecheck the s/n's in Netbox? The accounting report says they start with "STA" and Netbox says "TA" so we'll need to confirm which one is accurate.

https://netbox.wikimedia.org/extras/reports/accounting.Accounting/

Thanks,
Willy

It's TA from the switches' CLI.

Cool, thanks @ayounsi. I went ahead and fixed it on the accounting spreadsheet. Thanks, Willy
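
A mismatch like the "STA" vs "TA" one above can be spotted mechanically by comparing the two sources and flagging serials that differ only by extra leading characters. A minimal sketch follows; the function name, the dict inputs, and the serial values in the usage example are placeholders, not the real serials or the accounting report's actual logic.

```python
def compare_serials(netbox, accounting):
    """Return (device, netbox_sn, accounting_sn, note) for every mismatch.

    Both arguments map a device name to its recorded serial number.
    Devices missing from the accounting data are skipped.
    """
    mismatches = []
    for device, nb_sn in sorted(netbox.items()):
        acc_sn = accounting.get(device)
        if acc_sn is None or acc_sn == nb_sn:
            continue
        if acc_sn.endswith(nb_sn):
            # Same serial, but the accounting copy carries extra leading characters.
            extra = acc_sn[: len(acc_sn) - len(nb_sn)]
            note = f"accounting has extra leading {extra!r}"
        else:
            note = "serials differ"
        mismatches.append((device, nb_sn, acc_sn, note))
    return mismatches


if __name__ == "__main__":
    # Placeholder serials, for illustration only.
    netbox = {"cloudsw1-c8-eqiad": "TA0001", "cloudsw1-d5-eqiad": "TA0002"}
    accounting = {"cloudsw1-c8-eqiad": "STA0001", "cloudsw1-d5-eqiad": "STA0002"}
    for row in compare_serials(netbox, accounting):
        print(row)
```

The check only reports the difference; which value is correct still has to be confirmed on the device, as was done here from the CLI.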

Mentioned in SAL (#wikimedia-operations) [2020-07-02T12:12:34Z] <XioNoX> pre-configure asw2-b-eqiad<->cloudsw1-c8-eqiad - T251632

@Jclark-ctr or @Cmjohnson some information is missing in Netbox about the cabling:
See https://netbox.wikimedia.org/dcim/devices/2686/ and https://netbox.wikimedia.org/dcim/devices/2687/

  • Fiber IDs (optionally color and length as well)
  • Console cables
  • Management cables
  • Optionally power cables
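
The items in the list above could be audited with a small check over exported port data. This is a sketch under assumptions: the port dictionaries, their `kind` values, and the `label` field are a hypothetical export format, not Netbox's actual schema or API, and the port names in the usage example are illustrative.

```python
def missing_cable_info(ports):
    """Return (port_name, problem) pairs for ports with incomplete cable data.

    Each port is a dict with "name", "kind" ("fiber", "console",
    "management" or "power") and "cable" (None or a dict of cable fields).
    """
    problems = []
    for port in ports:
        cable = port.get("cable")
        if cable is None:
            problems.append((port["name"], "no cable recorded"))
        elif port["kind"] == "fiber" and not cable.get("label"):
            # Fiber IDs are required; color and length stay optional.
            problems.append((port["name"], "fiber ID missing"))
    return problems


if __name__ == "__main__":
    # Illustrative ports only; not the real switch inventory.
    ports = [
        {"name": "xe-0/0/0", "kind": "fiber", "cable": {"label": "", "color": "yellow"}},
        {"name": "console", "kind": "console", "cable": None},
        {"name": "em0", "kind": "management", "cable": None},
    ]
    for name, problem in missing_cable_info(ports):
        print(f"{name}: {problem}")
```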

Could you also check that the following doesn't apply to us? (we're running 18.4R2-S4)

We ran into an odd issue on the 18.4R2-S3 flex (haven't tried newer than that yet) where all the link lights stop working correctly (don't show link, link on 4 turns on when 2 is plugged in, nothing at all in many cases, or error indicators when there are none) though in CLI everything is fine, more of a cosmetic issue but annoying.

From https://www.reddit.com/r/Juniper/comments/gronfz/evpn_vxlan_differences_between_junos_revisions/fs05ej6/
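
For comparing the running release against the one named in that report, a release string such as 18.4R2-S4 can be parsed into a sortable tuple. This is a sketch that only handles the `<major>.<minor>R<release>[-S<service>]` form seen in this thread; other Junos naming variants are not covered, and the ordering says nothing about whether the link-light issue is present in a given release.

```python
import re

# Matches strings like "18.4R2-S4" or "18.4R2" (assumed format, see lead-in).
_JUNOS_RE = re.compile(r"^(\d+)\.(\d+)R(\d+)(?:-S(\d+))?$")


def parse_junos(version):
    """Parse a release string into (major, minor, release, service)."""
    match = _JUNOS_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported Junos version format: {version!r}")
    major, minor, release, service = match.groups()
    # A release without a service suffix sorts before any -S<n> of the same release.
    return (int(major), int(minor), int(release), int(service or 0))


if __name__ == "__main__":
    running = parse_junos("18.4R2-S4")
    reported = parse_junos("18.4R2-S3")
    print("running is newer than reported:", running > reported)
```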

@Cmjohnson - I chatted with Arzhel a bit earlier today, and he's going to get these dedicated 10g switches for WMCS in C8 and D5 config'd either later this week or mid-late next week. The stuff happening during the data center failover will be for the switches in row D. Thanks, Willy

Note that you can start connecting the servers to the switches, if needed:

  • cloudcephosd: eth0:cloud-hosts1-eqiad and eth1:cloud-storage1-eqiad
  • cloudvirt: eth0:cloud-hosts1-eqiad and eth1:cloud-instance1-eqiad

And you should be able to provision them.

Also em0, the management ports, don't have their cable info in Netbox.
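
The interface assignments for cloudcephosd and cloudvirt hosts given above can be kept as a lookup table when connecting servers. A minimal sketch: it assumes the names after the colon are network (VLAN) names and that hostnames are a role prefix followed by digits; the example hostname is hypothetical.

```python
# Interface-to-network mapping as stated in the comment above.
PORT_NETWORKS = {
    "cloudcephosd": {"eth0": "cloud-hosts1-eqiad", "eth1": "cloud-storage1-eqiad"},
    "cloudvirt": {"eth0": "cloud-hosts1-eqiad", "eth1": "cloud-instance1-eqiad"},
}


def network_for(hostname, interface):
    """Return the network for a host's interface.

    Assumes a short hostname made of a role prefix plus trailing digits.
    Raises KeyError for an unknown role or interface.
    """
    role = hostname.rstrip("0123456789")
    return PORT_NETWORKS[role][interface]


if __name__ == "__main__":
    # "cloudvirt1001" is a made-up hostname for illustration.
    print(network_for("cloudvirt1001", "eth1"))
```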

Mentioned in SAL (#wikimedia-operations) [2020-07-09T12:11:37Z] <XioNoX> enable asw2-b-eqiad:ae3 (to cloudsw1-c8) - T251632

Change 610820 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/software/netbox-extras@master] Reports, add new cloudsw role

https://gerrit.wikimedia.org/r/610820

Change 610820 merged by Ayounsi:
[operations/software/netbox-extras@master] Reports, add new cloudsw role

https://gerrit.wikimedia.org/r/610820

Change 616013 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Add cloudsw1 switches to Icinga

https://gerrit.wikimedia.org/r/616013

Change 616013 merged by Ayounsi:
[operations/puppet@production] Add cloudsw1 switches to Icinga

https://gerrit.wikimedia.org/r/616013

Change 616014 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] cloudsw1: fix typo esams -> eqiad

https://gerrit.wikimedia.org/r/616014

Change 616014 merged by Ayounsi:
[operations/puppet@production] cloudsw1: fix typo esams -> eqiad

https://gerrit.wikimedia.org/r/616014

Is there any reason not to close this? Are there still asset tags or netbox things left to do?

I believe:

Also em0, the management ports don't have their cable info in Netbox.

Is the last thing to do here.

updated em0 for both...resolving

Change 655446 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/homer/public@master] Add new cloudsw switches

https://gerrit.wikimedia.org/r/655446

Change 655446 merged by jenkins-bot:
[operations/homer/public@master] Add new cloudsw switches

https://gerrit.wikimedia.org/r/655446