
cr1-eqsin faulty interfaces
Closed, ResolvedPublic


The 4 onboard interfaces on cr1-eqsin (MX104) started to fail shortly after seeing (little) traffic.
This is not a blocker for the initial setup of the site, but is a blocker for sending users to that site.

A case has been opened with Juniper and after extensive troubleshooting they concluded that the midplane (aka, the whole box except linecards, REs and PSUs) needed to be replaced.

Their logistics department gave us a two-week ETA on Feb. 12th (so Feb. 26th).
I emailed them on Feb. 15th and Feb. 19th for an update, with no replies.

Called their support today: they told me it was still with their planning team (i.e. no replacement available yet) and that it is not possible to talk to that planning team, but one of their supervisors should call me back shortly.

Once we have a tracking#, I'll open an inbound shipping and remote hands task for Equinix to proceed with the replacement.

Instructions should be relatively easy:

  1. Save the output of show interfaces descriptions and show chassis hardware
  2. Downtime monitoring
  3. Label cables (or write down their labels/ports) as well as modules
  4. Power off the box
  5. Remove the modules
    1. Replace the REs
    2. Replace the MICs
    3. Replace the Transceivers
    4. Replace the AC Power Supply
  6. Swap the faulty box with the new unit
  7. Re-insert the modules in the same slots and re-cable
  8. Power on the box
  9. Verify good health: compare the output of show interfaces descriptions and show chassis hardware with the copies saved in step 1, check monitoring, etc.
  10. Re-label the box cr1-eqsin
  11. Re-use the shipping box and included shipping label to send the faulty box to Juniper
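
Steps 1 and 9 amount to a before/after comparison of saved command output. A minimal sketch of that comparison, assuming the two outputs were saved as plain text; the function name and the sample lines are hypothetical and are not real Junos output:

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the lines that differ between two saved command outputs.

    Lines prefixed with "- " exist only in the pre-swap capture,
    lines prefixed with "+ " exist only in the post-swap capture.
    """
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical sample captures, for illustration only.
pre_swap = "port-a up up Transport: example\nport-b up up Peering: example"
post_swap = "port-a up up Transport: example\nport-b up down Peering: example"

for line in changed_lines(pre_swap, post_swap):
    print(line)
```

For show interfaces descriptions an empty result is the goal. For show chassis hardware some differences are expected, since the chassis and midplane are new parts with their own serial numbers.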

Event Timeline


Most recent update was:

We are pushing the delivery to next week, if everything goes smoothly and there is no customs clearance issue.

Send on the 24th.

Still asking for a more accurate ETA.

Unit shipped with
Supposed to arrive in Singapore on the 1st and clear customs 2 days later, for a final ETA of 03-Mar-2018 20:25:00 SGT.
As this is a Saturday 8pm, the most likely ETA at SG3 is Monday the 5th.

Inbound shipping ticket opened with Equinix SG3
Smart hands task added to the batch (not submitted yet).

The shipping company has updated: 05-Mar-2018 18:34:00 SGT Proof of Delivery Rcvd

Mentioned in SAL (#wikimedia-operations) [2018-03-06T04:29:08Z] <bblack> eqsin router maintenance starting soon-ish. all of eqsin will be offline and isn't in production service to begin with. We've tried to downtime all the things, but don't be shocked at spurious alerts! - T187807

Unit replaced by Papaul, all interfaces are up! No alarms.