Juniper network device audit - all sites
Closed, ResolvedPublic

Description

This task will cover/track the auditing of all Juniper network devices in our inventory, for both support contract info (in Netbox; 95% of them will have a new support contract ID once T207198's order is complete) and their location on the Juniper Entitlement Report (many devices show the SF office as their site and others show wrong sites; each should be updated to the actual location where it lives).

This audit shouldn't start until after the support renewal goes through, so as to reduce duplicated effort/updates.

Event Timeline

RobH triaged this task as High priority. Jan 15 2019, 5:51 PM
RobH created this task.
RobH renamed this task from network device audit to Juniper network device audit - all sites. Jan 15 2019, 5:51 PM

I was looking at FY19-20 CapEx planning and ran an export of the Entitlement Report from Juniper's website. The output is... not very close to the truth. There are serial numbers in there that do not match any of our gear, plus the locations are all weird and wrong...

Is it difficult to fix this? Also, T183479 is probably a subset of this task and should be resolved in favor of this one.

faidon removed a parent task: Unknown Object (Task).
faidon merged a task: Restricted Task.

Yep! Juniper has been working on it for a few weeks in ticket 2019-0408-0694.
Based on https://docs.google.com/spreadsheets/d/1tJ-mqN4-g_NyvO24pRERxVTbX6AMe6lMG9YcO2840Vg/edit (Tab Final)

faidon mentioned this in Unknown Object (Task). Apr 30 2019, 6:05 PM

This is almost done. They added all the missing devices and fixed most of the "installed at" addresses, but got some of them wrong. I followed up with the correct ones.

Update from IRC: Juniper's install base is actually missing a whole lot of our devices (e.g. only lists 9 EX4300s, out of... 52). @ayounsi is asking them, but this clearly needs more work :(

From Juniper:

I am still in the process of changing the installed base address of the serial numbers given.
Furthermore, for the serial numbers that are not showing in MyJuniper account, please be advised that we are currently having a known issue with data flow in MyJuniper and our IT team are still in the process of fixing it.

I wrote a Netbox report to check against Juniper's installed base ( https://netbox.wikimedia.org/extras/reports/juniper.Juniper/ )
Still in review in https://gerrit.wikimedia.org/r/c/operations/software/netbox-reports/+/539192

From that report, 42 devices and 228 FRUs (inventory items) are missing from Juniper's installed base.
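The comparison behind those counts can be sketched roughly as follows. This is a minimal illustration, not the code under review in Gerrit: the function name `missing_from_installed_base` and the sample serial numbers are hypothetical, and it assumes both sides have already been reduced to plain lists of serial strings.

```python
def missing_from_installed_base(netbox_serials, juniper_serials):
    """Return serials known to Netbox but absent from Juniper's export.

    Both arguments are iterables of serial-number strings (an assumption;
    the real report reads its data from Netbox and from Juniper).
    """
    # Normalize case and whitespace so formatting differences do not
    # show up as false mismatches.
    juniper = {s.strip().upper() for s in juniper_serials}
    netbox = {s.strip().upper() for s in netbox_serials if s.strip()}
    return sorted(netbox - juniper)


# Hypothetical sample data, not real serial numbers.
netbox = ["AB0001", "ab0002 ", "AB0003"]
juniper = ["AB0001", "AB0003", "ZZ9999"]
print(missing_from_installed_base(netbox, juniper))
```

Running the same set difference in the other direction (Juniper minus Netbox) would list entries such as decommissioned devices that Juniper still has on record.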

I commented on the Juniper case to get those fixed, including the proper addresses.

Next steps should be:

  • Wait for Juniper to update their DB with the missing serials (note that they properly show up in the entitlement search tool, so they are aware of it in some ways)
  • Evaluate if the above brings new issues
  • Ask Juniper to delete the devices that we no longer have in the DCs

A lot of back and forth with Juniper, current status is:
test_consistency fails 70
test_missing_device_from_installed_base 4
test_missing_inventory_from_installed_base 183

Emailed them asking for the following fixes:

So first 4 devices are missing
[...]
Secondly, some addresses are incorrect:
[...] (15 devices)

Third, 183 FRUs (routing engines, PSUs, etc...) are missing from the Installed base. Let me know if you need the serial numbers in a different format (eg. a list of them alone).
[...]
Last, what is the process to remove items from the installed base? For example devices that we have decommissioned.

There are also some devices with expired support being taken care of in T234265.

Got an email from a new person in charge of the task. Sent them the updated list of what needs to be updated.

Thanks @ayounsi! Appreciate the follow up. What exactly did you ask them to do in this last communication?

This has been going on for too long, so I think we need a change of strategy :) I'd like to raise this to our vendors and get this fixed once and for all. Is this communication part of the old case? Could you Cc me on your next email?

I was also looking at the report to understand it better; a few (unrelated) notes:

  • More than half of the failures are about power supplies. While it'd be ideal for our and Juniper's view of FRUs to match up, it's clear that this is not easy and we have a case of diminishing returns. It's not material to us, as we don't buy support separately for those, nor would we get a replacement refused. So I would say to just exclude those from our report entirely and for the foreseeable future. (Anything that matches JPSU, PWR or "Power Supply" in Netbox)
  • There is one real "support missing" error - this has been raised to our vendor as part of the support contract task.
  • There is another "support missing" one that is in the "refused support" category; I'm not sure how this is best handled.
  • We have a bunch of new errors now, for devices that we've disposed that still show up in our installed base, e.g. all the esams equipment. Also unclear how to handle these best.
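The power supply exclusion suggested in the first note could be sketched like this. It is only an illustration: the pattern is built from the three strings named above, while the helper names, the case-sensitive matching, and the sample inventory items are assumptions rather than anything taken from the actual report.

```python
import re

# Hypothetical pattern built from the three strings suggested above.
# Matching is case-sensitive here; that is an assumption.
POWER_SUPPLY_RE = re.compile(r"JPSU|PWR|Power Supply")


def is_power_supply(product_name):
    """True if an inventory item's name looks like a power supply."""
    return bool(POWER_SUPPLY_RE.search(product_name))


def without_power_supplies(items):
    """Drop power supplies from a list of (serial, product_name) pairs."""
    return [(serial, name) for serial, name in items
            if not is_power_supply(name)]


# Illustrative sample data only; the serials are made up.
items = [
    ("SN1", "JPSU-650W-AC-AFO"),
    ("SN2", "RE-S-1800x4"),
    ("SN3", "PWR-MX480-2520-AC"),
    ("SN4", "Power Supply Module"),
]
print(without_power_supplies(items))
```

Filtering before the comparison keeps the power supplies out of every failure count, rather than hiding them after the fact.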

Removing the serials:

Hi Jim,

No problem to re-state everything as clear as I can if it's the last one :)

Serials that should have their install base set to Equinix Ashburn (21721 Filigree Court Ashburn VA 20147 USA), that's where they are physically located:
[..]
Serial that should have its install base set to Interxion Amsterdam (Science Park 121 1098 XG Amsterdam The Netherlands), that's where it is physically located:
[...]
Serial that is in production in our infrastructure but is missing from your database:
[...]
FRUs that are in our database (most of them used in production) but missing from your database (and probably should be added):
[...]
serials that we have decommissioned and should be removed from your database (the list is not complete but a good start):
[...]
Thank you!

To which they replied:

Hi and many many thanks Arzhel,
For understanding, cooperation providing requested (again) ☹
All corrected except for following FRUs that are in our database (most of them used in production) but missing from your database (and probably should be added): Can you provide system output for these.
Others are in the database (now), see attached
[...]
Regards, thank YOU !

An export doesn't show the updates yet, but maybe it takes some time for things to sync up.

I'll follow up with them.

I was also looking at the report to understand it better; a few (unrelated) notes:

  • More than half of the failures are about power supplies. While it'd be ideal for our and Juniper's view of FRUs to match up, it's clear that this is not easy and we have a case of diminishing returns. It's not material to us, as we don't buy support separately for those, nor would we get a replacement refused. So I would say to just exclude those from our report entirely and for the foreseeable future. (Anything that matches JPSU, PWR or "Power Supply" in Netbox)

That works for me!

  • There is one real "support missing" error - this has been raised to our vendor as part of the support contract task.
  • There is another "support missing" one that is in the "refused support" category; I'm not sure how this is best handled.

I didn't mention the support missing as I think it's tackled in the support contract exchanges.

  • We have a bunch of new errors now, for devices that we've disposed that still show up in our installed base, e.g. all the esams equipment. Also unclear how to handle these best.

See above, if it's as easy as emailing them, it would be great, otherwise they *must* have some kind of process?

Change 566321 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/software/netbox-reports@master] Extend PRODUCT_NAMES_IGNORE

https://gerrit.wikimedia.org/r/566321

If we exclude all the power supplies (which is what the CR above is for), there are still inconsistencies, but the list is progressively shrinking.
Some are fixed in one of their databases, but the fix isn't reflected in the my.juniper.net portal.
  • cr2-esams says support missing, while the entitlement tool says support is active
  • cr3-knams says city mismatch
  • mr1-esams says missing
  • many decommissioned devices are still present in my.juniper.net (maybe we should ignore those?)
  • many FPCs, MICs and REs report as "not present in Juniper Installed Base", but some (e.g. FPCs) are present in the entitlement tool (because they are under a dedicated warranty)

Once the above CR is merged, I'll send a new "current status" email to Juniper.

Change 566321 abandoned by Ayounsi:
Extend PRODUCT_NAMES_IGNORE

Reason:
moved to netbox-extras

https://gerrit.wikimedia.org/r/566321

Change 566361 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/software/netbox-extras@master] Extend PRODUCT_NAMES_IGNORE

https://gerrit.wikimedia.org/r/566361

Change 566361 merged by Ayounsi:
[operations/software/netbox-extras@master] Extend PRODUCT_NAMES_IGNORE

https://gerrit.wikimedia.org/r/566361

Change 574425 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/software/netbox-extras@master] Juniper report: only log warning if S/N missing from Netbox

https://gerrit.wikimedia.org/r/574425

From JTAC:

To answer your question “For the list of devices/serials that are decommissioned and we don't own anymore, is there a process so they don't show up in that export anymore?”
I can move them to any other account ID (like for instance a Juniper account ID or a Scrap account id), however, regardless if they are decommissioned they are yours/customer’s still, right?
But reading your comment, you/customer do not own them anymore. Can you clarify that? They were owner and have re-sold them?

@wiki_willy: Here we're talking about Juniper devices that are not in Netbox anymore. Is the last statement from Juniper correct? "They were owner and have re-sold them?"

I can provide a list of sold network gear from before 2017, otherwise all network gear is still in our sites in storage, even decom gear.

I'm not exactly sure what is being asked for here, please ping me on irc and we can live chat about it!

Change 574425 merged by Ayounsi:
[operations/software/netbox-extras@master] Juniper report: only log warning if S/N missing from Netbox

https://gerrit.wikimedia.org/r/574425

Ok! From https://wikitech.wikimedia.org/wiki/Server_Lifecycle#States I thought that if a device was not in netbox it was not in our possession anymore.
So I'm making them warnings and not criticals so we're aware of them but not blocking a green report.
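The warning-versus-critical split described above could look something like this. It is a sketch under assumptions: the function names and the result layout are hypothetical, and treating "in Netbox but missing from Juniper" as the blocking case is my reading of the report's test names, not something confirmed by the merged change.

```python
def classify_serials(netbox_serials, juniper_serials):
    """Split mismatched serials into blocking failures and warnings."""
    netbox, juniper = set(netbox_serials), set(juniper_serials)
    return {
        # In Netbox but unknown to Juniper: assumed to stay a failure.
        "failure": sorted(netbox - juniper),
        # Known to Juniper but no longer in Netbox: only a warning.
        "warning": sorted(juniper - netbox),
    }


def report_is_green(result):
    """Warnings are surfaced but do not block a green report."""
    return not result["failure"]


# Hypothetical serials for illustration.
result = classify_serials(["A", "B"], ["B", "C"])
print(result, report_is_green(result))
```

With this split, leftover entries on Juniper's side stay visible without keeping the report permanently red.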

The "last" thing is the FRUs (linecards, REs, etc.): they do show up on the my.juniper.net portal if searched individually, but don't show up in the full export. Juniper is working on it.

Ok! From https://wikitech.wikimedia.org/wiki/Server_Lifecycle#States I thought that if a device was not in netbox it was not in our possession anymore.

That should actually be true - if that's not the case here, it's probably a premature removal. What's done is done now I suppose, but don't assume that's going to happen again in the future and build our tooling around that; it shouldn't.

So I have another task nearly identical to this, T266053. However it is just for active/planned/inventory gear and correcting their locations now that the most recent renewal is completed. This task seems to have shifted towards the monitoring, and also was for corrections done back in 2019, so I've not closed T266053.

Aklapper removed RobH as the assignee of this task. Jul 2 2021, 5:14 AM

Removing task assignee due to inactivity, as this open task has been assigned for more than two years (see emails sent to assignee on May26 and Jun17, and T270544). Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be very welcome!

(See https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator.)

ayounsi assigned this task to RobH.

I think we can close that one. @RobH did the audit afaik.