Just closing the loop here: the backend is up for review, but there is apparently a pylint bug preventing CI from passing (or there was last week).
Roger, it might be best to wait on the upgrade until the split to Ganeti is done (maybe early next week) as using Redis was part of the spec for that.
Tue, Jun 18
Mon, Jun 17
Wed, Jun 12
Tue, Jun 11
Fri, Jun 7
Okay, so this mostly works in labs when manually configured to operate against the deployment-prep Swift cluster. Netbox lets me upload images and shows them associated with the object in question, except that viewing fails because they are served from a URL in the Swift cluster that is unavailable. We'll no doubt look at this part more next week.
Thu, Jun 6
Wed, Jun 5
Mon, Jun 3
I agree that, from the perspective of more closely modelling the devices between the various tools, the domain name for the VC name is necessary. I'm not completely clear on how that would make the matching better, though. Currently the by-serial matching seems to be working correctly; unless I'm mistaken, the complexities at this point are mostly in lining up vendor and model information, and this appears to be approachable either by matching things more loosely or by creating a map between what's in LibreNMS and what's in Netbox. Separately, there are only a few inventory items which don't appear to line up, but I believe that's because they are builtin and so are left out of the LibreNMS query.
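To illustrate the "create a map" option, here is a minimal sketch, not the report's actual code. It assumes both sides have already been reduced to dicts of serial number to vendor string. The `VENDOR_ALIASES` table, the function name and the serials in the usage line are hypothetical.

```python
# Hypothetical alias map: lowercased LibreNMS vendor string -> Netbox manufacturer name.
VENDOR_ALIASES = {
    "juniper networks, inc.": "Juniper",
}


def vendor_mismatches(netbox_by_serial, librenms_by_serial):
    """Return the sorted serials present on both sides whose vendors still disagree.

    Both arguments are assumed to be dicts of serial -> vendor string.
    Devices are joined by serial; the LibreNMS vendor is translated through
    VENDOR_ALIASES (falling back to the raw string) before comparing.
    """
    mismatched = []
    for serial in sorted(netbox_by_serial.keys() & librenms_by_serial.keys()):
        raw = librenms_by_serial[serial]
        mapped = VENDOR_ALIASES.get(raw.strip().lower(), raw)
        if mapped != netbox_by_serial[serial]:
            mismatched.append(serial)
    return mismatched


# Usage with made-up serials:
print(vendor_mismatches({"SN1": "Juniper", "SN2": "Dell"},
                        {"SN1": "Juniper Networks, Inc.", "SN2": "HP"}))
```

Serials that exist on only one side are ignored here; reporting those would be a separate check.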
Alerts are alerting and in production.
Wed, May 29
After a discussion with Faidon, I think the general consensus is that DRAC (and ILO) should be acceptable termination names for management interfaces, in addition to the normal default going forward, mgmt\d? (enumerated in the case of there being multiple interfaces).
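As a rough sketch of how a report could check that convention: the function name is hypothetical, and treating DRAC and ILO as case-sensitive exact matches is an assumption on my part, not something that was agreed.

```python
import re

# Accepted names per the comment above: DRAC, ILO, or mgmt with an optional
# single digit. Case sensitivity is an assumption.
MGMT_NAME_RE = re.compile(r"DRAC|ILO|mgmt\d?")


def is_valid_mgmt_name(name: str) -> bool:
    """Return True if the whole interface name matches the convention."""
    return MGMT_NAME_RE.fullmatch(name) is not None


# Usage:
for candidate in ("mgmt", "mgmt1", "DRAC", "eth0"):
    print(candidate, is_valid_mgmt_name(candidate))
```

Using `fullmatch` means names such as `mgmt10` or `mgmt-old` are rejected rather than matched on their prefix.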
Tue, May 28
Hello, here is the sample output. There are several inconsistencies for which I can see the fix, and which I'd already attempted to mitigate (apparently not successfully), such as devices like Netbox devtype=Juniper EX4600-40F, LibreNMS devtype=Juniper Networks, Inc. ex4600-40f Ethernet Switch, kernel JUNOS 14.1X53-D45.3, Build date: 2017-07-28 01:39:39 UTC Copyright (c) 1996-2017 Juniper Networks, Inc. or Juniper EX4600, where the information is there but just not lined up the same way. Other things seem less obvious, like duplicated serial numbers and similar.
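One way the devtype mismatch above might be matched more loosely, as a sketch only: it assumes the Netbox device type looks like "Vendor Model" and that the model appears somewhere in the LibreNMS description as the start of a token. The function name is hypothetical.

```python
import re


def loosely_matches(netbox_devtype: str, librenms_descr: str) -> bool:
    """Return True if the Netbox model appears as a token prefix in the LibreNMS text.

    Assumes the Netbox device type is "Vendor Model", so the last
    whitespace-separated word is taken as the model.
    """
    parts = netbox_devtype.split()
    if not parts:
        return False
    model = parts[-1].lower()
    # Split the LibreNMS description on whitespace and commas.
    tokens = re.split(r"[\s,]+", librenms_descr.lower())
    return any(token.startswith(model) for token in tokens)


# Usage with the strings from the sample output:
descr = "Juniper Networks, Inc. ex4600-40f Ethernet Switch, kernel JUNOS 14.1X53-D45.3"
print(loosely_matches("Juniper EX4600-40F", descr))
print(loosely_matches("Juniper EX4600", descr))
```

Prefix matching is deliberately permissive, so a very short model string could match unrelated devices; an explicit map would be stricter.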
May 22 2019
Just to follow up on this: I did spend some time trying to figure out how to initiate a template-based export by hitting a URL. It seems as though there's no way to do it through the API, and hitting the URL endpoint doesn't work with token authentication as far as I can tell.
I'm definitely in favor of allowing a failed state to basically come from any other state.
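A minimal sketch of what that rule could look like in a report or cookbook check. The state names and the explicit edges in `ALLOWED` are illustrative assumptions and do not reproduce the actual flowchart; only the "failed from any other state" rule comes from the comment above.

```python
# Illustrative only: these states and edges are assumptions, not the real flowchart.
ALLOWED = {
    "spare": {"active", "decommissioned"},
    "active": {"spare", "decommissioned"},
    "failed": {"spare", "active", "decommissioned"},
    "decommissioned": set(),
}


def can_transition(current: str, target: str) -> bool:
    """Return True if moving from current to target is permitted."""
    if current not in ALLOWED or target not in ALLOWED:
        return False
    if target == "failed":
        # Failed may be entered from any *other* known state.
        return current != "failed"
    return target in ALLOWED[current]


# Usage:
print(can_transition("spare", "failed"))
print(can_transition("decommissioned", "active"))
```

Handling the failed state as a special case keeps the explicit map small, since no state needs to list "failed" among its targets.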
May 21 2019
Merged and deployed the change, which uses admin_state instead.
May 14 2019
It was pointed out to me that the vendor name in entPhysical is there, so we could hypothetically check that (for inventory items only) - the devices table remains complex.
May 13 2019
After digging and discussing I believe the way forward since the mapping is slightly ... weird between LibreNMS and Netbox:
Hello, process question about this. The current flowchart for states doesn't allow Spare->Failed to happen, so there are some implicit assumptions about that inside of, for example, the PuppetDB netbox report (the Failed state is expected to be in Puppet since it implicitly comes from a production state). Is it the preference that boxes like this go through a Failed state (and thus never appear in Puppet)? Thanks.
May 10 2019
An idea that came up in discussing DNS automation with @ayounsi is to verify interface names match, and/or automate updating interface names from PuppetDB into Netbox.
May 8 2019
May 7 2019
Just a note: as far as I can tell, admin_down does not seem to indicate anything particular about the machines that is useful to denote in Netbox. It seems to reflect the *desired* state. To clarify, is there any situation where it would not match the op_state within a short period of time? AFAICT it is used to tell ganeti to down or up the machine, but I may be incorrect here. I have implemented mirroring the op_state, but if we truly do need an extra field for admin_state that'd be useful to know.
The patch seems sane and simple. I concur with this plan fwiw.
May 6 2019
Just to +1 the idea of shipping javamelody to prometheus. Let me know if I can help at all.
May 5 2019
I agree with this approach, and it's what I was pursuing some months ago. Some time back I merged support for uwsgi::app to set its LimitCORE for this very purpose. Putting this into production should be trivial.
May 3 2019
May 2 2019
May 1 2019
Apr 29 2019
Apr 26 2019
Coherence report quality pass is deployed.
Apr 24 2019
I forget where, but in digging into this it seems that Puppet will return 503 if it is too busy; there are numerous reports of this. (To be clear, I don't know whether it's Puppet itself or an intermediary that returns the 503, but that is the result from the client's perspective.)
Apr 23 2019
Getting the netbox module in the cookbooks will save steps on decoms, and probably on reimages and installs (which share many procedures); the caveat is that in decoms it will have to prompt as to the state to transition into (decom or spare).
After several conversations with robh, I think we can start looking at the low-hanging fruit. For the record, all of these processes are mediated by a dynamic, ever-changing checklist.
Apr 17 2019
exclude esams from console report
robh requests that the status show up in test_netbox_in_puppetdb
Apr 16 2019
Apr 12 2019
Thanks for refiguring the checklist :)