Mon, Dec 9
I have been working on adapting the current DNS import script to additionally import other DNS entries and create primary IP addresses on hosts in Netbox. This should provide sufficient information in most cases to obtain the FQDN with an extremely simple one-liner if the API is already available.
I will complete this as soon as we complete migrating to the netbox-extras repository.
I will circle back to the discussed changes this week.
afaict this is complete. The patch that performs the check alluded to in the op has been merged.
Thu, Dec 5
Wed, Dec 4
Hello, we will take a look at that. It is possible for this to get done by the deadline, but there are some minor unknowns due to local changes in our puppetmaster. Thanks for the ping.
Tue, Nov 26
I think the general consensus is "spare" or "future" anywhere in the hostname is an error if the host is ACTIVE in netbox.
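A minimal sketch of that rule, assuming the hostname and the Netbox status are available as plain strings. The function name `hostname_errors` and the `FORBIDDEN_WORDS` tuple are hypothetical and not taken from the actual report code:

```python
# Words that should not appear anywhere in the hostname of an active host.
FORBIDDEN_WORDS = ("spare", "future")


def hostname_errors(hostname, status):
    """Return the forbidden words found in the hostname of an active host.

    Hosts in any other status are not checked and yield an empty list.
    """
    if status.lower() != "active":
        return []
    lowered = hostname.lower()
    return [word for word in FORBIDDEN_WORDS if word in lowered]
```

A non-empty result would be reported as an error for that host.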
After a conversation with @Volans, an extended ask is having the generator able to add and remove files (e.g., completely override the contents of the repository, if necessary). This is a bit of an extension of the workflow I'd envisioned, but I shall be implementing that now.
I've gone ahead and added the clusters in Netbox.
@Volans what do you mean by "any remaining puppet code" ?
And rechecking the top list, I have removed acme certs, and other miscellany from /etc.
Okay, I believe I have removed all traces on netmon1002 and netmon2001:
Mon, Nov 25
As discussed on IRC, the above solution is agreed upon. This should not cause any false positives.
Thanks, I'll go ahead and remove it then.
The non-optional ask on this is complete. I will leave this open to track the gnt* command proxying.
This is resolved.
This appears to be resolved.
Passing to next clinic duty person.
Doing a quick check on netmon2001:
Yep, revisiting the rotation right now. We in any case have not *lost* anything, it is just non-optimal.
This should be resolved, I've spot checked hosts in the af project and they have been running puppet normally.
This should be resolved.
I guess without imposing too much, the ask was "all hostnames are lowercase" - we could just check if lower(hostname) != hostname and call it good, punting the normalization to a further task.
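A minimal sketch of that check, with normalization left out as suggested. The helper names are hypothetical:

```python
def is_lowercase_hostname(hostname):
    """Return True when lowercasing the hostname leaves it unchanged."""
    return hostname.lower() == hostname


def find_non_lowercase(hostnames):
    """Return the hostnames that fail the lowercase check, in input order."""
    return [name for name in hostnames if not is_lowercase_hostname(name)]
```

This only flags offending names; deciding what the normalized form should be stays with the follow-up task.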
I executed the plan that Riccardo outlined: I removed the ability to run reports from the check and switched to running them from the management script, which has simplified the code a bit. The real cause of the timeouts, however, was that Netbox initializes all of the report objects when you query .all for the reports list, which for accounting, librenms, and puppetdb involves actually accessing a remote service and takes an unpredictable amount of time. I switched the icinga check to .get the report object instead, so we only eat the unpredictability of one report, which for the time being appears to be under the 10 second limit. I'm opening an additional ticket to defensively restructure the reports so they don't access external services unless they are used, to reduce any possibility of this happening (and also to reduce the possibility of a broken external service preventing looking at the report list in the interface).
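One possible shape for that defensive restructuring, as a generic sketch rather than the actual Netbox report classes: the external connection is opened on first use instead of at construction, so building or listing the report objects never touches a remote service. `LazyReport` and the `connect` callable are hypothetical names.

```python
class LazyReport:
    """Report whose external connection is opened on first use, not at construction."""

    def __init__(self, name, connect):
        self.name = name
        # Hypothetical zero-argument callable that opens the remote connection.
        self._connect = connect
        self._client = None

    @property
    def client(self):
        # Connect only when something actually needs the remote service,
        # and reuse the same client on later accesses.
        if self._client is None:
            self._client = self._connect()
        return self._client


def list_report_names(reports):
    """List report names without triggering any remote connection."""
    return [report.name for report in reports]
```

With this shape, a broken external service would only surface when the affected report is actually run, not when the list is displayed.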
Fri, Nov 22
Until September 2020 seems a reasonable timeframe (the docs say "typically around one year").
According to the procedure for this request, end-dates for rechecking access are needed. Do you have an end-date in mind? Otherwise we should be able to add the access.
Hello, I have added the key above to the patch and merged it. This means that shortly (within 30 or so minutes) the key should be propagated to the appropriate bastions.
Giving this to the next person on clinic duty. We still need to know the time limits and I believe some other information to complete this process.
Hello! I have created the mailing list as requested.
Also the configurations in /etc/netbox, and anything related to deploys in /srv.
Hello! The procedure to complete this suggests a time limit. What is the final say on that in this case?
Thu, Nov 21
Okay, the additional cable checks are in place, and appear to be correct.
Wed, Nov 20
The above patch should address these issues. It hugely simplifies the nagios check script and also uses the API more efficiently so it shouldn't flap anymore on a failed report.
Some discussion occurred on the CR. Should we detect future and spare anywhere in the hostname? I was under the impression that this is not normal, and that we should not entrench it as a standard practice by checking for it. (The case of spare being inside the hostname turned out to be, essentially, in error afair).