Page MenuHomePhabricator

Netbox: Fix hostname case ambiguity
Closed, ResolvedPublic

Description

Faidon pointed out that some newer hosts with the asset tag pattern (WMFnnnn) as their device name in Netbox have all upper case hostnames, which is in contrast to historical practice. We should normalize on a particular pattern of hostnames.

Actions I will take unless there are any objections:

  • Edit currently existing hosts to have all lowercase device names.
  • Add a coherence test to ensure that hostnames match the pattern [a-z][a-z0-9-]+[0-9] or so.

Details

Related Gerrit Patches:
operations/software/netbox-reports : mastercoherence: Check device names for correct case
operations/software/netbox-reports : mastercoherence: Check device names for correct formatting

Event Timeline

crusnov created this task.Nov 5 2019, 9:33 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptNov 5 2019, 9:33 PM
RobH added a comment.Nov 14 2019, 6:50 PM

hostnames on servers with printed hostname labels are always in lowercase. However, asset tags are all in uppercase on the actual label, as WMF####.

Do we want the 'hostname' field to match the identification exactly? If so, uppercase. If we want them to match all other hostnames and dns, knowing that asset tags are uppercase on the physical label, then lowercase.

I do 100% support having it done consistently across the netbox objects fleet. +1

Dzahn added a subscriber: Dzahn.Nov 14 2019, 6:51 PM

I vote for using lowercase "wmf" because that matches DNS records.

Okay I feel like matching DNS is highly desirable feature.

I have update all hostnames to be lowercase.

Change 550917 had a related patch set uploaded (by CRusnov; owner: CRusnov):
[operations/software/netbox-reports@master] coherence: Check device names for correct formatting

https://gerrit.wikimedia.org/r/550917

Change 550917 merged by CRusnov:
[operations/software/netbox-reports@master] coherence: Check device names for correct formatting

https://gerrit.wikimedia.org/r/550917

Volans added a subscriber: Volans.Nov 19 2019, 1:20 PM

I've reverted the above patch as it was reporting most servers as false positive for the new name coherence report. The regex was wrong and not able to match our current hostnames.
Also is not totally clear to me the goal of this check, as it was opened for the asset tag names specifically but the check was expanded to all hosts.
For asset tag hostnames for example we should check that the hostname matches the asset tag of the same device, and if those are already lowercase, that would be already enough.

I don't think we have a strict policy of hostnames, but if we want the global check we should agree on one first and document it on wikitech.
And if adding a policy it will most likely be more complex that what was proposed in the CR (also if corrected).

Hello, apparently a typo after I had tested occurred (should be * and not ? after the second character set).

However, in general, it was stated to me as a design requirement that all hostnames should be lowercase, and was generally agreed in a discussion on the dcops channel. If there is disagreement in this we should have a conversation about it, otherwise would the corrected regexp ("^[a-z]+[a-z0-9-]*$") be acceptable to you?

Thanks.

I actually think that's not enough if we want to enforce a policy, although it's not clear if that's the scope of this task.

I think the current policy (although not official AFAIK and we should make it official first) roughly is:

  • all new servers should be named with the NNNN suffix
  • hostnames should start with a letter, have letters/hyphen/digits(maybe?) in the middle and end with 4 digits. Sequences of more then one consecutive hyphen should be forbidden and we could even discuss if we want to accept numbers in the name itself before the NNNN suffix.
  • for wmfNNNN names we should enforce that the name matches the asset tag itself
  • existing names with old policy (e.g. acamar) could be reported as warning, or maybe it's too early and we should do it when there is only a bunch left.
  • additional check might be performed, if they are not already, like checking that the NNNN suffix matches the device's site

It was in deference to old hostnames that this regexp is intentionally loose. It simply enforces the policy of the first character must be a lowercase letter, and the rest could be lowercase letters, numbers or hyphens.

So ideally we'd:

  • Check that all hostnames match our standard pattern (something like "^[a-z][a-z0-9]*(-?[a-z0-9]+)*\d{4}$") or are entirely lowercase letters ("^[a-z]+$"), or are "^wmf\d{3,5}$"
  • If the hostname matches wmfNNNN, check the asset tag for a match
  • If the hostname ends in NNNN and isn't a wmf hostname, check the site matches.

If there is agreement, I can implement this very easily.

I don't mind the additional check, but again, I'm not sure how much is in scope for this task. If we do the extensive check then we should define a policy first, that is not strictly defined yet AFAIK.

Things like:

  • no consecutive hypens
  • it's not true that non NNNN hostnames are letters only, we've things like asw2-a1-eqiad
  • not all NNNN servers end with NNNN, like cloudservices2002-dev
  • should we check all devices or only servers? We've things like cablemgmt-eqiad-D5, storagebin-wmf7201, backup2001-array2, etc..

I guess without imposing too much, the ask was "all hostnames are lowercase" - we could just check if lower(hostname) != hostname and call it good, punting the normalization to a further task.

Change 552874 had a related patch set uploaded (by CRusnov; owner: CRusnov):
[operations/software/netbox-reports@master] coherence: Check device names for correct case

https://gerrit.wikimedia.org/r/552874

As discussed on IRC, the above solution is agreed upon. This should not cause any false positives.

Change 552874 merged by CRusnov:
[operations/software/netbox-reports@master] coherence: Check device names for correct case

https://gerrit.wikimedia.org/r/552874

crusnov closed this task as Resolved.Dec 9 2019, 6:24 AM

afaict this is complete. The patch has been merged which performs the check alluded to in the op.