
How to handle mgmt lan for labs bare metal?
Closed, Declined · Public

Description

Right now our DC setup only supports a single mgmt LAN everywhere. Ideally we would have a separate VLAN for management of labs bare metal nodes.

If we get to Ironic we will definitely need this, as one of Ironic's main features is talking directly to the mgmt console of managed hardware.

Event Timeline

Andrew claimed this task.
Andrew raised the priority of this task from to Medium.
Andrew updated the task description.
Andrew added a project: labs-sprint-118.
Andrew added subscribers: Papaul, Matanya, RobH and 7 others.

I was chatting about this with Andrew. Since all mgmt is on 'dumb' switches, we can't support multiple mgmt VLANs unless we also install multiple mgmt switches.

If we do wish to support multiple mgmt VLANs, we may want to swap these dumb switches for Juniper switches.

The host OS on a machine can talk to its mgmt controller, and could potentially pass commands out of the bare metal VM to other devices over the mgmt network.

If the idea is that these physical boxes are totally under the control of the relevant project admins, we should consider limiting access to the mgmt interface. I guess it is in part a question of responsibility demarcation, and it will probably also shift with Ironic. For now, though, these boxes shouldn't have anything on them we care about more than any other labs instance.

Andrew set Security to None.

Notes from meeting: Ironic will have its own model; for now, the mgmt interface for any labs "hardware" node will be on its own labs mgmt VLAN. It is not to be included in the standard mgmt VLAN for production.

If these exist as bare metal to the OS (the one containing the userspace the labs user is in), then they have direct hardware access. As such, the mgmt network interface cannot connect to our typical mgmt network, which spans ALL systems.

The current management network consists of a single central mgmt switch (a Juniper EX4200, with all the related smart features), connected to rack-level management switches, one per rack. The rack-level management switches are 'dumb' and do not allow easy remote configuration of VLANs. As such, they cannot be used to split the bare metal labs instances onto their own mgmt VLAN.

This leaves us two options:

  • Connect the mgmt ports of the bare metal instances to a port on the access (not mgmt) switch and create a VLAN for that.
    • Drawback(s): Row B already has a very limited number of network ports available, as many labs systems use bonded interfaces. This also places mgmt traffic on the production switches, which is not in keeping with our current standards. This doesn't scale well.
    • Benefit(s): Immediate gratification & unblocking of T117095. Low short-term cost.
      • Further notes: If eqiad upgrades the switch stacks to mirror codfw, the old EX4200s could be used for this mgmt network upgrade.
  • Set up a mgmt network of VLAN-capable rack-level mgmt switches.
    • Drawback(s): Expensive, very expensive.
    • Benefit(s): Further refinement and security control over the mgmt network.
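For a sense of what option 2 buys us: on a VLAN-capable EX-series switch, isolating the labs bare metal mgmt ports is only a few lines of configuration. A minimal JunOS-style sketch, where the VLAN name, VLAN ID, and interface are hypothetical placeholders, not actual allocations:

```
# Hypothetical EX-series config -- vlan name, id, and port are illustrative only
set vlans labs-mgmt vlan-id 2120
set interfaces ge-0/0/10 unit 0 family ethernet-switching port-mode access
set interfaces ge-0/0/10 unit 0 family ethernet-switching vlan members labs-mgmt
```

None of this is possible on the current dumb rack-level switches, which is the whole problem.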

Do we have any docs about the current setup with promethium (promethium.wikitextexp.eqiad.wmflabs is 10.68.16.2)? I noticed these strange production DNS entries:
promethium.eqiad.wmnet has address 10.64.20.12
promethium.mgmt.eqiad.wmnet has address 10.65.3.144

For now this is totally off the books.

If this is totally off the books, can we remove the existing remnants?

Subbu is still using promethium. We have half a plan to clean that up, but in the meantime we'll need to keep some cruft around.

@Dzahn, to help clarify: this task was to make a plan for user mgmt access to bare metal as a service, which we have no plans to do.

Got it, thank you both. Yep!