Consolidate edge bastion server into ganeti
Closed, Resolved · Public

Description

Things to look into here and various notes:

  • Security - are we ok with ssh bastions inside ganeti alongside other public service instances?
  • Installer issues - DHCP stuff will be ok?
  • Public networking for edge ganetis is configured?

Event Timeline

BBlack triaged this task as Medium priority. Jul 7 2020, 2:42 PM
BBlack created this task.

Security - are we ok with ssh bastions inside ganeti alongside other public service instances?

Sounds fine to me. As long as we have two baremetal bastions in eqiad/codfw which can access all the internal nodes in all edges, I don't see any issue with running the bastions on Ganeti.

Installer issues - DHCP stuff will be ok?

That's sorted, like other instances being installed in Ganeti.

Public networking for edge ganetis is configured?

We already have edge Ganeti instances with a public IP (e.g. install3001.wikimedia.org). I think the setup is still WIP, but it should be working in general.

Change 655450 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Add bast3005

https://gerrit.wikimedia.org/r/655450

Change 655450 merged by Muehlenhoff:
[operations/puppet@production] Add bast3005

https://gerrit.wikimedia.org/r/655450

Change 656172 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Add bast4003/bast5002

https://gerrit.wikimedia.org/r/656172

Change 656172 merged by Muehlenhoff:
[operations/puppet@production] Add bast4003/bast5002

https://gerrit.wikimedia.org/r/656172

Change 656380 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Make bast4003/bast5002 bastion hosts

https://gerrit.wikimedia.org/r/656380

Change 656380 merged by Muehlenhoff:
[operations/puppet@production] Make bast4003/bast5002 bastion hosts

https://gerrit.wikimedia.org/r/656380

I've created new bastions in Ganeti (bast3005, bast4003, bast5002), which are working fine. I'll send out an announcement to the ops list next week and eventually we can free up the former baremetal servers currently serving as bastions (and reduce our setup for the forthcoming second EU data centre).

I also had a look at the current hardware used for bastions:

  • bast3004: Procured Sep 2019, same specs as DNS/ganeti/LVS servers
  • bast4002: Procured May 2017, same specs as DNS/LVS servers (ganeti more recent)
  • bast5001: Procured Oct 2017, same specs as DNS/LVS servers (ganeti more recent)

We don't really need additional capacity in the cache site Ganeti clusters, so my proposal would be to decom the current hardware bastions and keep the servers as spares. They could serve as drop-in replacements for the DNS/LVS/Ganeti servers in case of something like a mainboard failure (and we wouldn't even need remote hands).

We actually do have some upcoming projects which might necessitate more Ganeti capacity. In general the plan is to move all the non-ganeti DNS boxes into ganeti if possible, and to spin up DoH instances in ganeti everywhere as well (which may turn out to need multiple instances and have real scaling issues). But we don't need more capacity there just yet, and so long as they're kept powered up as online spares, we can always deal with the decision to move them into the cluster at a later time.

Sounds good. We can also easily integrate those into Ganeti later (with reduced weight in ulsfo/eqsin compared to the other Ganeti nodes).

Change 656894 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Disable bast3004/bast4002/bast5001 as bastions

https://gerrit.wikimedia.org/r/656894

Change 656895 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Update bastions in smokeping config

https://gerrit.wikimedia.org/r/656895

Change 656895 merged by Muehlenhoff:
[operations/puppet@production] Update bastions in smokeping config

https://gerrit.wikimedia.org/r/656895

Change 656894 merged by Muehlenhoff:
[operations/puppet@production] Disable bast3004/bast4002/bast5001 as bastions

https://gerrit.wikimedia.org/r/656894
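
For anyone who still has one of the retired bastions hard-coded in a local SSH config, the switch can be sketched roughly as below. This is only an illustration, not anything from the puppet changes: the host names are the ones listed in this task, but pairing old and new hosts by the leading site digit, the helper names, and the `.wikimedia.org` FQDNs in the usage note are assumptions.

```python
import re

# Host names taken from this task. Pairing them by the leading site
# digit (3xxx, 4xxx, 5xxx) is an assumption, not something stated here.
RETIRED = ["bast3004", "bast4002", "bast5001"]
REPLACEMENTS = ["bast3005", "bast4003", "bast5002"]


def build_mapping(retired, replacements):
    """Pair each retired bastion with the replacement at the same site."""
    # Index 4 is the first digit after the "bast" prefix.
    by_site = {name[4]: name for name in replacements}
    return {old: by_site[old[4]] for old in retired}


def migrate_ssh_config(text, mapping):
    """Hypothetical helper: swap retired bastion names in config text."""
    pattern = re.compile(r"\b(bast\d{4})\b")
    # Names that are not in the mapping are left untouched.
    return pattern.sub(lambda m: mapping.get(m.group(1), m.group(1)), text)


MAPPING = build_mapping(RETIRED, REPLACEMENTS)
```

Calling `migrate_ssh_config("ProxyJump bast4002.wikimedia.org", MAPPING)` would rewrite the line to point at bast4003, while lines that already reference one of the new hosts pass through unchanged.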

RobH mentioned this in Unknown Object (Task). Mar 16 2021, 5:44 PM