
cloud: current nova-fullstack mechanism requires cloudcontrol nodes to access individual VMs
Closed, Resolved, Public

Description

As part of our effort to reduce Cloud NAT exceptions (see parent task and T272395), we discovered that the nova-fullstack mechanism accesses individual VMs using SSH.

For now we added an ACL exception to allow this, but that's not a long-term solution (see T272486).

Some random ideas for a longer-term solution:

  • Allocate a floating IP and have nova-fullstack use it for SSH. This floating IP is reused for every test.
    • Perhaps we need a couple of floating IPs for flapping resilience...
    • This approach is expensive in terms of public IPv4 addresses.
  • Rewrite nova-fullstack so that it doesn't do SSH tests
    • or do them using a mechanism other than a direct SSH connection (perhaps console access?)
  • Use a CloudVPS bastion as jump host.
    • Simple, direct and to the point.
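
To make the jump-host idea concrete, here is a minimal sketch of how a test script could route its SSH check through a bastion using OpenSSH's ProxyJump option. This is only an illustration, not the code in nova_fullstack_test.py: the function names, the `testuser` account and the IP addresses are all hypothetical placeholders, and it assumes key-based, non-interactive authentication is already set up for both hops.

```python
import shlex
import subprocess


def build_ssh_command(vm_ip, remote_command, bastion_ip=None,
                      user="testuser", timeout=10):
    """Return an argv list that runs remote_command on vm_ip.

    If bastion_ip is given, the connection is tunnelled through it with
    OpenSSH's ProxyJump option instead of going to the VM directly.
    The user name is a hypothetical placeholder.
    """
    cmd = [
        "ssh",
        "-o", "BatchMode=yes",              # never prompt for a password
        "-o", f"ConnectTimeout={timeout}",
    ]
    if bastion_ip:
        cmd += ["-o", f"ProxyJump={user}@{bastion_ip}"]
    cmd += [f"{user}@{vm_ip}", remote_command]
    return cmd


def run_ssh_check(vm_ip, bastion_ip=None):
    """Run a trivial command on the VM and report whether SSH succeeded."""
    cmd = build_ssh_command(vm_ip, "true", bastion_ip=bastion_ip)
    result = subprocess.run(cmd, capture_output=True, timeout=60)
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder addresses, not real hosts.
    argv = build_ssh_command("172.16.0.10", "true", bastion_ip="192.0.2.1")
    print(shlex.join(argv))
```

With this shape, the cloudcontrol node only needs to reach the bastion's public IP, so no ACL exception towards individual VM addresses is required.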

Event Timeline

aborrero triaged this task as Medium priority. Jan 21 2021, 12:10 PM
aborrero created this task.
aborrero removed a project: Epic.
aborrero moved this task from Inbox to Needs discussion on the cloud-services-team (Kanban) board.
aborrero updated the task description.

Change 660613 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] nova-fullstack test: use a bastion proxy for ssh tests

https://gerrit.wikimedia.org/r/660613

Change 660615 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] nova_fullstack_test.py: move to python3; fix pep8 violations

https://gerrit.wikimedia.org/r/660615

Change 660615 merged by Andrew Bogott:
[operations/puppet@production] nova_fullstack_test.py: move to python3; fix pep8 violations

https://gerrit.wikimedia.org/r/660615

Change 660613 merged by Andrew Bogott:
[operations/puppet@production] nova-fullstack test: use a bastion proxy for ssh tests

https://gerrit.wikimedia.org/r/660613

Change 660617 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] nova-fullstack test: fix the name of the new bastion-ip arg

https://gerrit.wikimedia.org/r/660617

Change 660617 merged by Andrew Bogott:
[operations/puppet@production] nova-fullstack test: fix the name of the new bastion-ip arg

https://gerrit.wikimedia.org/r/660617

Change 660618 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] nova_fullstack: one more python3 fix

https://gerrit.wikimedia.org/r/660618

Change 660618 merged by Andrew Bogott:
[operations/puppet@production] nova_fullstack: one more python3 fix

https://gerrit.wikimedia.org/r/660618

This test now uses a bastion on a public IP. We can now remove the ACL exception, assuming I didn't miss something.

Change 660639 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Update the nova-fullstack monitoring to expect python3

https://gerrit.wikimedia.org/r/660639

Change 660639 merged by Andrew Bogott:
[operations/puppet@production] Update the nova-fullstack monitoring to expect python3

https://gerrit.wikimedia.org/r/660639

Change 660641 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] nova-fullstack: fix puppet cert cleanup check to use bastion

https://gerrit.wikimedia.org/r/660641

Change 660641 merged by Andrew Bogott:
[operations/puppet@production] nova-fullstack: fix puppet cert cleanup check to use bastion

https://gerrit.wikimedia.org/r/660641

Change 660643 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] nova-fullstack: remove test for puppet cert cleanup

https://gerrit.wikimedia.org/r/660643

Change 660643 merged by Andrew Bogott:
[operations/puppet@production] nova-fullstack: remove test for puppet cert cleanup

https://gerrit.wikimedia.org/r/660643

I'm handing this back to @arturo to change the ACL and see what breaks

Change 681316 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/homer/public@master] firewall: cloud-in: drop nova-fullstack term

https://gerrit.wikimedia.org/r/681316

Change 681316 merged by Arturo Borrero Gonzalez:

[operations/homer/public@master] firewall: cloud-in4: drop cloudcontrol-novafullstack term

https://gerrit.wikimedia.org/r/681316