
Interactive consoles?
Closed, Resolved · Public

Description

We currently have support for pulling console output from labs instances, but no way to write back to the consoles.
At a brief glance there seems to be some support for this in OpenStack; it may be something nice to look into at some point in the future.
https://wikitech.wikimedia.org/wiki/OpenStack#Get_an_interactive_console_to_an_instance_from_a_host suggests this is a possibility for people with access to the labs hosts themselves, but I have no idea how up to date that is.
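For reference, a minimal sketch of what that host-side access might look like, assuming the instance shows up as a libvirt domain on the virt host (the hostname and domain name below are placeholders, not taken from the wiki page):

$ ssh <labvirt host>.eqiad.wmnet
$ sudo virsh list --all             # find the libvirt domain, e.g. i-000012ab
$ sudo virsh console i-000012ab     # attach to the console; detach again with ^]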

Event Timeline

See also T64847 for technical difficulties Labs ops are experiencing.

Maybe we can merge with T64847 somehow.

On the Horizon dashboard, the instances have a Console tab which queries:

https://horizon.wikimedia.org/project/instances/<INSTANCE_UUID>/?tab=instance_details__log

That times out after one minute, though, with the following payload:

<h3>Instance Console</h3>

<p class='alert alert-danger'>console is currently unavailable. Please try again later.
<a class='btn btn-default btn-xs' href="/project/instances/<INSTANCE_UUID>/">Reload</a></p>

Using the CLI:

$ openstack --debug console url show  ci-jessie-wikimedia-61523
DEBUG: keystoneclient.session REQ: curl -g -i -X POST \
http://labnet1002.eqiad.wmnet:8774/v2/contintcloud/servers/<INSTANCE_UUID>/action \
-H "User-Agent: python-novaclient" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "X-Auth-Token: {SHA1}xxxxxxxxxxxxxx" \
-d '{"os-getVNCConsole": {"type": "novnc"}}'
DEBUG: urllib3.connectionpool "POST /v2/contintcloud/servers/<INSTANCE_UUID>/action HTTP/1.1" 500 128
DEBUG: keystoneclient.session RESP:
ERROR: openstack The server has either erred or is incapable of performing the requested operation. (HTTP 500)

The trace:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cliff/app.py", line 303, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/dist-packages/cliff/display.py", line 91, in run
    column_names, data = self.take_action(parsed_args)
  File "/usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py", line 115, in take_action
    data = server.get_vnc_console(parsed_args.url_type)
  File "/usr/lib/python2.7/dist-packages/novaclient/v2/servers.py", line 71, in get_vnc_console
    return self.manager.get_vnc_console(self, console_type)
  File "/usr/lib/python2.7/dist-packages/novaclient/v2/servers.py", line 642, in get_vnc_console
    {'type': console_type})[1]
  File "/usr/lib/python2.7/dist-packages/novaclient/v2/servers.py", line 1238, in _action
    return self.api.client.post(url, body=body)
  File "/usr/lib/python2.7/dist-packages/keystoneclient/adapter.py", line 176, in post
    return self.request(url, 'POST', **kwargs)
  File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 96, in request
    raise exceptions.from_response(resp, body, url, method)

But that one seems to reach the console via VNC. That would require VNC to be installed on the instances...

Looks like one needs to install the openstack-nova-serialproxy service. There are some details on http://blog.oddbit.com/2014/12/22/accessing-the-serial-console-of-your-nova-servers/

The web-based serial console spec is https://specs.openstack.org/openstack/nova-specs/specs/juno/implemented/serial-ports.html, apparently implemented in the Juno release.
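For context, the serial console support described in that spec is configured through the [serial_console] section of nova.conf; a rough sketch of the settings involved (hostnames and ports here are placeholders, not our actual config):

# on each compute host
[serial_console]
enabled = true
base_url = ws://<serialproxy public hostname>:6083/
proxyclient_address = <compute host IP>

# on the host running the openstack-nova-serialproxy service
[serial_console]
serialproxy_host = 0.0.0.0
serialproxy_port = 6083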

chasemp triaged this task as Medium priority. Apr 4 2016, 1:55 PM

Yeah, I also found https://ask.openstack.org/en/question/60328/nova-get-serial-console-produces-http-400-error/

Thing is, it apparently requires nova-serialproxy to be installed somewhere, and I'm not sure which host would be most appropriate. In labtest, I did try labtestcontrol2001, but that wants other important packages (nova-common, nova-conductor, nova-scheduler, python-nova) to be upgraded as well, and even though it's labtest I don't want to potentially leave a broken mess that most likely only other people would know how to clean up.
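For what it's worth, once nova-serialproxy is running somewhere, requesting a console should just be a client call along these lines (a sketch; command and flag availability depends on the client version):

$ nova get-serial-console ci-jessie-wikimedia-61523
# returns a ws:// URL served by nova-serialproxy that a websocket client
# (or Horizon's serial console tab) can then attach to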

Change 301177 had a related patch set uploaded (by Yuvipanda):
Add domain labtestspice.wikimedia.org

https://gerrit.wikimedia.org/r/301177

Change 301294 had a related patch set uploaded (by Andrew Bogott):
Set up spice-based remote consoles for Labs instances

https://gerrit.wikimedia.org/r/301294

Proposed: add a root password (managed like a prod password) but also modify policy files so that the Console tab is only visible for people with the admin keystone right.
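As a sketch of the policy half of that proposal, the override would presumably look something like the following in Nova's policy.json (the exact rule name varies between releases, so treat this as illustrative only):

# restrict creation of remote consoles to admins
"os_compute_api:os-remote-consoles": "rule:admin_api"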

Yep, discussed in the main ops meeting and we really need this ability since we have several issues under investigation that involve servers without other avenues of access.

For posterity, we are only discussing "cloud" admin/root access at this time, not project admin/user access. Though the former could be a pilot for the latter, that has not been discussed.

Change 301177 merged by Andrew Bogott:
Add domain labtestspice.wikimedia.org

https://gerrit.wikimedia.org/r/301177

bd808 subscribed.

I spent some time messing around with virsh console today and figured out how to get an instance set up so that it actually works:

  • Enable a getty for ttyS1 on the instance (this should be added to the base instance somehow; see the sketch after the transcript below):
    • sudo systemctl enable serial-getty@ttyS1.service
    • sudo systemctl start serial-getty@ttyS1.service
  • Attach to the console from the cloudvirt running the instance:
    • sudo virsh console --devname serial1 i-...
    • The proper 'i-....' value can be found using openstack server show
$ ssh root@cloudcontrol1003.wikimedia.org
$ source ~/novaenv.sh
$ OS_PROJECT_ID=openstack openstack server list |grep openstack-virsh-test
| 884ed5eb-9076-4612-bea9-82cf6d012404 | openstack-virsh-test      | ACTIVE | lan-flat-cloudinstances2b=172.16.6.172               |
$ OS_PROJECT_ID=openstack openstack server show 884ed5eb-9076-4612-bea9-82cf6d012404 | grep instance_name
| OS-EXT-SRV-ATTR:instance_name        | i-00003524                                                |
$ exit
$ ssh cloudvirt1027.eqiad.wmnet
$ sudo virsh console --devname serial1 i-00003524
Connected to domain i-00003524
Escape character is ^]

Debian GNU/Linux 9 openstack-virsh-test ttyS1

openstack-virsh-test login: bd808
Password:
Last login: Tue Jan 22 02:19:18 UTC 2019 from 172.16.1.136 on pts/0
Linux openstack-virsh-test 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64
Debian GNU/Linux 9.6 (stretch)
The last Puppet run was at Tue Jan 22 03:19:04 UTC 2019 (6 minutes ago).
bd808@openstack-virsh-test:~$
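As a follow-up to the first bullet above, a minimal sketch of how the getty could be baked into the base image instead of being enabled by hand (the exact image-build hook is an assumption on my part):

# run inside the image chroot during the build; enabling is enough,
# the unit is then started automatically on every boot
systemctl enable serial-getty@ttyS1.service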

This doesn't solve the original problem from this ticket of giving a project admin access to this console, but it does get us closer to being able to attach to running instances in the way attempted in T64847: Instance console does not gives output / keystroke access. We have ripped out the code that used to set a password for the root user, so this is just an incremental step at this point. If the instance's networking is broken such that it cannot communicate with the LDAP server, then we still have no local user/password to get past the login prompt.

(PS: before anyone freaks out, I changed my LDAP password immediately after this test.)

As part of T215211 we now have consoles for most VMs, available via login on the associated cloudvirt. I'm pretty sure this is where we should leave it... adding an equivalent feature via a web UI opens up more security concerns than I really want to tackle.

bd808 lowered the priority of this task from Medium to Lowest. Feb 20 2019, 4:32 PM

@Krenair and I requested the feature for a console. My use case was for someone to be able to attach to an instance. Ideally as a project admin via Horizon, but knowing that #WMCS people are able to do so is enough for me.

If there are security concerns, I would rather just flag this task as resolved (our super admin can reach the console) rather than drag this task at lowest priority for years and years ;-)

The security concern is basically that the solution we came up with leaves an active root shell attached to a getty in the running instance. Other solutions for interactive login on a management port would have required either a root password set in the /etc/shadow file (which would mean we have to either generate and store one for each instance individually or use a shared password for all instances) or LDAP password auth on that getty. Managing root passwords is an ugly problem to solve in a multi-tenant environment. The LDAP auth solution would only work if the instance was healthy enough to talk to the LDAP directory, and it would also probably mean LDAP passwords transiting the OpenStack tenant networks, which we do not completely trust.

It would still be possible to hook the SPICE web GUI up to this new always-on root shell in each instance, but we are a bit worried about the security profile of that as well. We would basically have to audit, or implicitly trust, that the API OpenStack provides for this applies authorization checks that prevent a cross-tenant attack using that interface.

I'm happy to call this complete with the current cli access directly from the cloudvirt hosts. It's not an ideal self-service solution, but it at least does give us some rescue capability for instances.

Thanks for the detailed information :-]

solution [..] at least does give us some rescue capability for instances.

Which fulfills my original request (the ability to connect to an unreachable instance, T215211). So yeah, I consider it fixed for me, and offering a console to project admins should be declined as "too much hassle to set it up right".