openstack nova: virsh consoles broken in Ocata
Closed, Resolved, Public

Description

Since the Ocata upgrade I've noticed some changes in how direct console access to VMs behaves. While investigating, I found that new VMs no longer have a /var/lib/nova/instances/<id>/libvirt.xml file. I don't see anything in the code or docs to indicate that this file shouldn't be created in Ocata; rather, I see lots of docs implying that VMs won't start at all if it is missing (which is clearly not the case).

I'm not sure whether this is a problem, but it seems like it might be.

Event Timeline

I see now that those files are in /etc/libvirt/qemu/ instead. Probably fine, but I have no idea why they moved.

commit 0b1548a988cdded059700bb27963a8149e65480a
Author: Kashyap Chamarthy <kchamart@redhat.com>
Date: Tue Jan 22 15:26:54 2019 +0100

libvirt: Bump MIN_{LIBVIRT,QEMU}_VERSION for "Stein"

In commit 28d337b[1], we advertized that the NEXT_MIN_LIBVIRT and
NEXT_MIN_QEMU_VERSION for "Stein" will be:

    libvirt 3.0.0 and QEMU 2.8.0

Now that we are in the development cycle for "Stein", bump the
MIN_{LIBVIRT,QEMU}_VERSION to the above agreed-upon versions.

As part of this clean-up:

  - Remove the _create_file_device() function and the calls to it; it is
    a "no-op" when 'virtlogd' is available.  As a result of this, in
    _create_consoles_s390x(), this patch entirely removes the "sclplm"
    serial console device — otherwise 'virtlogd', which is now always
    available, will create duplicate 'pty' devices ("sclplm" and
    "sclp") pointing to the same log, which results in instance creation
    failure; we don't want that.

  - Remove the requirement for extra serial device (added in commit:
    1f65925: "libvirt: virtlogd: use virtlogd for char devices"), that
    "allows access to a Nova instance via `virsh console <guest>`" in
    _create_pty_device() — it is not required.  I also double-checked
    with libvirt and QEMU developer Daniel Berrangé, who said (slightly
    paraphrasing):

        "Nova should not allow `virsh console` to guests behind its back
        at all.  And especially it should not care about `virsh console`
        working with "tcp".  The point of using "tcp" consoles in Nova
        is that it provides tunneling via the Nova "serial console
        server".  You can only have 1 thing connected to a console at a
        time — so if the Nova serial console is present, `virsh console`
        can't be used anyway."

  - The unit test noise is largely mechanical: remove the superfluous
    serial device, and lower the index of the devices by 1.

The following version constants (and corresponding tests), that are now
no longer required, will be removed in separate patches:

    MIN_LIBVIRT_PARALLELS_SET_ADMIN_PASSWD,
    MIN_LIBVIRT_POSTCOPY_VERSION, MIN_{LIBVIRT,QEMU}_LUKS_VERSION,
    MIN_QEMU_FILE_BACKED_VERSION, MIN_LIBVIRT_PERF_VERSION

[1] http://git.openstack.org/cgit/openstack/nova/commit/?id=28d337b --
    Pick next minimum libvirt / QEMU versions for "Stein"

Change-Id: I408baef12358a83921c4693b847a692f6c19e36f
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>

diff --git a/nova/tests/unit/virt/libvirt/fakelibvirt.py b/nova/tests/unit/virt/libvirt/fakelibvirt.py
index c335b9d..dbd5a84 100644
--- a/nova/tests/unit/virt/libvirt/fakelibvirt.py
+++ b/nova/tests/unit/virt/libvirt/fakelibvirt.py
@@ -159,9 +159,9 @@ VIR_SECRET_USAGE_TYPE_CEPH = 2
 VIR_SECRET_USAGE_TYPE_ISCSI = 3
 
 # Libvirt version to match MIN_LIBVIRT_VERSION in driver.py
-FAKE_LIBVIRT_VERSION = 1003001
+FAKE_LIBVIRT_VERSION = 3000000
 # Libvirt version to match MIN_QEMU_VERSION in driver.py
-FAKE_QEMU_VERSION = 2005000
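
For anyone reading the integer constants in the diff: libvirt reports versions as a single integer, major * 1,000,000 + minor * 1,000 + release. The sketch below decodes the values quoted above under that encoding. The helper names `decode_version` and `encode_version` are made up for illustration and are not part of Nova or libvirt.

```python
def decode_version(encoded: int) -> str:
    """Turn a libvirt-style integer version into 'major.minor.release'."""
    major, remainder = divmod(encoded, 1_000_000)
    minor, release = divmod(remainder, 1_000)
    return f"{major}.{minor}.{release}"


def encode_version(major: int, minor: int, release: int) -> int:
    """Inverse of decode_version: pack the three parts into one integer."""
    return major * 1_000_000 + minor * 1_000 + release


if __name__ == "__main__":
    # Values taken from the diff above.
    for name, value in [
        ("old FAKE_LIBVIRT_VERSION", 1003001),
        ("new FAKE_LIBVIRT_VERSION", 3000000),
        ("old FAKE_QEMU_VERSION", 2005000),
    ]:
        print(f"{name}: {value} -> {decode_version(value)}")
```

Read this way, the fake libvirt version moves from 1.3.1 to 3.0.0, which matches the "libvirt 3.0.0" minimum named in the commit message, and the old fake QEMU version corresponds to 2.5.0.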

Andrew renamed this task from openstack nova: libvirt.xml files no longer created to openstack nova: virsh consoles broken in Ocata. Dec 13 2019, 11:02 AM

agetty can't open the console, so our systemd service trick isn't working, even though the device file exists.

aborrero@buster-boot-arturo-01:~$ sudo systemctl status getty@ttyS1.service
● getty@ttyS1.service - Getty on ttyS1
   Loaded: loaded (/lib/systemd/system/getty@.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/getty@ttyS1.service.d
           └─puppet-override.conf
   Active: active (running) since Mon 2019-12-16 13:35:35 UTC; 6s ago
     Docs: man:agetty(8)
           man:systemd-getty-generator(8)
           http://0pointer.de/blog/projects/serial-console.html
 Main PID: 7219 (agetty)
    Tasks: 1 (limit: 4699)
   Memory: 260.0K
   CGroup: /system.slice/system-getty.slice/getty@ttyS1.service
           └─7219 /sbin/agetty --autologin root --noclear ttyS1 vt220

Dec 16 13:35:35 buster-boot-arturo-01 systemd[1]: Started Getty on ttyS1.
Dec 16 13:35:35 buster-boot-arturo-01 agetty[7219]: /dev/ttyS1: not a tty
aborrero@buster-boot-arturo-01:~$ file /dev/ttyS1
/dev/ttyS1: character special (4/65)
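
The output above shows that a device node can be "character special" and still not behave as a terminal. A rough way to reproduce that distinction from Python is sketched below. It assumes that agetty's "not a tty" message corresponds to an isatty()-style check failing, which I have not verified against the agetty source. `describe_device` is a hypothetical helper name.

```python
import os
import stat


def describe_device(path: str) -> dict:
    """Report whether path is a character device and whether it acts as a tty."""
    st = os.stat(path)
    is_char = stat.S_ISCHR(st.st_mode)
    # O_NOCTTY: never adopt the device as our controlling terminal.
    # O_NONBLOCK: do not block waiting for carrier on a serial line.
    fd = os.open(path, os.O_RDONLY | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        is_tty = os.isatty(fd)
    finally:
        os.close(fd)
    return {
        "char_device": is_char,
        # Device numbers are only meaningful for device nodes.
        "major": os.major(st.st_rdev) if is_char else None,
        "minor": os.minor(st.st_rdev) if is_char else None,
        "tty": is_tty,
    }


if __name__ == "__main__":
    # /dev/null is a character device that is not a terminal, which is
    # the same combination reported for /dev/ttyS1 in the transcript above.
    print(describe_device("/dev/null"))
```

Running it against /dev/ttyS1 inside an affected VM (as root) should show whether the node exists as a character device while failing the tty check, which would suggest that no serial port is actually backing it.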

Change 558284 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Revert "nova.conf ocata: remove [spice] config section"

https://gerrit.wikimedia.org/r/558284

Change 558296 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] cloud base images: enable passwordless login on serial0

https://gerrit.wikimedia.org/r/558296

Change 558284 merged by Andrew Bogott:
[operations/puppet@production] Partially revert "nova.conf ocata: remove [spice] config section"

https://gerrit.wikimedia.org/r/558284

aborrero moved this task from Inbox to Soon! on the cloud-services-team (Kanban) board.

Change 558296 merged by Andrew Bogott:
[operations/puppet@production] cloud base images: enable passwordless login on serial0

https://gerrit.wikimedia.org/r/558296

Change 559014 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] bootstrap-vz buster: rename puppet-overrides.conf to match the stretch filename

https://gerrit.wikimedia.org/r/559014

Change 559014 merged by Andrew Bogott:
[operations/puppet@production] bootstrap-vz buster: rename puppet-overrides.conf to match the stretch filename

https://gerrit.wikimedia.org/r/559014

Andrew claimed this task.

The attached patches get the console working again, although it is now on serial0 instead of serial1. That introduces some new security concerns, which are documented at https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Root_console_access