qemu on the Ganeti clusters was upgraded to qemu 2.5 from jessie-backports to mitigate some AIO-related deadlocks/stalls. qemu 2.7 is now available in jessie-backports, so we should investigate an update. Changelogs are at http://wiki.qemu.org/ChangeLog/2.6 and http://wiki.qemu.org/ChangeLog/2.7
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Revert "Depool poolcounter1001" | operations/mediawiki-config | master | +1 -1
Depool poolcounter1001 | operations/mediawiki-config | master | +1 -1
Event Timeline
From a quick look at the changelogs, 2.7 has nothing backwards-incompatible that should worry us; 2.6, however, does. Specifically:
The aio=native option to "-drive" now requires the cache=none option, instead of silently disabling itself for other cache modes. The newly invalid combination had been warning since QEMU 2.3.
We've set aio=native in the past both as a workaround for QEMU deadlocking issues and for performance reasons. Unfortunately, it seems that Ganeti 2.12 (our version) does not pass cache=none, and neither do later versions.
Possible solutions:
- Patch Ganeti to comply with QEMU 2.6 (this should probably happen anyway and be upstreamed, but maybe we don't want to wait).
- Set disk_cache=none in all of our clusters. That would be a change from the default of cache=writeback, but we are implicitly there anyway, as the QEMU 2.6 changelog says.
- Unset disk_aio=native and use the default of threads. That is certainly an option, but it might incur some performance penalties.
Of the above, disk_cache=none seems like the easiest path forward.
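The constraint and the three options above can be sketched as a small check. This is only an illustration of the rule as quoted from the QEMU 2.6 changelog, not how QEMU or Ganeti implement it; the function name `check_drive_options` is hypothetical, and `None` is assumed to mean "the option is not passed to -drive at all". The real QEMU check may be less strict than this simplified model.

```python
from typing import Optional


def check_drive_options(aio: Optional[str], cache: Optional[str]) -> Optional[str]:
    """Return a problem description, or None if the pair looks acceptable.

    Models only the rule quoted from the QEMU 2.6 changelog:
    aio=native is accepted only together with cache=none.
    None for either argument means the option is not passed at all.
    """
    if aio == "native" and cache != "none":
        shown = "unset" if cache is None else cache
        return f"aio=native requires cache=none (cache is {shown})"
    return None


# The situations discussed above (values are illustrative).
print(check_drive_options("native", None))    # aio=native without cache=none
print(check_drive_options("native", "none"))  # disk_cache=none set cluster-wide
print(check_drive_options("threads", None))   # disk_aio left at threads
```

Only the first call reports a problem, which matches the reasoning that either setting disk_cache=none or dropping disk_aio=native would avoid the incompatibility.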
I've tested this and, as expected, it does break running VMs. I've patched Ganeti and am awaiting review; the PR is up at https://github.com/ganeti/ganeti/pull/43. In the meantime I am going to set the task as stalled.
Your patch is also missing from the Ganeti version in stretch. Shall we report it to the Debian BTS so that it can possibly be backported to a stretch point release?
Yes, there hasn't been any release since that patch, so let's do that. Filed as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=881255
With cache=none now set in all clusters for unrelated reasons, this is unblocked. In the meantime jessie-backports has upgraded to 2.8. Fortunately the changelog[1] does not have any worrying items in it. The upgrade will require a round of VM reboots, but otherwise looks OK. I'll empty an eqiad Ganeti host, upgrade it to 2.8 and move a few VMs to it for testing.
Or we could upgrade the Ganeti cluster to stretch? It provides qemu 2.8 out of the box.
I'd rather not couple the two upgrades. Both need to be done, of course, but I was thinking we first make sure we are OK with 2.8 and then schedule the stretch upgrade.
Mentioned in SAL (#wikimedia-operations) [2018-04-20T07:37:56Z] <akosiaris> upgrade qemu on ganeti2006 to 1:2.8+dfsg-3~bpo8+1 and migrate mwdebug2001 to it T150532
mwdebug2001 showed no problems, so I'll proceed with upgrading the entire codfw cluster. A full cluster VM reboot is to follow.
Mentioned in SAL (#wikimedia-operations) [2018-04-24T10:39:14Z] <akosiaris> upgrade to qemu 2.8 on codfw ganeti cluster. T150532
Mentioned in SAL (#wikimedia-operations) [2018-04-24T10:39:29Z] <akosiaris> starting a very slow rolling reboot of all VMs on codfw ganeti cluster T150532
Mentioned in SAL (#wikimedia-operations) [2018-04-25T07:05:32Z] <akosiaris> starting a very slow rolling reboot of all VMs on codfw ganeti cluster, row_C nodegroup, excluding poolcounter1001 and puppetdb1001. T150532
Change 428894 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/mediawiki-config@master] Depool poolcounter1001
Change 428896 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/mediawiki-config@master] Revert "Depool poolcounter1001"
Change 428897 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/mediawiki-config@master] Add poolcounter1003 to $wmfAllServices
Mentioned in SAL (#wikimedia-operations) [2018-04-25T12:47:09Z] <akosiaris> reboot puppetdb1001 for T150532
Change 428894 merged by Alexandros Kosiaris:
[operations/mediawiki-config@master] Depool poolcounter1001
Mentioned in SAL (#wikimedia-operations) [2018-04-25T13:19:14Z] <akosiaris@tin> Synchronized wmf-config/ProductionServices.php: depool poolcounter1001 T150532 (duration: 01m 17s)
Mentioned in SAL (#wikimedia-operations) [2018-04-25T13:40:46Z] <akosiaris> reboot poolcounter1001 for T150532
Change 428896 merged by jenkins-bot:
[operations/mediawiki-config@master] Revert "Depool poolcounter1001"
Mentioned in SAL (#wikimedia-operations) [2018-04-25T13:49:48Z] <akosiaris@tin> Synchronized wmf-config/ProductionServices.php: repool poolcounter1001 T150532 (duration: 01m 16s)
Mentioned in SAL (#wikimedia-operations) [2018-04-25T14:34:12Z] <akosiaris> reboot bohrium T150532