Guest hang on reboot after migration from bionic to focal

Bug #1896751 reported by Markus Schade
This bug affects 1 person
Affects         Status         Importance   Assigned to   Milestone
qemu (Ubuntu)   Fix Released   Undecided    Unassigned
  Focal         Fix Released   Medium       Unassigned

Bug Description

[Impact]

 * An upstream optimization skips ROM resets on incoming migrations,
   which turned out to break rebooting of guests migrated onto the
   new systems.

 * Backport the fix from 5.0 to Focal to fix reboots for those guests.

[Test Case]

1. spawn guest on bionic
 $ uvt-kvm create --password ubuntu testguest arch=amd64 release=bionic label=daily
2. migrate it over to a focal system
 $ virsh migrate --live testguest qemu+ssh://10.102.141.223/system
3. check on focal if the guest arrived e.g. log in
 $ virsh console testguest
   testguest login: ubuntu
   Password:
4. reboot the guest
 ubuntu@testguest:~$ sudo reboot

Without the fix the guest hangs on reboot; with the fix the reboot succeeds.
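
The host-side part of this test case (steps 1 and 2) could be scripted roughly as below. This is only a sketch and not part of the original report: the helper names run and repro_host_steps and the DRY_RUN variable are hypothetical. By default it only prints the commands taken from the steps above; it is assumed to be started on the bionic source host with DRY_RUN=0 when the commands should really run.

```shell
#!/bin/sh
# Sketch of the host-side steps from the test case.
# DRY_RUN defaults to 1, so the commands are only printed;
# set DRY_RUN=0 to actually run them on the bionic host.
# run() and repro_host_steps() are hypothetical helper names.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_host_steps() {
    guest=$1   # guest name, e.g. testguest
    dest=$2    # address of the focal destination host
    # Step 1: spawn the guest on the bionic host.
    run uvt-kvm create --password ubuntu "$guest" arch=amd64 release=bionic label=daily
    # Step 2: live-migrate it to the focal host.
    run virsh migrate --live "$guest" "qemu+ssh://$dest/system"
    # Steps 3 and 4 (log in on the focal host, then "sudo reboot"
    # inside the guest) are left manual in this sketch.
}

repro_host_steps testguest 10.102.141.223
```

Keeping the reboot itself manual is deliberate: the hang only shows up when the reboot is issued from inside the guest after it has arrived on focal.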

[Regression Potential]

 * The area of guest start/restart is the most likely place for any
   unexpected effects to happen. We will run the tests mentioned above
   plus regression tests.

[Other Info]

 * Added this kind of test to the TODO list of the regression tests.
 * If affected, you can migrate the guest again, even Focal -> fixed Focal,
   and it should then be able to restart.

---

When migrating a guest from a bionic host to a focal host, the guest may hang during reboot. This is due to an upstream optimization that skips ROM reset on incoming migrations.

Full details and patch in:

https://github.com/qemu/qemu/commit/5073b5d3ea303d37f4a8e2ea451d7a2eb1817448

https://bugzilla.redhat.com/show_bug.cgi?id=1809380


Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

This is in v5.0.0 so fixed in Groovy.
But the offending patch is in Focal so there we need a fix.

@Markus - is there anything needed to reproduce (particular guest config) other than migrating from B->F and then rebooting?

Changed in qemu (Ubuntu Focal):
status: New → Triaged
Changed in qemu (Ubuntu):
status: New → Fix Released
Changed in qemu (Ubuntu Focal):
importance: Undecided → Medium
Revision history for this message
Markus Schade (lp-markusschade) wrote :

That's pretty much it. We triggered this fairly reliably, as the qemu versions in bionic and focal pretty much match the ones in the original bug report. The hangs were gone after we added the mentioned patch.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Without the fix:
1. spawn guest on bionic
 $ uvt-kvm create --password ubuntu testguest arch=amd64 release=bionic label=daily
2. migrate it over to a focal system
 $ virsh migrate --live testguest qemu+ssh://10.102.141.223/system
3. check on focal if the guest arrived e.g. log in
 $ virsh console testguest
   testguest login: ubuntu
   Password:
4. reboot the guest
 ubuntu@testguest:~$ sudo reboot

This will hang at
         Starting Reboot...
[ 178.033556] reboot: Restarting system

Without the fix it is stuck there. Using the PPA https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4161 I verified that the fix helps as expected.
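
For automated runs, the hang signature above could be checked against a saved console log. The following is a minimal sketch, assuming the guest's serial console output has already been captured to a file by some other means; the function name reboot_completed is hypothetical. It treats a "Linux version" banner after the last "reboot: Restarting system" line as a completed reboot.

```shell
#!/bin/sh
# reboot_completed LOGFILE
# Exit 0 if a "Linux version" boot banner appears after the last
# "reboot: Restarting system" line in LOGFILE, exit 1 otherwise
# (the guest hung, or no reboot was logged at all).
reboot_completed() {
    awk '
        /reboot: Restarting system/ { seen = 1; booted = 0; next }
        seen && /Linux version/     { booted = 1 }
        END { exit !(seen && booted) }
    ' "$1"
}

# Example with the hang output quoted above.
log=$(mktemp)
printf '%s\n' '         Starting Reboot...' \
              '[ 178.033556] reboot: Restarting system' > "$log"
if reboot_completed "$log"; then
    echo "reboot completed"
else
    echo "guest appears hung"
fi
rm -f "$log"
```

A real check would also need a timeout, since a slow reboot and a hang look the same until the kernel banner shows up.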

I usually test "virsh shutdown" restart cycles.
This suggests it might be worth adding
a) sudo reboot cycles
b) sudo reboot cycles after migration
This might exercise quite a few code paths ...

description: updated
Changed in qemu (Ubuntu Focal):
status: Triaged → In Progress
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

SRU Template for qemu added and MP linked to fix this in Ubuntu 20.04

Revision history for this message
Robie Basak (racb) wrote : Please test proposed package

Hello Markus, or anyone else affected,

Accepted qemu into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:4.2-3ubuntu6.7 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in qemu (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (qemu/1:4.2-3ubuntu6.7)

All autopkgtests for the newly accepted qemu (1:4.2-3ubuntu6.7) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

casper/1.445.1 (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#qemu

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Migrated bionic to focal
  $ virsh migrate --live testguest3 qemu+ssh://10.102.141.223/system
Then on Focal rebooting from "inside" the guest
  $ virsh console testguest3
  login and issue "sudo reboot"

1. hanging without the fix

2. upgraded to -proposed
This was harder than it should be: while https://launchpad.net/ubuntu/+source/qemu/1%3A4.2-3ubuntu6.7/+publishinghistory says it has been published for 16 hours, I can't "see" it in focal-proposed.
I do see other things in proposed.
I checked with the teams and there was indeed an issue to resolve.
Until then I'll fetch and use the packages from focal-proposed directly (without apt)
https://launchpad.net/ubuntu/+source/qemu/1:4.2-3ubuntu6.7/+build/20089312

3. tested again, now working
...
[ OK ] Reached target Final Step.
         Starting Reboot...
         Stopping Monitoring of LVM2 mirrors…ng dmeventd or progress polling...
[ 618.542301] reboot: Restarting system
[ 0.000000] Linux version 4.15.0-118-generic (buildd@lgw01-amd64-039) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #119-Ubuntu SMP Tue Sep 8 12:30:01 UTC 2020 (Ubuntu 4.15.0-118.119-generic 4.15.18)
...

For some extra safety I also checked the reboot of a guest spawned on Focal - that worked as well.

Setting verified

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:4.2-3ubuntu6.7

---------------
qemu (1:4.2-3ubuntu6.7) focal; urgency=medium

  * d/p/ubuntu/lp-1882774-*: add newer EPYC processor types (LP: #1887490)
  * d/p/u/lp-1896751-exec-rom_reset-Free-rom-data-during-inmigrate-skip.patch:
    fix reboot after migration (LP: #1896751)
  * d/p/u/lp-1849644-io-channel-websock-treat-binary-and-no-sub-protocol-.patch:
    fix websocket compatibility with newer versions of noVNC (LP: #1849644)

 -- Christian Ehrhardt <email address hidden> Mon, 27 Jul 2020 11:45:26 +0200

Changed in qemu (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for qemu has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
