Marrying live migration and device assignment

Registered by Elena Zannoni on 2012-05-05

This blueprint has been superseded. See the newer blueprint "Network Virtualization and Lightning Talks" for updated plans.

This session is scheduled together with "VFIO - are we there yet?":
https://blueprints.launchpad.net/lpc/+spec/lpc2012-virt-vfio

Device assignment has been around in virtualization for quite some time now. It's a nice technique to squeeze as much performance out of your hardware as possible, and with the advent of SR-IOV it's even possible to pass a "virtualized" fraction of your real hardware to a VM rather than the whole card.

The problem, however, is that you lose a pretty substantial piece of functionality: live migration.

The most common approach to counter this for networking is to pass two NICs to the VM: one that's emulated in software and one that's the actual assigned device. It's the guest's responsibility to treat the two as a union (for example, by bonding them), and the host needs to be configured so that packets flow the same way through both paths. When migrating, the assigned device gets hot-unplugged and a new one is plugged back in on the new host. However, that means we're exposing crucial implementation details of the VM to the guest: it knows when it gets migrated.
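The sequence above can be sketched as a small model. This is only an illustration, not how any hypervisor implements it: the names Nic, GuestBond, migrate, "vf0", "vf1" and "emu0" are hypothetical, and the guest-side union is assumed to behave like an active-backup pair that prefers the assigned device.

```python
from dataclasses import dataclass, field


@dataclass
class Nic:
    """One guest-visible NIC; `assigned` marks a passed-through device."""
    name: str
    assigned: bool
    present: bool = True


@dataclass
class GuestBond:
    """Hypothetical active-backup pairing of an assigned and an emulated NIC."""
    nics: list = field(default_factory=list)

    def active(self):
        # Prefer the assigned (fast) device; fall back to the emulated one.
        candidates = [n for n in self.nics if n.present]
        candidates.sort(key=lambda n: not n.assigned)
        return candidates[0] if candidates else None


def migrate(bond, new_assigned_name):
    """Walk through the migration steps and record which NIC carries traffic."""
    trace = [bond.active().name]                  # before migration
    for nic in bond.nics:
        if nic.assigned:
            nic.present = False                   # hot-unplug on the source host
    trace.append(bond.active().name)              # while migrating
    bond.nics = [n for n in bond.nics if not n.assigned]
    bond.nics.append(Nic(new_assigned_name, assigned=True))  # hot-plug on the new host
    trace.append(bond.active().name)              # after migration
    return trace


bond = GuestBond([Nic("vf0", assigned=True), Nic("emu0", assigned=False)])
trace = migrate(bond, "vf1")
print(trace)
```

The trace shows traffic moving from the assigned device to the emulated one and then to a different assigned device. The guest has to handle both hotplug events itself, which is exactly the detail the paragraph above objects to.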

Another approach is to do the above, but combine everything in a single guest driver, so it ends up invisible to the guest OS. That quickly becomes a nightmare too, because you need to reimplement network drivers for your specific guest driver infrastructure, at which point you're most likely violating the GPL anyway.

So what if we restrict ourselves to a single NIC type? We could pass in an emulated version of that NIC into our guest, or pass through an assigned device. They would behave the same. That also means that during live migration, we could switch between emulated and assigned modes without the guest even realizing it.
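A minimal sketch of that idea, under strong assumptions: the guest always sees one NIC of a fixed model, and the host swaps the backend behind it. Backend, GuestNic, switch_backend, the model name "example-nic", the register names and the MAC address are all hypothetical. The sketch also assumes the complete device state can be saved from one backend and loaded into the other, which the proposal does not claim.

```python
class Backend:
    """Hypothetical backend holding the register state the guest can observe."""

    def __init__(self, kind):
        self.kind = kind          # "assigned" or "emulated"
        self.regs = {}

    def save(self):
        return dict(self.regs)

    def load(self, state):
        self.regs = dict(state)


class GuestNic:
    """The single NIC type the guest sees, whatever backs it on the host."""

    def __init__(self, model, mac, backend):
        self.model = model
        self.mac = mac
        self._backend = backend

    def write_reg(self, name, value):
        self._backend.regs[name] = value

    def switch_backend(self, new_backend):
        # Carry the device state over so the guest's driver keeps working.
        new_backend.load(self._backend.save())
        self._backend = new_backend

    def backend_kind(self):
        return self._backend.kind

    def guest_view(self):
        # Everything the guest can observe: identity plus register contents.
        return (self.model, self.mac, tuple(sorted(self._backend.regs.items())))


nic = GuestNic("example-nic", "52:54:00:12:34:56", Backend("assigned"))
nic.write_reg("rx_ring_base", 0x1000)
nic.write_reg("irq_mask", 0x3)

before = nic.guest_view()
nic.switch_backend(Backend("emulated"))   # on the source host, before migrating
during = nic.guest_view()
nic.switch_backend(Backend("assigned"))   # on the destination host
after = nic.guest_view()
print(before == during == after)
```

In this model the guest-visible view is identical before, during and after the switch, so no hotplug event ever reaches the guest.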

But maybe others have more ideas on how to improve the situation? The less guest-intrusive a solution is, the better it usually becomes. And if it extends to storage, even better.

Required attendees
Peter Waskiewicz
Alex Williamson

Topic Lead: Alexander Graf
Alexander has been a steady and long-time contributor to the QEMU and KVM projects. He maintains the PowerPC and s390x parts of QEMU as well as the PowerPC port of KVM. He tends to become active whenever areas seem weird enough for nobody else to touch them, such as nested virtualization, Mac OS virtualization or AHCI. Recently, he has also been involved in kicking off openSUSE for ARM. His motto is なんとかなる.

Blueprint information

Status: Complete
Approver: None
Priority: Undefined
Drafter: None
Direction: Needs approval
Assignee: None
Definition: Superseded
Series goal: None
Implementation: Unknown
Milestone target: None
Completed by: Grant Likely on 2012-07-30
