Comment 1 for bug 2009048

Mauricio Faria de Oliveira (mfo) wrote :

Test Steps
---

1) Create cloud-init data ISO to initialize cloud images:

 cat >meta-data <<EOF
 instance-id: iid-local01
 local-hostname: qemu-vm
 EOF

 cat >user-data <<EOF
 #cloud-config
 password: passw0rd
 chpasswd: { expire: False }
 ssh_pwauth: True
 EOF

 genisoimage -output cloud-init-data.iso -volid cidata -joliet -rock user-data meta-data

2) Download cloud image, and create larger disk image:

 for RELEASE in kinetic jammy focal bionic; do
   wget https://cloud-images.ubuntu.com/$RELEASE/current/$RELEASE-server-cloudimg-amd64.img

   qemu-img create -f qcow2 -F qcow2 -b $RELEASE-server-cloudimg-amd64.img $RELEASE.qcow2 8G
 done

Now, for each RELEASE:

3) Run QEMU to emulate an AMD IOMMU system, with an audio device to be used for PCI passthrough.

(We'll run QEMU inside this VM to verify the changes, as if it were running on a bare-metal AMD system.)

 qemu-system-x86_64 \
   -accel kvm -machine q35 -smp 1 -m 4G \
   -nodefaults -nographic -no-user-config \
   -serial stdio \
   \
   -drive file=$RELEASE.qcow2,if=virtio \
   -drive file=cloud-init-data.iso,if=virtio,read-only=on,driver=raw \
   \
   -netdev user,id=net0,hostfwd=tcp:127.0.0.1:2222-:22 \
   -device virtio-net,netdev=net0 \
   \
   -cpu EPYC-v1 \
   -device amd-iommu \
   -device intel-hda

 ...
 login: ubuntu
 password: passw0rd
 ...
 (watch out: ctrl-c on this console terminates QEMU)

 or
 ssh ubuntu@127.0.0.1 -p 2222 # passw0rd

...

4) We'll need qemu-system-x86_64 and virsh/libvirt

 sudo apt update && sudo apt install --yes --no-install-recommends qemu-system libvirt-daemon-system
 logout # login again

5) Check the emulated hardware (and reserved ranges just below 1TiB on all IOMMU groups)

 $ grep 'AMD EPYC' /proc/cpuinfo
 model name : AMD EPYC Processor

 $ lspci | grep -i iommu
 00:02.0 IOMMU: Advanced Micro Devices, Inc. [AMD] Device 0010

 $ grep reserved /sys/kernel/iommu_groups/*/reserved_regions
 /sys/kernel/iommu_groups/0/reserved_regions:0x000000fd00000000 0x000000ffffffffff reserved
 /sys/kernel/iommu_groups/1/reserved_regions:0x000000fd00000000 0x000000ffffffffff reserved
 /sys/kernel/iommu_groups/2/reserved_regions:0x000000fd00000000 0x000000ffffffffff reserved
 /sys/kernel/iommu_groups/3/reserved_regions:0x000000fd00000000 0x000000ffffffffff reserved
 /sys/kernel/iommu_groups/4/reserved_regions:0x000000fd00000000 0x000000ffffffffff reserved
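
To make "just below 1TiB" concrete, the boundaries of the reserved range can be converted to GiB. This is only a small arithmetic sketch: the two addresses are copied from the output above, and the variable names are made up for illustration.

```shell
# Boundaries copied from the reserved_regions output above.
start=0x000000fd00000000
end=0x000000ffffffffff

# Convert to GiB (1 GiB = 2^30 bytes); the end address is inclusive.
start_gib=$(( start / (1 << 30) ))
size_gib=$(( (end + 1 - start) / (1 << 30) ))

echo "reserved range starts at ${start_gib} GiB and spans ${size_gib} GiB"
```

So the reserved range covers the last 12 GiB below 1 TiB, starting at 1012 GiB.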

6) Configure the audio device for PCI passthrough

 $ lspci | grep -i audio
 00:03.0 Audio device: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) High Definition Audio Controller (rev 01)

 $ ls /sys/bus/pci/devices/0000:00:03.0/iommu_group/devices/
 0000:00:03.0

 PCI=0000:00:03.0

 sudo modprobe vfio-pci
 echo vfio-pci | sudo tee /sys/bus/pci/devices/$PCI/driver_override

 echo $PCI | sudo tee /sys/bus/pci/devices/$PCI/driver/unbind 2>/dev/null; sleep 1
 echo $PCI | sudo tee /sys/bus/pci/drivers/vfio-pci/bind; sleep 1

 echo 1 | sudo tee /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts
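
Before moving on, it may help to confirm that the device really ended up bound to vfio-pci. A minimal sketch, assuming the usual sysfs layout where the "driver" entry of a device is a symlink to its driver directory; bound_driver and its optional second argument (an alternative sysfs root) are hypothetical helpers, not part of any tool.

```shell
# Print the name of the driver currently bound to a PCI device.
#   $1: PCI address, e.g. 0000:00:03.0
#   $2: sysfs root (hypothetical parameter; defaults to /sys)
bound_driver() {
  local link="${2:-/sys}/bus/pci/devices/$1/driver"
  if [ -L "$link" ]; then
    # The symlink target ends in the driver name.
    basename "$(readlink "$link")"
  else
    # No driver symlink: the device is not bound to any driver.
    echo none
  fi
}
```

After the bind above, `bound_driver $PCI` would be expected to print vfio-pci.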

7) Enable memory overcommit to allow early start of VM w/ 1TB RAM:

 echo 1 | sudo tee /proc/sys/vm/overcommit_memory

8) Verify the error with memory size above limit:

 $ sudo qemu-system-x86_64 -nographic -device vfio-pci,host=$PCI -m 1035265
 qemu-system-x86_64: -device vfio-pci,host=0000:00:03.0: VFIO_MAP_DMA failed: Invalid argument
 qemu-system-x86_64: -device vfio-pci,host=0000:00:03.0: vfio 0000:00:03.0: failed to setup container for group 2: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x555f589100d0, 0x100000000, 0xfc00100000, 0x7e5b41400000) = -22 (Invalid argument)
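
The numbers in the error line up with the reserved range from step 5. A small sketch of the check, taking the second and third vfio_dma_map() arguments as the IOVA and the size of the mapping; check_mapping is a made-up helper name.

```shell
# Start of the reserved range reported in step 5.
reserved_start=0xfd00000000

# Print "fits" if the mapping [iova, iova + size) ends at or below the
# start of the reserved range, and "overlaps" otherwise.
#   $1: IOVA   $2: size in bytes
check_mapping() {
  local end=$(( $1 + $2 ))
  if [ "$end" -le $(( reserved_start )) ]; then
    echo fits
  else
    echo overlaps
  fi
}

# IOVA and size from the vfio_dma_map() error above (-m 1035265).
check_mapping 0x100000000 0xfc00100000
printf 'mapping ends at 0x%x\n' $(( 0x100000000 + 0xfc00100000 ))
```

The mapping ends at 0xfd00100000, i.e. 1 MiB past the start of the reserved range, which is consistent with a limit exactly 1 MiB lower.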

9) Verify no error with memory size within limit:
(the VM proceeds to hit an out-of-memory, expected)

 $ sudo qemu-system-x86_64 -nographic -device vfio-pci,host=$PCI -m 1035264
 [ 189.172048] Out of memory: Killed process 900 (qemu-system-x86) total-vm:1060941712kB, anon-rss:3760888kB, file-rss:2704kB, shmem-rss:0kB, UID:0 pgtables:7528kB oom_score_adj:0
 Killed

10) Similarly with libvirt:

 cat >vm.xml <<EOF
 <domain type='qemu'>
   <name>vm</name>

   <os>
     <type arch='x86_64' machine='q35'>hvm</type>
   </os>

   <memory unit='GiB'>1011</memory>

   <devices>
     <hostdev mode='subsystem' type='pci' managed='yes'>
       <driver name='vfio'/>
       <source>
         <address domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
       </source>
     </hostdev>
   </devices>

 </domain>
 EOF

 $ virsh create vm.xml
 error: Failed to create domain from vm.xml
 error: internal error: qemu unexpectedly closed the monitor: 2023-03-02T13:08:34.125944Z qemu-system-x86_64: -device vfio-pci,host=0000:00:03.0,id=hostdev0,bus=pci.3,addr=0x1: VFIO_MAP_DMA failed: Invalid argument
 2023-03-02T13:08:34.157967Z qemu-system-x86_64: -device vfio-pci,host=0000:00:03.0,id=hostdev0,bus=pci.3,addr=0x1: vfio 0000:00:03.0: failed to setup container for group 2: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x5651a01a1d40, 0x100000000, 0xfc40000000, 0x7ec3e1400000) = -22 (Invalid argument)

 $ sed s/1011/1010/ -i vm.xml

 $ virsh create vm.xml
 [ 1759.875400] Out of memory: Killed process 2989 (qemu-system-x86) total-vm:1059950088kB, anon-rss:3728948kB, file-rss:2576kB, shmem-rss:0kB, UID:64055 pgtables:7472kB oom_score_adj:0
 error: Failed to create domain from vm.xml
 error: internal error: qemu unexpectedly closed the monitor
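
The limit differs between step 8 (1035264 MiB, i.e. 1011 GiB) and step 10 (1010 GiB). The logged mapping sizes suggest why; this is an inference from the logs, not something the errors state: the failing mapping starts at 0x100000000 (4 GiB) and covers all guest RAM except a part that is presumably placed below 4 GiB, and that part differs between the two runs (step 8 passes no -machine option, step 10 uses q35). A sketch of the arithmetic, with illustrative variable names:

```shell
gib=$(( 1 << 30 ))
mib=$(( 1 << 20 ))
reserved_start=0xfd00000000   # from step 5
iova=0x100000000              # from both vfio_dma_map() errors

# RAM not covered by the failing mapping = total RAM - logged mapping size.
low_step8=$(( 1035265 * mib - 0xfc00100000 ))   # step 8: -m 1035265 (MiB)
low_step10=$(( 1011 * gib - 0xfc40000000 ))     # step 10: 1011 GiB

# Largest total that keeps the mapping below the reserved range.
max_step8_mib=$(( (reserved_start - iova + low_step8) / mib ))
max_step10_gib=$(( (reserved_start - iova + low_step10) / gib ))

echo "step 8:  $(( low_step8 / gib )) GiB outside the mapping, limit ${max_step8_mib} MiB"
echo "step 10: $(( low_step10 / gib )) GiB outside the mapping, limit ${max_step10_gib} GiB"
```

The computed limits (1035264 MiB and 1010 GiB) match the largest sizes that got past VFIO_MAP_DMA in steps 9 and 10.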