Hi All,
I have a number of KVM servers hosting a few dozen VMs. I try to keep the CPUs and hardware as similar as possible, but recently I added a 5800X-based server running Ubuntu 22 LTS to the mix. All was working well, but sometime around the latest kernel upgrade, migrations between the 5800X-based server and the 3700X-based servers started to fail, with the guest freezing or crashing. I set up a pair of test servers in the back, both on Ubuntu 22 LTS, one on a Ryzen 2700 and one on a Ryzen 5800X. I can migrate guest VMs TO the 5800X without issue, and they run just fine. But as soon as I try to migrate a VM back to the Ryzen 2700, the guest freezes or crashes once it has fully migrated.
For example, on the 5800X-based server I do a live migration with virsh migrate (command and output below),
which all looks good and fine,
but on the destination machine (virtboxtest2), the resulting VM is frozen. In virt-manager, its CPU is pegged at 100%. On my FreeBSD test VM, I eventually get a panic if I touch the console.
The results are the same if I don't do a live migration.
I don't see much in the syslog of the system that is receiving the guest VM.
Not sure if that is normal or not? Like I said, it was working OK a few weeks ago.
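Since the failure only shows up when going from the newer CPU to the older one, one thing that might help narrow it down is comparing the CPU feature flags the two hosts expose. Below is a minimal sketch, assuming the contents of /proc/cpuinfo from each host have been saved as text; the function names are my own (hypothetical), and a difference in flags would only be a hint, not proof of the cause.

```python
def parse_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text.

    Only the first "flags" line is used; on a typical single-socket host
    every logical CPU reports the same list.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_destination(source_text, destination_text):
    """List flags the source host exposes that the destination host lacks."""
    return sorted(parse_flags(source_text) - parse_flags(destination_text))
```

To use it, read the saved /proc/cpuinfo text from the 5800X host and the 2700 host into two strings and call `missing_on_destination(source, destination)`. Any flag it returns is one a guest started on the newer CPU may have seen but can no longer rely on after moving to the older host.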
Code:
$ virsh migrate --verbose --unsafe --live 13Test qemu+ssh://virtboxtest2/system
Migration: [100 %]
$
Code:
Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff8108c2ff
stack pointer = 0x28:0xfffffe00a80f19b8
frame pointer = 0x28:0xfffffe00a80f19b8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 679 (sendmail)
trap number = 9
timeout stopping cpus
panic: general protection fault
cpuid = 0
time = 1653511702
KDB: stack backtrace:
#0 0xffffffff80c69465 at kdb_backtrace+0x65
#1 0xffffffff80c1bb1f at vpanic+0x17f
#2 0xffffffff80c1b993 at panic+0x43
#3 0xffffffff810afdf5 at trap_fatal+0x385
#4 0xffffffff81087528 at calltrap+0x8
#5 0xffffffff8108c775 at restore_fpu_curthread+0x85
#6 0xffffffff810854f1 at done_load_dr+0x31
#7 0xffffffff80c28892 at mi_switch+0xc2
#8 0xffffffff80c790e6 at sleepq_catch_signals+0x2e6
#9 0xffffffff80c79312 at sleepq_timedwait_sig+0x12
#10 0xffffffff80bafbdd at _cv_timedwait_sig_sbt+0x10d
#11 0xffffffff80c8ab35 at seltdwait+0x75
#12 0xffffffff80c8a62e at kern_select+0x92e
#13 0xffffffff80c8a9d6 at sys_select+0x56
#14 0xffffffff810b06ec at amd64_syscall+0x10c
#15 0xffffffff81087e3b at fast_syscall_common+0xf8
Uptime: 4m8s
Code:
virtboxtest3:~$ virsh migrate --verbose 13Test qemu+ssh://virtboxtest2/system
Migration: [100 %]
virtboxtest3:~$
Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff8108c2ff
stack pointer = 0x28:0xfffffe0068d469e8
frame pointer = 0x28:0xfffffe0068d469e8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 582 (syslogd)
trap number = 9
timeout stopping cpus
panic: general protection fault
cpuid = 0
time = 1653512644
KDB: stack backtrace:
#0 0xffffffff80c69465 at kdb_backtrace+0x65
#1 0xffffffff80c1bb1f at vpanic+0x17f
#2 0xffffffff80c1b993 at panic+0x43
#3 0xffffffff810afdf5 at trap_fatal+0x385
#4 0xffffffff81087528 at calltrap+0x8
#5 0xffffffff8108c775 at restore_fpu_curthread+0x85
#6 0xffffffff810854f1 at done_load_dr+0x31
#7 0xffffffff80c28892 at mi_switch+0xc2
#8 0xffffffff80c790e6 at sleepq_catch_signals+0x2e6
#9 0xffffffff80c78de9 at sleepq_wait_sig+0x9
#10 0xffffffff80baf75c at _cv_wait_sig+0xec
#11 0xffffffff80c8ab5d at seltdwait+0x9d
#12 0xffffffff80c8a62e at kern_select+0x92e
#13 0xffffffff80c8a9d6 at sys_select+0x56
#14 0xffffffff810b06ec at amd64_syscall+0x10c
#15 0xffffffff81087e3b at fast_syscall_common+0xf8
Uptime: 15m6s
Automatic reboot in 15 seconds - press a key on the console to abort
Code:
May 25 21:09:11 virtboxtest2 kernel: [1230622.159602] audit: type=1400 audit(1653512951.197:164): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-a51b968e-424e-4f20-a60b-9f332529b90a" pid=51349 comm="apparmor_parser"
May 25 21:09:11 virtboxtest2 kernel: [1230622.276846] audit: type=1400 audit(1653512951.313:165): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-a51b968e-424e-4f20-a60b-9f332529b90a" pid=51352 comm="apparmor_parser"
May 25 21:09:11 virtboxtest2 kernel: [1230622.406953] audit: type=1400 audit(1653512951.441:166): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-a51b968e-424e-4f20-a60b-9f332529b90a" pid=51356 comm="apparmor_parser"
May 25 21:09:11 virtboxtest2 kernel: [1230622.536461] audit: type=1400 audit(1653512951.573:167): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-a51b968e-424e-4f20-a60b-9f332529b90a" pid=51360 comm="apparmor_parser"
May 25 21:09:11 virtboxtest2 networkd-dispatcher[911]: WARNING:Unknown index 22 seen, reloading interface list
May 25 21:09:11 virtboxtest2 systemd-udevd[51364]: Using default interface naming scheme 'v249'.
May 25 21:09:11 virtboxtest2 kernel: [1230622.694004] audit: type=1400 audit(1653512951.729:168): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-a51b968e-424e-4f20-a60b-9f332529b90a" pid=51374 comm="apparmor_parser"
May 25 21:09:11 virtboxtest2 systemd[1]: Started Virtual Machine qemu-15-13Test.
May 25 21:09:12 virtboxtest2 systemd-networkd[884]: macvtap13: Link UP
May 25 21:09:12 virtboxtest2 systemd-networkd[884]: macvtap13: Gained carrier
May 25 21:09:12 virtboxtest2 kernel: [1230623.752178] audit: type=1400 audit(1653512952.789:169): apparmor="DENIED" operation="open" profile="libvirt-a51b968e-424e-4f20-a60b-9f332529b90a" name="/etc/ssl/openssl.cnf" pid=51377 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=64055 ouid=0
May 25 21:09:13 virtboxtest2 systemd[1]: session-397.scope: Deactivated successfully.
May 25 21:09:14 virtboxtest2 systemd-networkd[884]: macvtap13: Gained IPv6LL
May 25 21:09:14 virtboxtest2 ModemManager[48708]: <info> [base-manager] couldn't check support for device '/sys/devices/pci0000:00/0000:00:01.3/0000:03:00.2/0000:20:01.0/0000:23:00.0': not supported by any plugin
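Most of the syslog excerpt above is AppArmor audit status noise, so a small filter that keeps only the denials may make it easier to read. This is a minimal sketch, assuming the log is available as plain text; the function names and the regular expression are my own (hypothetical) and only handle the key=value layout seen in the lines above.

```python
import re

# Matches key=value and key="quoted value" pairs as they appear in the
# kernel audit lines above.
FIELD_RE = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def parse_audit_fields(line):
    """Return a dict of the key=value fields found in one log line."""
    fields = {}
    for key, raw, quoted in FIELD_RE.findall(line):
        # Use the unquoted text when the value was wrapped in double quotes.
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


def apparmor_denials(log_text):
    """Return (comm, operation, name) for every apparmor="DENIED" line."""
    denials = []
    for line in log_text.splitlines():
        fields = parse_audit_fields(line)
        if fields.get("apparmor") == "DENIED":
            denials.append(
                (fields.get("comm"), fields.get("operation"), fields.get("name"))
            )
    return denials
```

Run against the excerpt above, the only denial is qemu-system-x86 being refused an open of /etc/ssl/openssl.cnf. Whether that has anything to do with the guest freezing is not established here.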