I'm trying to use QEMU's record/replay feature to capture and deterministically replay a Linux guest execution. The recording phase completes successfully and generates a .bin file as expected. However, when I attempt to replay, the guest system hangs during the Linux boot process.
How can I correctly use QEMU's record/replay functionality? Is there a minimal working example or scenario that reliably demonstrates it?
Host OS: Windows 11 (via WSL2)
QEMU version: 10.0.2
Guest kernel: Linux 6.1.5
Root filesystem: BusyBox 1.36.1 (converted to qcow2)
...
Please press Enter to activate this console.
Record command:
./qemu-system-riscv64 -machine virt \
-smp 1 \
-nographic \
-icount shift=auto,rr=record,rrfile=replay.bin \
-bios path/to/opensbi-riscv64-generic-fw_dynamic.bin \
-kernel path/to/Image \
-append "root=/dev/vda rw console=ttyS0" \
-blockdev driver=file,filename=path/to/rootfs.qcow2,node-name=img-file \
-blockdev driver=qcow2,file=img-file,node-name=img-qcow \
-blockdev driver=blkreplay,image=img-qcow,node-name=hd0 \
-device virtio-blk-device,drive=hd0 \
-net none \
-audio none
Replay command:
./qemu-system-riscv64 -machine virt \
-smp 1 \
-nographic \
-icount shift=auto,rr=replay,rrfile=replay.bin \
-bios path/to/opensbi-riscv64-generic-fw_dynamic.bin \
-kernel path/to/Image \
-append "root=/dev/vda rw console=ttyS0" \
-blockdev driver=file,filename=path/to/rootfs.qcow2,node-name=img-file \
-blockdev driver=qcow2,file=img-file,node-name=img-qcow \
-blockdev driver=blkreplay,image=img-qcow,node-name=hd0 \
-device virtio-blk-device,drive=hd0 \
-net none \
-audio none
With qemu-system-riscv64, replay gets stuck at:
...
No soundcards found.
With qemu-system-aarch64, replay hangs at:
...
Please press Enter to activate this console.
What might be causing this issue? Any suggestions, examples, or guidance would be greatly appreciated.
There are minimal examples run by the check-tcg and functional test suites.
The very simple check-tcg tests just run to completion under record and then replay the recording:
➜ cd tests/tcg/aarch64-softmmu/
➜ make run-memory-replay V=1
timeout -s KILL --foreground 120 /home/alex/lsrc/qemu.git/builds/all/qemu-system-aarch64 -monitor none -display none -chardev file,path=memory-record.out,id=output -icount shift=5,rr=record,rrfile=record.bin -M virt -cpu max -display none -semihosting-config enable=on,target=native,chardev=output -kernel memory
timeout -s KILL --foreground 120 /home/alex/lsrc/qemu.git/builds/all/qemu-system-aarch64 -monitor none -display none -chardev file,path=memory-replay.out,id=output -icount shift=5,rr=replay,rrfile=record.bin -M virt -cpu max -display none -semihosting-config enable=on,target=native,chardev=output -kernel memory
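The two commands above write the guest's semihosting output to memory-record.out and memory-replay.out. One way to sanity-check a replay by hand is to compare those two files. The helper below is only a sketch: `compare_rr_output` is a hypothetical name, not part of QEMU or its test suite, and it assumes that byte-identical output is the criterion you care about.

```shell
#!/bin/sh
# Hypothetical helper (not part of QEMU): report whether a replay run
# produced the same guest output as the recording run.
compare_rr_output() {
    record_out=$1   # e.g. memory-record.out
    replay_out=$2   # e.g. memory-replay.out

    # Both files must exist before they can be compared.
    for f in "$record_out" "$replay_out"; do
        if [ ! -f "$f" ]; then
            echo "missing output file: $f"
            return 2
        fi
    done

    # cmp -s prints nothing and exits 0 only when the files are byte-identical.
    if cmp -s "$record_out" "$replay_out"; then
        echo "replay output matches record output"
    else
        echo "replay output differs from record output"
        return 1
    fi
}
```

After both runs have finished, it could be called as `compare_rr_output memory-record.out memory-replay.out`. A non-zero exit status means the outputs differ or one of the files is missing.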
The functional test suite has basic replay tests for most architectures:
➜ ./pyvenv/bin/meson test --setup thorough func-aarch64-replay func-arm-replay func-x86_64-replay func-ppc64-replay func-i386-replay func-ppc-replay
ninja: Entering directory `/home/alex/lsrc/qemu.git/builds/all'
[13/13] Linking target qemu-system-aarch64
1/6 qemu:func-thorough+func-i386-thorough+thorough / func-i386-replay OK 5.63s 1 subtests passed
2/6 qemu:func-thorough+func-x86_64-thorough+thorough / func-x86_64-replay OK 9.03s 1 subtests passed
3/6 qemu:func-thorough+func-aarch64-thorough+thorough / func-aarch64-replay OK 16.34s 1 subtests passed
4/6 qemu:func-thorough+func-ppc64-thorough+thorough / func-ppc64-replay OK 20.17s 2 subtests passed
5/6 qemu:func-thorough+func-arm-thorough+thorough / func-arm-replay OK 21.49s 3 subtests passed
6/6 qemu:func-thorough+func-ppc-thorough+thorough / func-ppc-replay OK 22.70s 2 subtests passed
See the test code for how those are put together. However, there are still some places where record/replay falls over, so the subsystem still needs some love.