pkit commented Dec 6, 2025

During prepare_save we must unconditionally trigger an interrupt to ensure the guest gets notified after restore. The guest may have suppressed notifications, but after snapshot/restore it needs to be woken up regardless.

Fixes #5554
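
The behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `MockQueue`, its fields, `complete_request` and `send_interrupt` are hypothetical names invented for illustration and do not mirror Firecracker's actual types; only the name `prepare_save` comes from the description. It shows a completion whose interrupt is skipped because the guest suppressed notifications, and a save path that signals regardless.

```rust
/// Hypothetical stand-in for a virtio queue's interrupt path.
/// Not Firecracker's real data structure.
#[derive(Default)]
struct MockQueue {
    /// True when the guest asked the device not to send notifications.
    guest_suppressed_notifications: bool,
    /// Completions the guest has not yet been told about.
    unsignalled_completions: u32,
    /// Count of interrupts delivered so far (for illustration only).
    interrupts_sent: u32,
}

impl MockQueue {
    /// Normal completion path: honour the guest's suppression flag.
    fn complete_request(&mut self) {
        self.unsignalled_completions += 1;
        if !self.guest_suppressed_notifications {
            self.send_interrupt();
        }
    }

    /// Deliver one interrupt covering everything completed so far.
    fn send_interrupt(&mut self) {
        self.interrupts_sent += 1;
        self.unsignalled_completions = 0;
    }

    /// Snapshot path (assumed shape): signal unconditionally, ignoring
    /// the suppression flag, so the restored guest is always woken up.
    fn prepare_save(&mut self) {
        self.send_interrupt();
    }
}
```

In this model, a queue with suppression enabled accumulates unsignalled completions until `prepare_save` runs, after which the counter is cleared and exactly one interrupt has been delivered. A spurious interrupt is harmless to a guest driver, whereas a missing one leaves it waiting, which is the design trade-off the fix relies on.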

Changes

Fixes a bug where the guest would hang indefinitely waiting for interrupts after resume.

Reason

See above.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkbuild --all to verify that the PR passes
    build checks on all supported architectures.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

pkit commented Dec 6, 2025

I will add some tests soon.

pkit commented Dec 8, 2025

@dobrac I have improved your tests to iterate until the pending-ops condition is reproduced.
It now reproduces quite reliably in under 10 iterations for me locally.

codecov bot commented Dec 8, 2025

Codecov Report

❌ Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.23%. Comparing base (d130c7d) to head (4671013).
⚠️ Report is 2 commits behind head on main.

⚠️ Current head 4671013 differs from pull request most recent head 97d23a9

Please upload reports for the commit 97d23a9 to get more accurate results.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...vmm/src/devices/virtio/block/virtio/io/async_io.rs | 33.33% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5568      +/-   ##
==========================================
- Coverage   83.24%   83.23%   -0.01%     
==========================================
  Files         277      277              
  Lines       29263    29268       +5     
==========================================
+ Hits        24359    24362       +3     
- Misses       4904     4906       +2     
Flag Coverage Δ
5.10-m5n.metal 83.57% <77.77%> (-0.01%) ⬇️
5.10-m6a.metal 82.91% <77.77%> (-0.01%) ⬇️
5.10-m6g.metal 80.19% <77.77%> (-0.01%) ⬇️
5.10-m6i.metal 83.57% <77.77%> (-0.01%) ⬇️
5.10-m7a.metal-48xl 82.90% <77.77%> (-0.01%) ⬇️
5.10-m7g.metal 80.18% <77.77%> (-0.01%) ⬇️
5.10-m7i.metal-24xl 83.54% <77.77%> (-0.01%) ⬇️
5.10-m7i.metal-48xl 83.54% <77.77%> (-0.01%) ⬇️
5.10-m8g.metal-24xl 80.17% <77.77%> (-0.02%) ⬇️
5.10-m8g.metal-48xl 80.18% <77.77%> (-0.01%) ⬇️
6.1-m5n.metal 83.60% <77.77%> (-0.01%) ⬇️
6.1-m6a.metal 82.94% <77.77%> (-0.02%) ⬇️
6.1-m6g.metal 80.18% <77.77%> (-0.01%) ⬇️
6.1-m6i.metal 83.60% <77.77%> (-0.01%) ⬇️
6.1-m7a.metal-48xl 82.93% <77.77%> (-0.01%) ⬇️
6.1-m7g.metal 80.18% <77.77%> (-0.02%) ⬇️
6.1-m7i.metal-24xl 83.62% <77.77%> (-0.01%) ⬇️
6.1-m7i.metal-48xl 83.61% <77.77%> (-0.02%) ⬇️
6.1-m8g.metal-24xl 80.18% <77.77%> (-0.01%) ⬇️
6.1-m8g.metal-48xl 80.18% <77.77%> (-0.01%) ⬇️

During prepare_save we must unconditionally trigger an interrupt
to ensure the guest gets notified after restore.
The guest may have suppressed notifications, but
after snapshot/restore it needs to be woken up regardless.

Fixes firecracker-microvm#5554

Signed-off-by: Constantine Peresypkin <pconstantine@gmail.com>

pkit commented Dec 8, 2025

Codecov's idea of "coverage" seems incorrect here. Flagging a debug print is not the best use of coverage checks, so I have ignored it.

pkit commented Dec 9, 2025

@bchalios @kalyazin I'm not sure how to kick off the Codecov pass again; other than that, this one should be ready.

Development

Successfully merging this pull request may close these issues.

[Bug] When using Async IO Engine pending ops cause resume to freeze