Skip to content

Backport of v51.2 fixes (CVE-2026-45782)#155

Merged
phip1611 merged 10 commits into
cyberus-technology:gardenlinuxfrom
phip1611:gardenlinx-backport-fixes
May 19, 2026
Merged

Backport of v51.2 fixes (CVE-2026-45782)#155
phip1611 merged 10 commits into
cyberus-technology:gardenlinuxfrom
phip1611:gardenlinx-backport-fixes

Conversation

@phip1611
Copy link
Copy Markdown
Member

@phip1611 phip1611 commented May 15, 2026

This is a backport of cloud-hypervisor#8221, fixing GHSA-f47p-p25q-83rh / CVE-2026-45782.

I needed minor adjustments to cherry-pick everything. Further, these are 10 instead of 12 commits as I incorporated cargo fmt and cargo clippy changes into the commits that introduced the problems to satisfy the CI checks.

libvirt Pipeline: https://gitlab.cyberus-technology.de/cyberus/cloud/libvirt/-/merge_requests/201/pipelines

@phip1611 phip1611 force-pushed the gardenlinx-backport-fixes branch from f027b99 to 083e85a Compare May 15, 2026 07:02
@phip1611 phip1611 self-assigned this May 15, 2026
@phip1611 phip1611 changed the title xxx net fd debugging Backport of v51.2 fixes (CVE-2026-45782) May 15, 2026
@olivereanderson
Copy link
Copy Markdown

I see that commits ff877f1 and 8b5f921 from cloud-hypervisor#8221 are not listed as commits in your backport. Is that intentional?

@phip1611
Copy link
Copy Markdown
Member Author

phip1611 commented May 15, 2026

8b5f92116370165bae04e2793ee3dd6216574dd9 is not there as I incorporated the changes into the earlier commits where the problems were introduced. This is a consequence of my typical backport flow where I make sure that all checks pass for every commit

Regarding the other one: Let me double-check - thanks. Very well spotted!

@arctic-alpaca
Copy link
Copy Markdown

While that's not strictly required, this part of the changes to vmm/src/serial_manager.rs from cloud-hypervisor@31f3007 is missing too.

@phip1611
Copy link
Copy Markdown
Member Author

While that's not strictly required, this part of the changes to vmm/src/serial_manager.rs from cloud-hypervisor@31f3007 is missing too.

probably a rebase glitch because our patchset touched that part of the code as well :/ thanks for the careful review!

posk-io and others added 10 commits May 15, 2026 18:19
Signed-off-by: Peter Oskolkov <posk@google.com>
(cherry picked from commit f77c6ef)
Signed-off-by: Peter Oskolkov <posk@google.com>
(cherry picked from commit 21bd3ae)
Signed-off-by: Peter Oskolkov <posk@google.com>
(cherry picked from commit b5053ae)
A buggy or malicious guest may write an inappropriate value into
virtqueue's next_avail field. This will result in an error
when iterating over the queue:

https://github.com/rust-vmm/vm-virtio/blob/863837ef863f6880bb8357e60bbac49e72c0844c/virtio-queue/src/queue.rs#L708

but this error is (logged and) ignored if pop_descriptor_chain()
is used:

https://github.com/rust-vmm/vm-virtio/blob/863837ef863f6880bb8357e60bbac49e72c0844c/virtio-queue/src/queue.rs#L583

A reasonable approach, implemented here, is to mark the device as
NEEDS_RESET and ignore further queue events until the guest
reinitializes the device.

How this patch was tested:

Linux kernel was patched to trigger a bad next_avail when the
virtqueue queue counter reaches 5000:

--------------- START OF LINUX KERNEL PATCH ----------
$ git diff
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index b784aab668670..989f2a0c64a77 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -15,6 +15,9 @@
 #include <linux/spinlock.h>
 #include <xen/xen.h>

+
+void virtqueue_kick_always(struct virtqueue *vq);
+
 #ifdef DEBUG
 /* For development, we want to crash whenever the ring is screwed. */
 #define BAD_RING(_vq, fmt, args...)                            \
@@ -677,6 +680,12 @@ static inline int virtqueue_add_split(
                   struct virtqueue *_vq,
         * new available array entries. */
        virtio_wmb(vq->weak_barriers);
        vq->split.avail_idx_shadow++;
+       {
+        if ((vq->split.avail_idx_shadow % 100) == 0)
+            printk(KERN_ERR "avail idx: %d",
+                  (int)vq->split.avail_idx_shadow);
+               if (vq->split.avail_idx_shadow == 5000)
+               vq->split.avail_idx_shadow = 0;
+       }
        vq->split.vring.avail->idx = cpu_to_virtio16(_vq->vdev,
                                      vq->split.avail_idx_shadow);
        vq->num_added++;
@@ -689,6 +698,11 @@ static inline int virtqueue_add_split(
                  struct virtqueue *_vq,
        if (unlikely(vq->num_added == (1 << 16) - 1))
                virtqueue_kick(_vq);

+       {
+               if (unlikely(vq->split.avail_idx_shadow == 0))
+                       virtqueue_kick_always(_vq);
+       }
+
        return 0;

 unmap_release:
@@ -2515,6 +2529,11 @@ bool virtqueue_kick(struct virtqueue *vq)
 }
 EXPORT_SYMBOL_GPL(virtqueue_kick);

+void virtqueue_kick_always(struct virtqueue *vq)
+{
+       virtqueue_kick_prepare(vq);
+       virtqueue_notify(vq);
+}
 /**
  * virtqueue_get_buf_ctx - get the next used buffer
  * @_vq: the struct virtqueue we're talking about.
--------------- END OF LINUX KERNEL PATCH ----------

Then the kernel was booted, and the host pinged until the
nic became unresponsive:

ping -i 0.002 192.168.4.1

Device status was confirmed using

cat /sys/class/net/eth0/device/status

(it was 0x4f).

Then the device was re-initialized:

DEV_NAME=$(basename $(readlink -f /sys/class/net/eth0/device))
echo $DEV_NAME | tee /sys/bus/virtio/drivers/virtio_net/unbind
echo $DEV_NAME | tee /sys/bus/virtio/drivers/virtio_net/bind
ip link set eth0 up

At this point networking became healthly again.

Signed-off-by: Peter Oskolkov <posk@google.com>
(cherry picked from commit 563303b)
Signed-off-by: Peter Oskolkov <posk@google.com>
(cherry picked from commit 8b60b38)
A malicious or buggy guest can violate virtio by making the same
descriptor head available twice before the first chain has been placed
on the used ring. The submit path pushed both chains onto the
VecDeque-backed inflight_requests keyed by head_index, and on completion
find_inflight_request() returned the first linear match. That Request's
complete_async() freed its bounce buffer while the other chain's
io_uring op was still targeting it, producing a use-after-free the
kernel could then scribble into.

Signed-off-by: Dylan Reid <dgreid@fb.com>
(cherry picked from commit 544fa4a)
For non-batch backends execute_async submits the kernel I/O inline
before returning. An early return while processing before inserting in
inflight_requests, meant the request went untracked, the local batch
list was never appended to inflight_requests, even though the request is
pending in the kernel.

To track it, insert into self.inflight_requests as soon as execute_async
returns Ok. The completion path's find_inflight_request now matches the
orphan and the bounce buffer is freed only after the kernel signals it
is done.

Signed-off-by: Dylan Reid <dgreid@fb.com>
(cherry picked from commit fa8acbd)
The bounce buffer for an unaligned descriptor was allocated in
execute_async and leaked on error paths, even though, for the sync
case the kernel already had a pointer to the buffer.

Clean this up by moving ownership of the buffer to the
AlignedOperation type. To make it actually safe, stop stashing a
guest memory pointer for the duration of the op. Instead, save the
guest address and pass guest memory back to the complete function.

Signed-off-by: Dylan Reid <dgreid@fb.com>
(cherry picked from commit 1b8c92dd5e3c0c58316826486ce5ee30eeb71407))
[backport: adapted to stable/v51.x; v51.x has no block/src/request.rs
 split, so the new aligned_operation module is added next to
 block/src/lib.rs and the in tree struct, alloc, free path is
 replaced in place.]
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
submit_batch_requests pushed each BatchRequest into the io_uring SQ in
turn and used `?` to bail on the first push failure.
Leaving the initial SQEs visible to the kernel — but submitter.submit()
was never called, and every other call site in this file gates submit()
behind a preceding sq.push() that now also fails on the full ring.

This could allow a guest to DoS it's own queue or worse if the buffer is
freed early.

Signed-off-by: Dylan Reid <dgreid@fb.com>
(cherry picked from commit ee315d2)
Signed-off-by: Bo Chen <bchen@crusoe.ai>
@phip1611 phip1611 force-pushed the gardenlinx-backport-fixes branch from 083e85a to 93eee38 Compare May 15, 2026 16:23
@phip1611
Copy link
Copy Markdown
Member Author

Hm, I exercised the cherry-pick one more time and now it shows the diff https://github.com/cyberus-technology/cloud-hypervisor/compare/083e85a1bb0e29bec5c3120220548e93a855f3ad..93eee386f01c3846c6fac49e1d17d58a20597959

Not sure what went wrong the first time..

@arctic-alpaca
Copy link
Copy Markdown

While that's not strictly required, this part of the changes to vmm/src/serial_manager.rs from cloud-hypervisor@31f3007 is missing too.

This seems to still be missing.

Do you know why we don't have the driver_awake field in NetEpollHandler in cloud-hypervisor/virtio-devices/src/net.rs? I couldn't find where it was removed.

@phip1611
Copy link
Copy Markdown
Member Author

phip1611 commented May 18, 2026

Do you know why we don't have the driver_awake field in NetEpollHandler in cloud-hypervisor/virtio-devices/src/net.rs? I couldn't find where it was removed.

This was @Coffeeri :

Copy link
Copy Markdown

@olivereanderson olivereanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for back porting the back port 🙂

@phip1611
Copy link
Copy Markdown
Member Author

phip1611 commented May 18, 2026

@arctic-alpaca is there anything missing? I did the chery-pick twice already just to be sure :). Since we did changes to the serial manager ourselves, that might be the reason why the diff looks different.

If you still think I'm missing something, I'm happy to look into this together. WDYT?

@arctic-alpaca
Copy link
Copy Markdown

vmm/src/serial_manager.rs:

-                // If running on an interactive TTY then accept input
-                // SAFETY: trivially safe
-                if unsafe { libc::isatty(libc::STDIN_FILENO) == 1 } =>
-            {
+            // If running on an interactive TTY then accept input.
+            // SAFETY: trivially safe
+           ConsoleOutput::Tty(_) if unsafe { libc::isatty(libc::STDIN_FILENO) == 1 } => {

AFAICT, this is (roughly) the patch that's missing from this PR atm.

@olivereanderson
Copy link
Copy Markdown

vmm/src/serial_manager.rs:

-                // If running on an interactive TTY then accept input
-                // SAFETY: trivially safe
-                if unsafe { libc::isatty(libc::STDIN_FILENO) == 1 } =>
-            {
+            // If running on an interactive TTY then accept input.
+            // SAFETY: trivially safe
+           ConsoleOutput::Tty(_) if unsafe { libc::isatty(libc::STDIN_FILENO) == 1 } => {

AFAICT, this is (roughly) the patch that's missing from this PR atm.

This is just a matter of formatting no?

@arctic-alpaca
Copy link
Copy Markdown

This is just a matter of formatting no?

The comment is moved above the match arm. As I said before, that's not strictly required, but an omitted change from the backported commits. I'm fine with omitting this on purpose, but I don't want to skip any part by accident.

Copy link
Copy Markdown

@arctic-alpaca arctic-alpaca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving so you're not blocked on me approving after you make a decision.

Copy link
Copy Markdown

@arctic-alpaca arctic-alpaca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...

Now actually proving so you're not blocked on me approving after you make a decision.

@phip1611
Copy link
Copy Markdown
Member Author

phip1611 commented May 18, 2026

I'm sure we can merge the PR as is right now. I carefully cherry-picked every commit and made sure our existing changes and the incoming changes are compatible. As I incorporated the cargo style and clippy fixes into the new commits directly, and as we've backported some v52 changes to the v51 branch, some changes can't be cherry-picked as naturally as one would expect.

Regarding the serial_manager.rs: the code of that function has changed non-trivially from v51 to v52. So I can't adapt the changes to our code base and clippy is still fine with it. I hope I didn't miss anything - let's continue 🚀

I'll wait for one more pipeline run, tho.

@phip1611
Copy link
Copy Markdown
Member Author

Pipeline is green 🚀

@phip1611 phip1611 merged commit f833b65 into cyberus-technology:gardenlinux May 19, 2026
12 checks passed
@phip1611 phip1611 deleted the gardenlinx-backport-fixes branch May 19, 2026 05:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants