
Bump patchset: v51 -> v52 #153

Merged
phip1611 merged 178 commits into cyberus-technology:gardenlinux-next-v52-base from phip1611:gardenlinux-next-v52
May 20, 2026

Conversation

@phip1611
Member

@phip1611 phip1611 commented Apr 30, 2026

This series bumps the gardenlinux Cloud Hypervisor patchset onto the recently released v52.

You can find an overview of the difficulties during the rebase in this outline document (trivial patches, hard to rebase patches, patches that are now upstream...).

Of the ~250 commits in the current gardenlinux branch, ~170 remain.

Changes & Hints for Reviewers

  • libvirt pipeline run: https://gitlab.cyberus-technology.de/cyberus/cloud/libvirt/-/merge_requests/194/pipelines
  • The original commits that remain keep the same name as in the old gardenlinux branch
    • with a few minor exceptions to streamline commits that belong to a series
  • I reordered the patchset quite significantly: small standalone commits are mostly moved to the beginning where it makes sense, followed by larger series
    • The only exception is TLS, as I expected TLS to be merged before the v52 release
  • All commits of a series were consolidated, moved together, and sometimes even squashed (init A -> ... -> fix A commits were squashed)
    • This especially applies to the CpuProfile-related commits
  • For example, the whole CPU Profiles effort is now a single commit series at the end of our patchset
  • This was by far the toughest patchset rebase we had so far
  • Beware: I am fairly sure that I missed some minor changes from our gardenlinux branch during the rebase, for example an error message improvement, but nothing major. This follows from the complexity of the operation.
  • Changes I had to do against upstream to work with our stack:
    • rename pci_device_id from upstream back to bdf_device to be compatible with us
    • remove mutual TLS (mTLS) (use normal TLS)
    • Command::CompletePaused from upstream (NEW) clashes with Command::KeepAlive (our fork)
  • I updated the flake inputs (nixpkgs, rust tooling) and merged it with the flake: init commit
  • Included backports (from upstream after v52)

The result is a shorter and more reviewable branch than
cyberus-github/gardenlinux while preserving the relevant Gardenlinux behavior
on top of the current Cloud Hypervisor base.

Ticket: https://github.com/cobaltcore-dev/cobaltcore/issues/503#issuecomment-4311454443

@phip1611 phip1611 self-assigned this Apr 30, 2026
@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch from bc2452a to 1a41fef Compare April 30, 2026 09:21
@phip1611
Member Author

@olivereanderson please take a brief look. I grouped all your commits and brought them into consecutive order. Once cloud-hypervisor#8029 is merged - what are the implications for our fork? What is your recommendation to keep the patchset working and maintainable? What are your thoughts and ideas?

@olivereanderson

> @olivereanderson please take a brief look. I grouped all your commits and brought them into consecutive order. Once cloud-hypervisor#8029 is merged - what are the implications for our fork? What is your recommendation to keep the patchset working and maintainable? What are your thoughts and ideas?

I plan to backport cloud-hypervisor#8029 as soon as it is merged because the code is simply better.

@phip1611
Member Author

If possible, I'd prefer to not merge (or backport) anything into gardenlinux before we finish this. But we can plan this together next week as well!

@olivereanderson

> If possible, I'd prefer to not merge (or backport) anything into gardenlinux before we finish this. But we can plan this together next week as well!

We can definitely merge this PR (v52) first. Let's discuss further next week 🙂

@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch 7 times, most recently from 768a632 to 7ddbe2c Compare May 5, 2026 06:17
@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch 5 times, most recently from 892081b to c5d2ed1 Compare May 12, 2026 08:45
@phip1611
Member Author

All normal tests and the live migration tests are passing locally! 🥳 Pipeline is running! https://gitlab.cyberus-technology.de/cyberus/cloud/libvirt/-/merge_requests/194/pipelines

@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch 4 times, most recently from 4e67461 to deca83b Compare May 15, 2026 08:41
@phip1611 phip1611 changed the title bump patchset to v52 Bump patchset: v51 -> v52 May 15, 2026
@phip1611 phip1611 marked this pull request as ready for review May 15, 2026 12:18
@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch from deca83b to 601b1e8 Compare May 15, 2026 16:39
@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch from 505b051 to 873f3e0 Compare May 15, 2026 17:34
We include a list of non-architectural MSRs. This list will only be
used to help the CPU profile generation tool rule out MSRs that it
does not know how to handle.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We include a list of MSRs defined by KVM that may be approved by
CPU profiles and another list of those that may not be approved by
CPU profiles. These lists will later be used by the CPU profile
generation tool.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
The list of HyperV MSRs introduced here will be utilized during CPU
profile generation and also at runtime to filter them out whenever
`kvm_hyperv` is set to `false`.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
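
A minimal sketch of the runtime filtering described above, assuming the MSR list is a plain slice of indices. `HYPERV_MSRS_EXCERPT` and `filter_hyperv_msrs` are hypothetical names used for illustration only, not the identifiers from this series. The three indices shown are the Hyper-V synthetic MSRs for guest OS ID, hypercall page, and VP index; the real list is longer.

```rust
/// Hypothetical excerpt of a Hyper-V MSR list (illustration only).
const HYPERV_MSRS_EXCERPT: &[u32] = &[
    0x4000_0000, // HV_X64_MSR_GUEST_OS_ID
    0x4000_0001, // HV_X64_MSR_HYPERCALL
    0x4000_0002, // HV_X64_MSR_VP_INDEX
];

/// Returns the MSR indices that remain visible to the guest.
/// When `kvm_hyperv` is `false`, every index found in the Hyper-V list is dropped;
/// when it is `true`, the input is returned unchanged.
fn filter_hyperv_msrs(msrs: &[u32], kvm_hyperv: bool) -> Vec<u32> {
    msrs.iter()
        .copied()
        .filter(|msr| kvm_hyperv || !HYPERV_MSRS_EXCERPT.contains(msr))
        .collect()
}
```

With `kvm_hyperv` set to `false`, an input of `[0x10, 0x4000_0000]` keeps only `0x10` (IA32_TIME_STAMP_COUNTER).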
We introduce functionality related to computing necessary MSR updates
in accordance with the given CPU profile.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We introduce functionality to filter out MSRs that we want to deny
guests access to.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We record the necessary MSR-based feature modifications that need to be
set in the `CpuManager` and make sure to set these MSR values upon
vCPU configuration. We also use the Vm to filter access to MSRs that
are incompatible with the chosen CPU profile.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We adapt the CPU profile generation tool to also take the MSR-based
features into account.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We regenerate the CPU profiles and include the MSR-related data.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Windows Server needs the machine check architecture (MCA) CPUID bit to
be set in order to boot.

Since Windows Server is a use case we want to support, we need to revert
our previous decision to disable MCA for non-host CPU profiles.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
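
For illustration, the MCA feature bit is bit 14 of EDX in CPUID leaf 0x1. The following sketch shows how such a bit could be tested and set on a raw EDX value; `has_mca` and `enable_mca` are hypothetical helpers and not part of this series.

```rust
/// Bit position of the MCA feature flag in CPUID leaf 0x1, register EDX.
const CPUID_1_EDX_MCA_BIT: u32 = 14;

/// Returns `true` if the MCA feature bit is set in the given EDX value.
fn has_mca(edx: u32) -> bool {
    edx & (1 << CPUID_1_EDX_MCA_BIT) != 0
}

/// Returns the EDX value with the MCA feature bit set and all other bits untouched.
fn enable_mca(edx: u32) -> u32 {
    edx | (1 << CPUID_1_EDX_MCA_BIT)
}
```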
We permit these MSRs because they are expected to be available when
the CPUID 0x1.EDX[14] (MCA) feature bit is set. Recall that MCA is
necessary in order to boot Windows Server, which we want to support.

We also do not list the error reporting banks as forbidden any longer.
Aside: The previous implementation did not end up denying those MSRs
anyway, because KVM does not report them via KVM_GET_MSR_INDEX_LIST.
Now with MCA explicitly set, the guest will certainly expect the
presence of error reporting banks, so we make sure not to indicate
otherwise.

Recall that by default KVM reports all (32) error banks as available
and leaves all feature bits of IA32_MCG_CAP unset, hence the
information displayed to the guest in these MSRs will remain consistent
before and after a live migration in the absence of machine check
errors.

Note that as of today Cloud Hypervisor does not transfer the error
reporting banks to the destination of a live migration, which can indeed
lead to surprises, but on the other hand the information is likely to
be inaccurate at the point of resume anyway.

As a follow-up, we could try to mitigate the aforementioned problem
by checking for MCEs during live migration and marking the migration
as failed if any MCE occurred before or during the live migration.
That should, however, be addressed in a separate PR.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Regenerate CPU profiles in order to enable machine check architecture
(MCA) for non-host CPU profiles, which is required to boot Windows
Server.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
These are already displayed as not available to guests via CPUID for
non-host CPU profiles, but we forgot to forbid the corresponding MSRs.

The profiles we have generated are OK with respect to this oversight
because KVM_GET_MSR_INDEX_LIST did not report those MSRs at the time
they were generated, but it does now.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Hardware duty cycling (HDC) does not make sense in the virtualization
setting and should thus not be displayed as available to guests.

We have already disabled certain HDC aspects via CPUID 0x6 ECX[13],
but we forgot to disable the state components which is what we do
in this commit.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We have already disabled architectural LBR (last branch record) for CPU
profiles, but we forgot to disable the corresponding state components.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Hardware P-states (HWP) is already disabled for non-host CPU profiles,
but we forgot to also disable the associated state components.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We already disabled Processor Trace (PT) for CPU profiles, but forgot
to disable the associated state components.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We have already forbidden IA32_PASID, an MSR related to process
address space identifiers (PASID), but we forgot to disable the
associated state components.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
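
The preceding commits (HDC, LBR, HWP, PT, PASID) all clear supervisor state components. A rough sketch of that idea follows, assuming the components are reported as a bitmap in CPUID leaf 0xd, sub-leaf 0x1, register ECX, with the bit positions documented for IA32_XSS in the Intel SDM. All names are hypothetical, and the place where the series actually applies the mask is not shown here.

```rust
// Bit positions of supervisor state components in IA32_XSS
// (per the Intel SDM; verify against the current revision before relying on them).
const XSS_PT: u32 = 1 << 8; // Processor Trace
const XSS_PASID: u32 = 1 << 10; // Process address space identifiers
const XSS_HDC: u32 = 1 << 13; // Hardware duty cycling
const XSS_LBR: u32 = 1 << 15; // Architectural last branch record
const XSS_HWP: u32 = 1 << 16; // Hardware P-states

/// Hypothetical mask of state components hidden from guests.
const DISABLED_STATE_COMPONENTS: u32 = XSS_PT | XSS_PASID | XSS_HDC | XSS_LBR | XSS_HWP;

/// Clears the disabled components from a supervisor state component bitmap
/// and leaves every other bit as reported by the host.
fn mask_supervisor_state_components(bitmap: u32) -> u32 {
    bitmap & !DISABLED_STATE_COMPONENTS
}
```

Components outside the mask, such as CET_U (bit 11), pass through unchanged.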
Bit 56 (VM_ENTRY_HARDWARE_EXCEPTIONS) of IA32_VMX_BASIC is only
set on rather recent KVM versions.

Thus, whenever a CPU profile is generated on a machine with a recent
Linux kernel, the current inherit policy will lead to the CPU profile
being incompatible with deployments on older Linux kernels. This may
not be the intention of the person generating the CPU profile, thus
we change the policy to `Static(0)` for the time being.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
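
To make the difference between the inherit policy and `Static(0)` concrete, here is a minimal sketch under the assumption that a policy is applied per bit of a 64-bit MSR value. The enum `BitPolicy` and the function `apply_bit_policy` are hypothetical and simplified; the actual policy types in the series may look different.

```rust
/// Hypothetical per-bit policy (illustration only).
#[derive(Clone, Copy)]
enum BitPolicy {
    /// Take whatever the machine that generates the profile reports.
    Inherit,
    /// Force the bit to a fixed value regardless of the generating machine.
    Static(bool),
}

/// Applies `policy` to bit `bit` of `host_value` and returns the resulting MSR value.
fn apply_bit_policy(host_value: u64, bit: u32, policy: BitPolicy) -> u64 {
    match policy {
        BitPolicy::Inherit => host_value,
        BitPolicy::Static(true) => host_value | (1u64 << bit),
        BitPolicy::Static(false) => host_value & !(1u64 << bit),
    }
}
```

With `Static(false)` on bit 56, a profile generated on a recent kernel no longer records the bit as set, so it stays compatible with hosts whose KVM does not report it.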
IA32_XSS (Extended Supervisor State Mask) is only reported via
KVM_GET_MSR_INDEX_LIST on rather recent kernels. As a result, CPU
profiles generated on a machine with the latest Linux kernel may not
work on deployments whose hosts run somewhat older kernels, which
may be unintentional.

We thus decide to forbid this MSR for now, even though
CPUID 0xd.0x1.EAX[3] can inform the guest that the MSR is available.
We do not want to force the aforementioned feature bit to 0 because
it is also used to report support for XSAVES/XRSTORS.

Although not ideal, we consider denying access to IA32_XSS to be
acceptable because the 0xd CPUID leaves report all IA32_XSS related
state components to be unsupported. There is thus no reason for the
guest to be interested in using this MSR.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We have disabled LBR for non-host CPU profiles, but forgot to also do
so in the VM-Exit and VM-Entry control MSRs.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We add developer documentation on how to use the CPU profile generation
tool.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We will later use flate2 in arch/build.rs to compress CPU profile
JSON files at compile time and also later to decompress them at
runtime.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We introduce a build.rs build script in the arch crate which
automatically constructs the x86_64 CpuProfile enum with one variant
per pre-generated CPU profile.

In order to keep the binary size in check we also take the opportunity
to compress the CPU profile JSON files into the binary which then get
decompressed at runtime.

We will adapt cpu_profile.rs in the next commit to use the output
of build.rs.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
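
The enum generation step described above could look roughly like the following sketch. It assumes that each pre-generated profile is a JSON file whose stem (for example `sapphire-rapids`) is turned into a PascalCase variant. `variant_name`, `generate_enum`, and the example profile names are hypothetical, and the compression step is left out to keep the sketch free of external crates.

```rust
/// Converts a file stem such as "sapphire-rapids" into a PascalCase variant name.
fn variant_name(stem: &str) -> String {
    stem.split(|c: char| c == '-' || c == '_')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                // Uppercase the first character and keep the rest as-is.
                Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
                None => String::new(),
            }
        })
        .collect()
}

/// Renders the Rust source of an enum with one variant per profile file stem.
/// A build script would write this string to a file under `OUT_DIR`.
fn generate_enum(stems: &[&str]) -> String {
    let mut out = String::from("pub enum CpuProfile {\n");
    for stem in stems {
        out.push_str("    ");
        out.push_str(&variant_name(stem));
        out.push_str(",\n");
    }
    out.push_str("}\n");
    out
}
```

In a real build script, the generated string would typically be written to `OUT_DIR` and pulled into the crate with `include!`.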
Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
When we introduced our build script, we forgot to tell `serde` to
(de)serialize the `CpuProfile` enum in kebab-case, which was a breaking
change.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
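
In real code the fix is a matter of putting serde's `#[serde(rename_all = "kebab-case")]` container attribute on the enum. The following dependency-free sketch only illustrates which external names that convention produces for PascalCase variants; `to_kebab_case` and the example variant names are hypothetical.

```rust
/// Converts a PascalCase variant name to kebab-case:
/// every uppercase letter except the first starts a new hyphen-separated word.
fn to_kebab_case(variant: &str) -> String {
    let mut out = String::new();
    for (i, ch) in variant.char_indices() {
        if ch.is_uppercase() && i > 0 {
            out.push('-');
        }
        out.extend(ch.to_lowercase());
    }
    out
}
```

Without the attribute, a variant would (de)serialize under its Rust name, such as `SapphireRapids`, so configurations written with `sapphire-rapids` would stop parsing.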
@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch from df8700f to 8a11273 Compare May 20, 2026 08:43
@phip1611 phip1611 merged commit 9902cf3 into cyberus-technology:gardenlinux-next-v52-base May 20, 2026
17 checks passed
@phip1611 phip1611 deleted the gardenlinux-next-v52 branch May 20, 2026 08:45
@amphi amphi mentioned this pull request May 21, 2026
9 participants