Bump patchset: v51 -> v52 #158
Closed
phip1611 wants to merge 178 commits into
This reverts commit ced3762. This change led to a serious memory regression when not using hugepages or shared=on. `MAP_PRIVATE` creates an anonymous memory allocation for every page written when the backing store is a file. This CoW behaviour is useful but leads to double allocations when the backing store is an empty file created by `memfd_create()`. When the page is written to, the CoW semantics require a real page to be created in the memory for the memfd (previously, before the page was touched, they would all point to the zero page). This real page is filled with zeroes because in theory this page would be accessible via read/write syscalls on the FD even though in our implementation it is only ever `mmap()`ed. The intention of the commit was to enable `fallocate()` to be used to punch holes, but that would only affect the inaccessible backing page; the page in the CoW anonymous memory would be unaffected, so the change would likely not have the desired effect. Fixes: cloud-hypervisor#8211 Signed-off-by: Rob Bradford <rbradford@meta.com>
Reordering commands or adding commands in-between is breaking the migration protocol. By using explicit numbers, we can increase the attention required when touching this code. On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
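The idea of pinning wire values can be sketched as follows. This is a minimal illustration, not the fork's actual code: the variant names and numbers are hypothetical, and only the pattern (explicit discriminants plus a checked decode) is the point.

```rust
use std::convert::TryFrom;

/// Hypothetical command set with explicit wire values (illustrative only).
#[repr(u16)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Command {
    Start = 1,
    Config = 2,
    State = 3,
    Memory = 4,
    Complete = 5,
    Abandon = 6,
}

impl TryFrom<u16> for Command {
    /// The unknown raw value is handed back so the caller can report it.
    type Error = u16;

    fn try_from(raw: u16) -> Result<Self, Self::Error> {
        match raw {
            1 => Ok(Command::Start),
            2 => Ok(Command::Config),
            3 => Ok(Command::State),
            4 => Ok(Command::Memory),
            5 => Ok(Command::Complete),
            6 => Ok(Command::Abandon),
            other => Err(other),
        }
    }
}

fn main() {
    // The wire value no longer depends on the declaration order.
    let wire = Command::Memory as u16;
    println!("Memory is sent as {wire}");
    println!("decoded: {:?}", Command::try_from(wire));
    println!("unknown: {:?}", Command::try_from(99u16));
}
```

With explicit discriminants, inserting a new variant in the middle of the enum cannot silently shift the values of the existing ones.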
This increases debuggability. On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
Storing the snapshot causes issues when a subsequent hotplug is needed. Instead, just pass it through to all the methods that need it, making the lifecycle cleaner. Assisted-by: Claude:Opus-4.6 Signed-off-by: Rob Bradford <rbradford@meta.com>
ReadVolatile already provides a default read_volatile_exact() implementation, and WriteVolatile a default write_volatile_exact() implementation. Overriding these functions adds no behavioral value, but duplicates logic and needs to be updated whenever SocketStream gains or changes a variant. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
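Why the overrides are redundant can be illustrated with a simplified trait. The sketch below uses a hypothetical `ReadSome` trait and `Trickle` type as stand-ins; it is not the vm-memory API, it only shows the general pattern of a provided "exact" method built on top of a required partial read.

```rust
/// Hypothetical stand-in for a trait such as `ReadVolatile`.
trait ReadSome {
    /// Required: read up to `buf.len()` bytes and return how many were read.
    fn read_some(&mut self, buf: &mut [u8]) -> Result<usize, String>;

    /// Provided: keep calling `read_some` until `buf` is completely filled.
    fn read_all(&mut self, buf: &mut [u8]) -> Result<(), String> {
        let mut filled = 0;
        while filled < buf.len() {
            match self.read_some(&mut buf[filled..])? {
                0 => return Err("unexpected end of stream".to_string()),
                n => filled += n,
            }
        }
        Ok(())
    }
}

/// Hypothetical source that hands out at most two bytes per call.
struct Trickle {
    data: Vec<u8>,
    pos: usize,
}

impl ReadSome for Trickle {
    // Only the required method is implemented; `read_all` is inherited.
    fn read_some(&mut self, buf: &mut [u8]) -> Result<usize, String> {
        let remaining = &self.data[self.pos..];
        let n = remaining.len().min(buf.len()).min(2);
        buf[..n].copy_from_slice(&remaining[..n]);
        self.pos += n;
        Ok(n)
    }
}

fn main() {
    let mut src = Trickle { data: vec![1, 2, 3, 4, 5], pos: 0 };
    let mut buf = [0u8; 4];
    let result = src.read_all(&mut buf);
    println!("{result:?} {buf:?}");
}
```

An implementor that adds a new variant only has to touch the required method; the provided one keeps working unchanged.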
The trait is not used and thus can be removed. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
TLS connections have a TLS server (listens for incoming connections) and a TLS client (initiates the connection). This commit adds the code for the client side, which is the sender of a migration. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
Code for the TLS server, i.e. the receiver of a live migration. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
Teach the migration transport to handle TLS-backed streams alongside plain TCP and UNIX sockets. Introduce a Tls variant in SocketStream and implement the necessary traits. Also updates the local-migration error path to reject any non-UNIX transport, which now includes TLS-wrapped TCP connections. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
Extend ReceiveListener with a TLS-backed listener variant for migration receivers. Store the TCP listener together with the server TLS configuration, wrap accepted sockets in TlsStream::new_server(), and preserve the existing listener cloning and fd polling behavior so receive-side migration code can treat TLS listeners like the existing TCP and UNIX cases. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
For TLS we have to parse the hostname from the given migration URL. For that we have to make a few assumptions about the URL (e.g. it always has a port). To catch problems early, we tighten the URL validation. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
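A stricter validation of this kind might look like the sketch below. It assumes the migration URL has the shape `tcp:<host>:<port>`; that shape, the function name `parse_tcp_url`, and the error strings are assumptions made for illustration and are not taken from the commit.

```rust
/// Split an assumed `tcp:<host>:<port>` URL into host and port.
/// Rejects URLs without the `tcp:` prefix, without a port, or with an empty host.
fn parse_tcp_url(url: &str) -> Result<(String, u16), String> {
    let rest = url
        .strip_prefix("tcp:")
        .ok_or_else(|| format!("not a tcp URL: {url}"))?;
    // Split at the last ':' so that bracketed IPv6 hosts keep their colons.
    let (host, port) = rest
        .rsplit_once(':')
        .ok_or_else(|| format!("missing port in: {url}"))?;
    if host.is_empty() {
        return Err(format!("missing host in: {url}"));
    }
    let port: u16 = port
        .parse()
        .map_err(|_| format!("invalid port in: {url}"))?;
    Ok((host.to_string(), port))
}

fn main() {
    println!("{:?}", parse_tcp_url("tcp:host.example:4444"));
    println!("{:?}", parse_tcp_url("tcp:host.example"));
}
```

Failing early on a missing port means the hostname used for TLS is never derived from a malformed URL.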
To enable TLS, the caller has to provide a path to a directory that contains the necessary files. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
As we now have more than one parameter for the receive migration call, this commit also adds parsing and validation for those parameters. We maintain backwards compatibility by also correctly parsing the case where the caller only provides a URL. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
Wire in the code paths that activate TLS encryption if the necessary API arguments are provided. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com
On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
To check gitlint locally, one can run: gitlint --commits "HEAD~2..HEAD" which for example checks the last two commits. Although this is just our kinda private (but public) fork, people might cherry-pick commits from us for whatever reason. So we should have proper commit style. On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
Remove irrelevant/annoying CI here to accelerate development. Further, we don't have the runners to run the integration tests, but at least we want to run the unit tests, clippy, etc. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com
Adds a flake configuration that enables building Cloud Hypervisor directly from this repository using Nix. This makes it possible to deploy and test Cloud Hypervisor on NixOS systems in real environments. On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
Sets the `CH_EXTRA_VERSION` env var during compilation to add the git revision to the version output. On-behalf-of: SAP julian.schindel@sap.com Signed-off-by: Julian Schindel <julian.schindel@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
TL;DR: Fix for long rebuilds locally when testing things.

The release profile is optimized for maximum performance, sacrificing build speed. As local development and testing require frequent rebuilds, but the dev profile is way too slow for "real testing", this profile is a sweet spot and helps to investigate things. Instead of `cargo run --release`, one can now run `cargo run --profile optimized-dev`.

# Measurements

Measurements were done using `$ [cargo clean;] time cargo build --profile release|optimized-dev` and rustc 1.89. I've used the `time` builtin from zsh. Note that user time is much higher as we have more threads (codegen units) now. The total time is much shorter, though.

## Clean Build

Speedup of 56%.

- `$ time cargo build --release`: `109,67s user 13,64s system 211% cpu 58,343 total`
- `$ time cargo build --profile optimized-dev`: `185,41s user 14,92s system 528% cpu 37,876 total`

## Incremental Build

Speedup of 153%.

- `$ time cargo build --release`: `37,58s user 1,53s system 117% cpu 33,356 total`
- `$ time cargo build --profile optimized-dev`: `47,62s user 1,71s system 373% cpu 13,220 total`

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com
With debug symbols, we will get better backtraces and can improve our experience debugging. The only downside is larger binary size which is negligible in our case. There are no implications for the performance. Stripped: 3.9M Unstripped: 4.7M Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com
This improves the quality of the logs when debugging issues. I've used the `jiff` time library as it is a well-known time library of the ecosystem. Now, the first logging message (level info!) looks somewhat like this:

```text
Cloud Hypervisor starting: build version: v51.1-203-g7f0f1f5cb-dirty, date: 2026-03-30T14:42:30.00730185+02:00
```

On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
We've seen big VMs under massive load that regularly fired their
`"vCPU thread did not respond in {count}ms to signal - retrying"` warning
message. So far, all such situations recovered on their own after ~600ms.
To be more fail-safe for the production environment under load, we
increase this timeout to 10s.
On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
This is very helpful information in the field where libvirt just sets the info level. This retains the behavior that we already have at our customer. Further, our tests rely on that line to check that the migration progresses. On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
This is a temporary measure, as upstream decided on a different name than we did in our fork. On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
This includes the timezone again. On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
This allows attaching FDs provided by the management layer to virtio-net devices on the live-migration receiver side. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com
This list will be used to help us detect unknown MSRs when generating CPU profiles. It serves no other purpose beyond that. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
TODO: Squash into previous commit if this all works as expected Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We include a list of non-architectural MSRs. This list will only be used to help the CPU profile generation tool rule out MSRs that it does not know how to handle. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We include a list of MSRs defined by KVM that may be approved by CPU profiles and another list of those that may not be approved by CPU profiles. These lists will later be used by the CPU profile generation tool. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
The list of HyperV MSRs introduced here will be utilized during CPU profile generation and also at runtime to filter them out whenever `kvm_hyperv` is set to `false`. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We introduce functionality related to computing necessary MSR updates in accordance with the given CPU profile. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We introduce functionality to filter out MSRs which we want to deny guests from using. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
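Conceptually, such a filter removes denied indices from the set of MSRs the host reports. The sketch below is a simplified illustration, assuming a plain deny set; the function name `permitted_msrs` and the choice of denied MSRs are hypothetical and do not reflect the actual implementation.

```rust
use std::collections::BTreeSet;

// MSR indices used for the illustration.
const IA32_TIME_STAMP_COUNTER: u32 = 0x10;
const IA32_PASID: u32 = 0xD93;
const IA32_XSS: u32 = 0xDA0;
const IA32_EFER: u32 = 0xC000_0080;

/// Keep only the host-reported MSRs that are not on the deny list.
/// The order of the host list is preserved.
fn permitted_msrs(host_msrs: &[u32], denied: &BTreeSet<u32>) -> Vec<u32> {
    host_msrs
        .iter()
        .copied()
        .filter(|msr| !denied.contains(msr))
        .collect()
}

fn main() {
    let host = [IA32_TIME_STAMP_COUNTER, IA32_PASID, IA32_XSS, IA32_EFER];
    // Hypothetical deny list for a non-host CPU profile.
    let denied: BTreeSet<u32> = [IA32_PASID, IA32_XSS].into_iter().collect();
    for msr in permitted_msrs(&host, &denied) {
        println!("{msr:#x}");
    }
}
```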
We record the necessary MSR-based feature modifications that need to be set in the `CpuManager` and make sure to set these MSR values upon vCPU configuration. We also use the Vm to filter access to MSRs that are incompatible with the chosen CPU profile. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We adapt the CPU profile generation tool to also take the MSR-based features into account. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We regenerate the CPU profiles and include the MSR-related data. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
Windows Server needs the machine check architecture (MCA) CPUID bit to be set in order to boot. Since Windows Server is a use-case we want to support, we need to revert our previous decision to disable MCA for non-host CPU profiles. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We permit these MSRs because they are expected to be available when the CPUID 0x1.EDX[14] (MCA) feature bit is set. Recall that MCA is necessary in order to boot Windows Server, which we want to support. We also no longer list the error reporting banks as forbidden. Aside: The previous implementation did not end up denying those MSRs anyway, because KVM does not report them via KVM_GET_MSR_INDEX_LIST. Now with MCA explicitly set, the guest will certainly expect the presence of error reporting banks, so we make sure not to indicate otherwise. Recall that by default KVM reports all (32) error banks as available and leaves all feature bits of IA32_MCG_CAP unset, hence the information displayed to the guest in these MSRs will remain consistent before and after a live migration in the absence of machine check errors. Note that as of today Cloud Hypervisor does not transfer the error reporting banks to the destination of a live migration, which can indeed lead to surprises, but on the other hand the information is likely to be inaccurate at the point of resume anyway. As a follow-up we could try to mitigate the aforementioned problem by checking for MCEs during live migration and marking the migration as failed if any MCE occurred before or during the live migration. That should however be addressed in a separate PR. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
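The two values discussed above can be decoded with a few bit operations. This is a small sketch with hypothetical helper names; it relies only on the architectural layout (MCA is bit 14 of CPUID leaf 0x1 EDX, and the bank count sits in bits 7:0 of IA32_MCG_CAP).

```rust
/// CPUID 0x1.EDX[14]: machine check architecture (MCA).
const CPUID_1_EDX_MCA: u32 = 1 << 14;
/// IA32_MCG_CAP[7:0]: number of error reporting banks.
const MCG_CAP_COUNT_MASK: u64 = 0xFF;

/// Whether the given CPUID 0x1 EDX value advertises MCA.
fn has_mca(cpuid_1_edx: u32) -> bool {
    cpuid_1_edx & CPUID_1_EDX_MCA != 0
}

/// Number of error reporting banks encoded in IA32_MCG_CAP.
fn mcg_bank_count(mcg_cap: u64) -> u8 {
    (mcg_cap & MCG_CAP_COUNT_MASK) as u8
}

/// Everything in IA32_MCG_CAP apart from the bank count.
fn mcg_feature_bits(mcg_cap: u64) -> u64 {
    mcg_cap & !MCG_CAP_COUNT_MASK
}

fn main() {
    // 32 banks and no feature bits, matching the KVM default described above.
    let mcg_cap: u64 = 32;
    println!("banks: {}", mcg_bank_count(mcg_cap));
    println!("feature bits: {:#x}", mcg_feature_bits(mcg_cap));
    println!("MCA advertised: {}", has_mca(CPUID_1_EDX_MCA));
}
```

As long as both hosts report the same bank count and no feature bits, the guest reads the same IA32_MCG_CAP value before and after a migration.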
Regenerate CPU profiles in order to enable machine check architecture (MCA) for non-host CPU profiles, which is required to boot Windows Server. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
These are already displayed as not available to guests via CPUID for non-host CPU profiles, but we forgot to forbid the corresponding MSRs. The profiles we have generated are OK with respect to this oversight because KVM_GET_MSR_INDEX_LIST did not report those MSRs at the time they were generated, but it does now. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
Hardware duty cycling (HDC) does not make sense in the virtualization setting and should thus not be displayed as available to guests. We have already disabled certain HDC aspects via CPUID 0x6 ECX[13], but we forgot to disable the state components which is what we do in this commit. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We have already disabled architectural LBR (last branch record) for CPU profiles, but we forgot to disable the corresponding state components. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
Hardware P-states (HWP) is already disabled for non-host CPU profiles, but we forgot to also disable the associated state components. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We already disabled Processor Trace (PT) for CPU profiles, but forgot to disable the associated state components. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We have already forbidden IA32_PASID, an MSR related to process address space identifiers (PASID), but we forgot to disable the associated state components. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
Bit 56 of VM_ENTRY_HARDWARE_EXCEPTIONS in IA32_VMX_BASIC is only set on rather recent KVM versions. Thus whenever a CPU profile is generated on a machine with a recent Linux kernel, the current inherit policy will lead to the CPU profile being incompatible with deployments that use older Linux kernels. This may not be the intention of the person generating the CPU profile, thus we change the policy to `Static(0)` for the time being. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
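The difference between the two policies can be sketched as below. The `Policy` enum, its `Inherit` variant name, and the `resolve` function are a simplified, hypothetical model of the profile generation logic, written only to show why `Static(0)` makes the result independent of the generating host.

```rust
/// Hypothetical model of a per-field profile policy.
#[derive(Debug, Clone, Copy)]
enum Policy {
    /// Take the field from the host the profile is generated on.
    Inherit,
    /// Always use the given value, whatever the host reports.
    Static(u64),
}

/// Bit 56 of IA32_VMX_BASIC.
const VMX_BASIC_BIT_56: u64 = 1 << 56;

/// Compute the profile value of the field selected by `mask`.
fn resolve(policy: Policy, host_value: u64, mask: u64) -> u64 {
    match policy {
        Policy::Inherit => host_value & mask,
        Policy::Static(value) => value & mask,
    }
}

fn main() {
    // Made-up MSR values: the recent kernel sets bit 56, the older one does not.
    let recent_kernel = VMX_BASIC_BIT_56 | 0x1234;
    let older_kernel = 0x1234;

    for host in [recent_kernel, older_kernel] {
        let inherited = resolve(Policy::Inherit, host, VMX_BASIC_BIT_56);
        let fixed = resolve(Policy::Static(0), host, VMX_BASIC_BIT_56);
        println!("inherit: {inherited:#x}, static: {fixed:#x}");
    }
}
```

Under `Inherit`, the profile differs depending on the generating kernel; under `Static(0)`, both hosts yield the same profile value.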
IA32_XSS (Extended Supervisor State Mask) is only reported via KVM_GET_MSR_INDEX_LIST on rather recent kernels. This can lead to CPU profiles that are generated on a machine with the latest Linux kernel not working with deployments where the hosts use somewhat older kernels, which may be unintentional. We thus decide to forbid this MSR for now, even though CPUID 0xd.0x1.EAX[3] can inform the guest that the MSR is available. We do not want to force the aforementioned feature bit to 0 because it is also used to report support for XSAVES/XRSTORS. Although not ideal, we consider denying access to IA32_XSS to be acceptable because the 0xd CPUID leaves report all IA32_XSS related state components to be unsupported. There is thus no reason for the guest to be interested in using this MSR. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We have disabled LBR for non-host CPU profiles, but forgot to also do so in the VM-Exit and VM-Entry control MSRs. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We add developer documentation on how to use the CPU profile generation tool. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We will later use flate2 in arch/build.rs to compress CPU profile JSON files at compile time and also later to decompress them at runtime. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
We introduce a build.rs build script in the arch crate which automatically constructs the x86_64 CpuProfile enum with one variant per pre-generated CPU profile. In order to keep the binary size in check we also take the opportunity to compress the CPU profile JSON files into the binary, which then get decompressed at runtime. We will adapt cpu_profile.rs in the next commit to use the output of build.rs. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
When we introduced our build script we forgot to tell `serde` to (de-) serialize the `CpuProfile` enum in kebab-case which is a breaking change. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com
This series bumps the gardenlinux Cloud Hypervisor patchset onto the recently released v52.
You can find an overview of the difficulties during the rebase in this outline document (trivial patches, hard to rebase patches, patches that are now upstream...).
From ~250 commits we have in the current `gardenlinux` branch, we are now down to ~170.

Changes & Hints for Reviewers

- `init A -> ... -> fix A` commits were squashed
- `pci_device_id` from upstream back to `bdf_device` to be compatible with us
- `Command::CompletePaused` from upstream (NEW) clashes with `Command::KeepAlive` (our fork)
- `flake: init` commit `0ab89ac` (#8254)

The result is a shorter and more reviewable branch than `cyberus-github/gardenlinux` while preserving the relevant Gardenlinux behavior on top of the current Cloud Hypervisor base.
Ticket: https://github.com/cobaltcore-dev/cobaltcore/issues/503#issuecomment-4311454443