Skip to content

openvmm/pcie: Unify type 0 and type 1 save restore state into SavedState to make it backward compatible#2603

Merged
shenw0000 merged 1 commit intomicrosoft:mainfrom
shenw0000:fix_uh_save_restore
Jan 8, 2026
Merged

openvmm/pcie: Unify type 0 and type 1 save restore state into SavedState to make it backward compatible#2603
shenw0000 merged 1 commit intomicrosoft:mainfrom
shenw0000:fix_uh_save_restore

Conversation

@shenw0000
Copy link
Copy Markdown
Contributor

@shenw0000 shenw0000 commented Dec 22, 2025

Previous change in the PCIe type0/type1 config space emulator broken backward compatibility of the save and restore. This change updates the save and restore implementation to be backward compatible.

Tested to verify that the change fixed the problem:

1.6->1.7 servicing repros the problem:

  81.885092300s ERROR paravisor_log:  inner_target="vmbus_relay_intercept_device" "Failed to teardown gpadl -- leaking memory." fields={"error": ["channel closed"]} extra={"timestamp": "4.007314600s"}
  81.885501800s ERROR guest_emulation_device:  guest reported vtl0 failed to start error=guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}
  81.887201900s  INFO paravisor_log:  inner_target="mana_driver::gdma_driver" "dropping gdma driver" fields={"self.hwc_failure": "false", "self.state_saved": "false"} extra={"timestamp": "4.022594900s"}
  81.981364100s ERROR openvmm_entry:  vtl2 servicing failed error=vtl0 start failed: guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}

1.6 -> fixed version resolves the problem:

 828.988495000s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: syncing command register - intx_disable=false, mmio_enabled=false" fields={} extra={"timestamp": "6.316890400s"}
 829.007081300s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating intx_disable=false" fields={} extra={"timestamp": "6.336884200s"}
 829.018959200s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating mmio_enabled=false" fields={} extra={"timestamp": "6.348810600s"}
 829.035025800s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"activity_id": "5a89a430-0001-0029-0e00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "time_active_ns": 44800, "timestamp": "6.364453700s", "time_taken_ns": 550516500, "op_code": 2}
 829.036223000s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 50700, "time_taken_ns": 551889200, "name": "device_state_change", "activity_id": "5a8b3818-0001-0029-0f00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "op_code": 2, "timestamp": "6.365929800s"}
 829.036739300s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 28200, "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "timestamp": "6.366807000s", "time_taken_ns": 552654100, "op_code": 2, "activity_id": "5a8ceec4-0001-0029-1000-000040000000"}
 829.037609200s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"timestamp": "6.367466400s", "time_active_ns": 31500, "op_code": 2, "activity_id": "5abe34d4-0001-0029-0b00-000040000800", "name": "device_state_change", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "time_taken_ns": 550084300}
 829.038569900s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "activity_id": "5ac36300-0001-0029-0a00-000040000800", "time_taken_ns": 550401100, "timestamp": "6.368122700s", "name": "device_state_change", "time_active_ns": 31900, "op_code": 2}
 [...]
 829.199631600s  INFO guest_emulation_device:  restore vtl2 complete success=true
 829.200333400s  INFO guest_emulation_device:  guest reported vtl0 started successfully
 829.308077400s  INFO openvmm_entry:  vtl2 servicing complete duration=8937

@shenw0000 shenw0000 requested a review from a team as a code owner December 22, 2025 04:17
Copilot AI review requested due to automatic review settings December 22, 2025 04:17
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR unifies the save/restore state structures for PCIe Type 0 and Type 1 configuration space emulators to restore backward compatibility that was broken in a previous change. The unified SavedState struct now contains both common fields (indices 1-5) and Type 1-specific fields (indices 6-15), allowing old save states (with only common fields) to be restored by defaulting Type 1 fields to 0.

Key changes:

  • Unified SavedState struct replaces separate SavedType0State and SavedType1State structures
  • Type 0 emulators save Type 1 fields as zeros and ignore them during restore
  • Type 1 emulators now directly serialize all fields in the unified format, with proper BAR padding

Comment on lines +1530 to +1536
// Pad base_addresses to 6 elements for common header (Type 1 uses 2 BARs)
let mut full_base_addresses = [0u32; 6];
for (i, &addr) in base_addresses.iter().enumerate().take(2) {
full_base_addresses[i] = addr;
}
self.common
.set_base_addresses(&[full_base_addresses[0], full_base_addresses[1]]);
Copy link

Copilot AI Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for extracting only the first 2 BARs from base_addresses is unnecessarily complex. The take(2) already limits the iteration to the first 2 elements, making the manual array initialization and indexing redundant. This could be simplified to directly copy the first 2 elements.

Suggested change
// Pad base_addresses to 6 elements for common header (Type 1 uses 2 BARs)
let mut full_base_addresses = [0u32; 6];
for (i, &addr) in base_addresses.iter().enumerate().take(2) {
full_base_addresses[i] = addr;
}
self.common
.set_base_addresses(&[full_base_addresses[0], full_base_addresses[1]]);
// Use only the first two BARs (Type 1 uses 2 BARs), zero-filling if missing.
let mut bar_addrs = [0u32; 2];
for (dst, src) in bar_addrs.iter_mut().zip(base_addresses.iter()) {
*dst = *src;
}
self.common.set_base_addresses(&bar_addrs);

Copilot uses AI. Check for mistakes.
Comment on lines +2098 to +2116
emu.write_u32(0x3C, 0x0040_0000).unwrap(); // Set latency_timer

// Save the state
let saved_state = common_emu.save().expect("save should succeed");
let saved_state = emu.save().expect("save should succeed");

// Reset the emulator
common_emu.reset();
let result = common_emu.read_u32(0x04, &mut test_val);
assert_eq!(result, CommonHeaderResult::Handled);
emu.reset();

// Verify state is reset
emu.read_u32(0x04, &mut test_val).unwrap();
assert_eq!(test_val & 0x0007, 0x0000); // Should be reset

// Restore the state
common_emu
.restore(saved_state)
.expect("restore should succeed");
let result = common_emu.read_u32(0x04, &mut test_val);
assert_eq!(result, CommonHeaderResult::Handled);
assert_eq!(test_val & 0x0007, 0x0007); // Should be restored
emu.restore(saved_state).expect("restore should succeed");

// Test Type 1 common header emulator save/restore
let hardware_ids = HardwareIds {
vendor_id: 0x3333,
device_id: 0x4444,
revision_id: 1,
prog_if: ProgrammingInterface::NONE,
sub_class: Subclass::BRIDGE_PCI_TO_PCI,
base_class: ClassCode::BRIDGE,
type0_sub_vendor_id: 0,
type0_sub_system_id: 0,
};
// Verify state is restored
emu.read_u32(0x04, &mut test_val).unwrap();
assert_eq!(test_val & 0x0007, 0x0007); // Should be restored
}
Copy link

Copilot AI Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test writes to the latency_timer field via register 0x3C at line 2098 but never verifies that this value is properly restored after the restore operation. Consider adding a verification step after line 2115 to read back register 0x3C and confirm the latency_timer value was correctly restored.

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown

@smmalis37
Copy link
Copy Markdown
Contributor

@tjones60 Can we combine this PR with your work so that we can enable a gen 1 servicing test as soon as this fixes it?

Copy link
Copy Markdown
Collaborator

@mebersol mebersol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@shenw0000 shenw0000 merged commit 3bb918d into microsoft:main Jan 8, 2026
109 of 111 checks passed
shenw0000 added a commit to shenw0000/openvmm that referenced this pull request Jan 8, 2026
…ate to make it backward compatible (microsoft#2603)

Previous change in the PCIe type0/type1 config space emulator broken
backward compatibility of the save and restore. This change updates the
save and restore implementation to be backward compatible.

Tested to verify that the change fixed the problem:
## 1.6->1.7 servicing repros the problem:
```
  81.885092300s ERROR paravisor_log:  inner_target="vmbus_relay_intercept_device" "Failed to teardown gpadl -- leaking memory." fields={"error": ["channel closed"]} extra={"timestamp": "4.007314600s"}
  81.885501800s ERROR guest_emulation_device:  guest reported vtl0 failed to start error=guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}
  81.887201900s  INFO paravisor_log:  inner_target="mana_driver::gdma_driver" "dropping gdma driver" fields={"self.hwc_failure": "false", "self.state_saved": "false"} extra={"timestamp": "4.022594900s"}
  81.981364100s ERROR openvmm_entry:  vtl2 servicing failed error=vtl0 start failed: guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}
```

## 1.6 -> fixed version resolves the problem:
```
 828.988495000s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: syncing command register - intx_disable=false, mmio_enabled=false" fields={} extra={"timestamp": "6.316890400s"}
 829.007081300s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating intx_disable=false" fields={} extra={"timestamp": "6.336884200s"}
 829.018959200s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating mmio_enabled=false" fields={} extra={"timestamp": "6.348810600s"}
 829.035025800s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"activity_id": "5a89a430-0001-0029-0e00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "time_active_ns": 44800, "timestamp": "6.364453700s", "time_taken_ns": 550516500, "op_code": 2}
 829.036223000s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 50700, "time_taken_ns": 551889200, "name": "device_state_change", "activity_id": "5a8b3818-0001-0029-0f00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "op_code": 2, "timestamp": "6.365929800s"}
 829.036739300s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 28200, "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "timestamp": "6.366807000s", "time_taken_ns": 552654100, "op_code": 2, "activity_id": "5a8ceec4-0001-0029-1000-000040000000"}
 829.037609200s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"timestamp": "6.367466400s", "time_active_ns": 31500, "op_code": 2, "activity_id": "5abe34d4-0001-0029-0b00-000040000800", "name": "device_state_change", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "time_taken_ns": 550084300}
 829.038569900s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "activity_id": "5ac36300-0001-0029-0a00-000040000800", "time_taken_ns": 550401100, "timestamp": "6.368122700s", "name": "device_state_change", "time_active_ns": 31900, "op_code": 2}
 [...]
 829.199631600s  INFO guest_emulation_device:  restore vtl2 complete success=true
 829.200333400s  INFO guest_emulation_device:  guest reported vtl0 started successfully
 829.308077400s  INFO openvmm_entry:  vtl2 servicing complete duration=8937
```
shenw0000 added a commit that referenced this pull request Jan 8, 2026
…ate to make it backward compatible (#2603) (#2623)

Previous change in the PCIe type0/type1 config space emulator broken
backward compatibility of the save and restore. This change updates the
save and restore implementation to be backward compatible.

Tested to verify that the change fixed the problem: ## 1.6->1.7
servicing repros the problem:
```
  81.885092300s ERROR paravisor_log:  inner_target="vmbus_relay_intercept_device" "Failed to teardown gpadl -- leaking memory." fields={"error": ["channel closed"]} extra={"timestamp": "4.007314600s"}
  81.885501800s ERROR guest_emulation_device:  guest reported vtl0 failed to start error=guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}
  81.887201900s  INFO paravisor_log:  inner_target="mana_driver::gdma_driver" "dropping gdma driver" fields={"self.hwc_failure": "false", "self.state_saved": "false"} extra={"timestamp": "4.022594900s"}
  81.981364100s ERROR openvmm_entry:  vtl2 servicing failed error=vtl0 start failed: guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}
```

## 1.6 -> fixed version resolves the problem:
```
 828.988495000s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: syncing command register - intx_disable=false, mmio_enabled=false" fields={} extra={"timestamp": "6.316890400s"}
 829.007081300s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating intx_disable=false" fields={} extra={"timestamp": "6.336884200s"}
 829.018959200s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating mmio_enabled=false" fields={} extra={"timestamp": "6.348810600s"}
 829.035025800s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"activity_id": "5a89a430-0001-0029-0e00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "time_active_ns": 44800, "timestamp": "6.364453700s", "time_taken_ns": 550516500, "op_code": 2}
 829.036223000s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 50700, "time_taken_ns": 551889200, "name": "device_state_change", "activity_id": "5a8b3818-0001-0029-0f00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "op_code": 2, "timestamp": "6.365929800s"}
 829.036739300s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 28200, "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "timestamp": "6.366807000s", "time_taken_ns": 552654100, "op_code": 2, "activity_id": "5a8ceec4-0001-0029-1000-000040000000"}
 829.037609200s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"timestamp": "6.367466400s", "time_active_ns": 31500, "op_code": 2, "activity_id": "5abe34d4-0001-0029-0b00-000040000800", "name": "device_state_change", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "time_taken_ns": 550084300}
 829.038569900s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "activity_id": "5ac36300-0001-0029-0a00-000040000800", "time_taken_ns": 550401100, "timestamp": "6.368122700s", "name": "device_state_change", "time_active_ns": 31900, "op_code": 2}
 [...]
 829.199631600s  INFO guest_emulation_device:  restore vtl2 complete success=true
 829.200333400s  INFO guest_emulation_device:  guest reported vtl0 started successfully
 829.308077400s  INFO openvmm_entry:  vtl2 servicing complete duration=8937
```
@benhillis
Copy link
Copy Markdown
Member

Backported to release/1.7.2511 in #2623

@benhillis benhillis added backported_1.7.2511 PR that has been backported to release/1.7.2511 and removed backport_1.7.2511 labels Jan 9, 2026
benhillis pushed a commit to benhillis/openvmm that referenced this pull request Jan 29, 2026
* release/1.7.2511:
  openvmm/pcie: Unify type 0 and type 1 save restore state into SavedState to make it backward compatible (microsoft#2603) (microsoft#2623)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backported_1.7.2511 PR that has been backported to release/1.7.2511

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants