Skip to content

net, stuntime, std: VM stuntime measurement during live migration#4056

Open
Anatw wants to merge 1 commit intoRedHatQE:mainfrom
Anatw:stuntime_std
Open

net, stuntime, std: VM stuntime measurement during live migration#4056
Anatw wants to merge 1 commit intoRedHatQE:mainfrom
Anatw:stuntime_std

Conversation

@Anatw
Copy link
Contributor

@Anatw Anatw commented Mar 3, 2026

Short description:

Introduce the STD for VM stuntime measurement during live migration on Linux bridge and OVN localnet secondary networks. These tests provide baseline measurements and a framework for regression detection.

Special notes for reviewer:

Baseline and Threshold:

  • Per-scenario threshold - setting the threshold to the minimum of (max_observed × 4) and 5 seconds — i.e. 4× the worst observed value for that scenario, capped at 5 seconds.
  • Thresholds will be hardcoded based on 10-run baselines per-scenario, on a BM cluster.
  • A per-scenario approach is used instead of a global threshold to prevent slow-path scenarios from masking regressions in faster ones.

Measurement Methodology:

  • Ping command: ICMP ping at 100ms intervals with UNIX timestamps (ping -D -O -i 0.1).
  • Stuntime calculation: Stuntime is the largest gap between any two consecutive successful replies in the ping log - i.e., the maximum time difference between timestamps of successive successful packets. The boundaries (last success before loss, first success after recovery) are the pair of packets that define that gap. Stuntime = (Timestamp of first success after) - (Timestamp of last success before)
  • Alternatives rejected: tcping (introducing a new dependency), iperf3 (unnecessary complexity), and curl (unnecessary complexity - requires an active web server inside the VM).

Both VMs start on the same node to align with the first scenario logic: migrating from the same node to a different node (static_to_different).

Assisted by: Cursor

jira-ticket:

https://issues.redhat.com/browse/CNV-78677

Summary by CodeRabbit

  • Tests
    • Added test infrastructure to measure stuntime during VM live migration on secondary networks, supporting Linux Bridge and OVN LocalNet configurations.

Introduce the STD for VM stuntime measurement during live migration
on Linux bridge and OVN localnet secondary networks. These tests
provide baseline measurements and a framework for regression detection.

Baseline and Threshold:
- Per-scenario threshold - setting the threshold to the minimum of
  (max_observed × 4) and 5 seconds — i.e. 4× the worst observed value
  for that scenario, capped at 5 seconds.
- Thresholds will be hardcoded based on 10-run baselines per-scenario,
  on a BM cluster.
- A per-scenario approach is used instead of a global threshold to
  prevent slow-path scenarios from masking regressions in faster ones.

Measurement Methodology:
- Ping command: ICMP ping at 100ms intervals with UNIX timestamps
  (`ping -D -O -i 0.1`).
- Stuntime calculation:  Stuntime is the largest gap between any two
  consecutive successful replies in the ping log - i.e., the maximum
  time difference between timestamps of successive successful packets.
  The boundaries (last success before loss, first success after
  recovery) are the pair of packets that define that gap.
  Stuntime = (Timestamp of first success after) - (Timestamp of last
  success before)
- Alternatives rejected: tcping (introducing a new dependency), iperf3
  (unnecessary complexity), and curl (unnecessary complexity - requires
  an active web server inside the VM).

Both VMs start on the same node to align with the first scenario logic:
migrating from the same node to a different node (static_to_different).

Signed-off-by: Anat Wax <awax@redhat.com>
Assisted by: Cursor
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 3, 2026

📝 Walkthrough

Walkthrough

A new test module is introduced with two test classes (TestStuntimeLinuxBridge and TestStuntimeOvnLocalnet) designed to measure stuntime during VM live migration on secondary networks. Both classes contain disabled test methods with docstring specifications and no executable test logic.

Changes

Cohort / File(s) Summary
Stuntime Measurement Test Module
tests/network/stuntime/test_stuntime_measurement.py
Adds two test classes with test_stuntime methods that are disabled from pytest collection via __test__ = False attribute assignment. Methods contain only docstrings describing test parameters, prerequisites, steps, and expected outcomes with no executable logic.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: introducing a standard test (STD) for measuring VM stuntime during live migration on secondary networks.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description check ✅ Passed PR description is comprehensive and follows the template with all required sections populated, including short description, special notes for reviewer, and Jira ticket reference.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-virtualization-qe-bot-3
Copy link
Contributor

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: Enabled categories: branch, can-be-merged, cherry-pick, has-conflicts, hold, needs-rebase, size, verified, wip

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest verify-bugs-are-open - verify-bugs-are-open
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 2 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No WIP, hold, conflict labels
  5. Verified: PR must be marked as verified (if verification is enabled)

📊 Review Process

Approvers and Reviewers

Approvers:

  • EdDev

Reviewers:

  • Anatw
  • EdDev
  • azhivovk
  • servolkov
  • yossisegev
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is automatically removed on each new commit
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/network/stuntime/test_stuntime_measurement.py`:
- Around line 14-75: Add the suggested tier classification by marking the
stuntime tests with a tier3 pytest marker: update the
TestStuntimeLinuxBridge.test_stuntime and TestStuntimeOvnLocalnet.test_stuntime
(or their containing classes) to include `@pytest.mark.tier3` in addition to
existing markers so the live-migration/timing-heavy tests can be opt-in; ensure
the marker name is exactly tier3 to match repository conventions and keep the
existing polarion and other markers intact.
- Around line 49-75: The test function test_stuntime needs three edits: add the
`@pytest.mark.ipv4` decorator above def test_stuntime to match the docstring
marker; change the unicode multiplication sign "×" in the Expected threshold
description to a plain "x" (so use "min(max x 4, 5s)" wording); and remove the
test_stuntime.__test__ = False line and instead annotate the function with
`@pytest.mark.skip`(reason="placeholder/test disabled - add valid Polarion ID to
enable") to skip until a real Polarion ID is provided; also replace the
placeholder Polarion ID "CNV-00000" with the correct ID before removing the
skip.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a945538 and 6caeefa.

📒 Files selected for processing (1)
  • tests/network/stuntime/test_stuntime_measurement.py

@openshift-virtualization-qe-bot

D/S test tox -e verify-tc-requirement-polarion failed: cnv-tests-tox-executor/26036

- ipv4

Parametrize:
- migration_path: [static_to_different, different_to_static, between_different]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I wasn't already familiar with the STP, I wouldn't understand what is said here.
Please add a short explanation so the path will be clear. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants