Skip to content

Enhance wait_for_vm_interfaces to detect and log missing interfaces#4113

Open
geetikakay wants to merge 1 commit into
RedHatQE:mainfrom
geetikakay:interface_checks
Open

Enhance wait_for_vm_interfaces to detect and log missing interfaces#4113
geetikakay wants to merge 1 commit into
RedHatQE:mainfrom
geetikakay:interface_checks

Conversation

@geetikakay
Copy link
Copy Markdown
Contributor

@geetikakay geetikakay commented Mar 9, 2026

Short description:

Previously the function compared interface counts, making it hard to tell which interface failed to come up after migration.

More details:

with this change , it's easy to identify which interface was missing. Missing interfaces can happen due to hotplug failures , missing nic maybe during migrations.

What this PR does / why we need it:
Which issue(s) this PR fixes:
Special notes for reviewer:
jira-ticket:

Summary by CodeRabbit

  • Bug Fixes
    • Improved VM interface validation to accurately match expected interface names against reported interfaces. Enhanced error handling with detailed logging of interface name discrepancies.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 9, 2026

📝 Walkthrough

Walkthrough

wait_for_vm_interfaces in utilities/virt.py now collects expected interface names from non-absent spec interfaces and enforces exact name-based matching against reported interfaces, adding TimeoutExpiredError handling that logs mismatches and returning True only on exact match.

Changes

Cohort / File(s) Summary
Interface validation refinement
utilities/virt.py
Reworked wait_for_vm_interfaces to derive expected interface names from spec (excluding absent), require every reported interface to include interfaceName, enforce exact set equality between expected and reported names, add explicit TimeoutExpiredError logging of discrepancies, and return False when matching fails or after timeout.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Description check ❓ Inconclusive The description covers key sections but is incomplete. Short description and more details explain the problem and solution, but 'What this PR does / why we need it', 'Which issue(s) this PR fixes', 'Special notes for reviewer', and 'jira-ticket' sections are all empty. Complete the empty sections: populate 'What this PR does / why we need it' with explicit functionality details, fill 'Which issue(s) this PR fixes' with issue references, add reviewer notes if applicable, and provide jira-ticket URL or write 'NONE' if untracked.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: enhancing wait_for_vm_interfaces to detect and log missing interfaces, which matches the core functionality change in the PR.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-virtualization-qe-bot-4
Copy link
Copy Markdown

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: Enabled categories: branch, can-be-merged, cherry-pick, has-conflicts, hold, needs-rebase, size, verified, wip

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest verify-bugs-are-open - verify-bugs-are-open
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 2 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No WIP, hold, conflict labels
  5. Verified: PR must be marked as verified (if verification is enabled)

📊 Review Process

Approvers and Reviewers

Approvers:

  • dshchedr
  • myakove
  • rnetser
  • vsibirsk

Reviewers:

  • RoniKishner
  • dshchedr
  • geetikakay
  • rnetser
  • vsibirsk
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is automatically removed on each new commit
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@utilities/virt.py`:
- Around line 148-151: The current check uses all(iface.get("interfaceName") for
iface in interfaces) which requires every reported interface to have
interfaceName; narrow this to only verify interfaceName presence for interfaces
that are expected by changing the predicate to only iterate interfaces whose
"name" is in expected_names (e.g., all(iface.get("interfaceName") for iface in
interfaces if iface.get("name") in expected_names)), keeping the reported_names
== expected_names check and using the existing variables/interfaces to locate
the logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6bf9287e-ffdc-4746-bc74-53305c9421d2

📥 Commits

Reviewing files that changed from the base of the PR and between aa242b3 and 929f98d.

📒 Files selected for processing (1)
  • utilities/virt.py

Comment thread utilities/virt.py
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.60%. Comparing base (84584c3) to head (9c9c6e3).
⚠️ Report is 147 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4113      +/-   ##
==========================================
+ Coverage   98.56%   98.60%   +0.04%     
==========================================
  Files          25       25              
  Lines        2297     2371      +74     
==========================================
+ Hits         2264     2338      +74     
  Misses         33       33              
Flag Coverage Δ
utilities 98.60% <ø> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai[bot]
coderabbitai Bot previously approved these changes Mar 9, 2026
Comment thread utilities/virt.py
Comment thread utilities/virt.py
@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4041.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #3847.

Overlapping files

utilities/virt.py

Comment thread utilities/virt.py Outdated
Comment thread utilities/virt.py
)
LOGGER.info(f"Wait for {vmi.name} network interfaces")
expected_names = {
iface.name for iface in vmi.instance.spec.domain.devices.interfaces if iface.get("state") != "absent"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you ever seen the different in number of interfaces between vmi.instance.spec.domain.devices.interfaces and .status.interfaces`?

wrapping to try/except and print clear error message is a good initiative, but may be it would be enough to keep old logic and print interfaces and active_interfaces

Copy link
Copy Markdown
Contributor Author

@geetikakay geetikakay Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done these tests with this PR:

Scenario Spec Interfaces Status Interfaces expected_names Result Log Output old logic
All interfaces up default, nic-1, nic-2 default(enp1s0), nic-1(enp2s0), nic-2(enp3s0) {'default','nic-1','nic-2'} works  
Hot-plug nic-3 default, nic-1, nic-2, nic-3 default(enp1s0), nic-1(enp2s0), nic-2(enp3s0), nic-3(enp4s0) {'default','nic-1','nic-2','nic-3'} works  
Hot-unplug nic-3 (state: absent) default, nic-1, nic-2, nic-3(absent) default(enp1s0), nic-1(enp2s0), nic-2(enp3s0) {'default','nic-1','nic-2'} works  
nic-test missing after migration default, nic-1, nic-2, nic-test default(enp1s0), nic-1(enp2s0), nic-2(enp3s0) {'default','nic-1','nic-2','nic-test'} timeout Missing (not reported by guest agent): {'nic-test'} silently returns True (3 == 3)
  • scenario 4 tracks it,, missing interface.. old logic would silently pass it.

  • by comparing reported_names == expected_names we now detecting when a spec interface doesn't show up in status and report

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scenario 4 is interesting, but the guest-agent output already checked by if interface.get("interfaceName")]

We have observed some flaky issue with guest-agent interfaces after migration, and it was caught by existing automation:

12:29:12  2026-03-03T19:29:12.058304 timeout_sampler INFO Waiting for 720 seconds [0:12:00], retry every 1 seconds. (Function: utilities.virt.<locals>.lambda: vmi.instance)
12:29:12  2026-03-03T19:29:12.058163 utilities.virt INFO Wait for first-vm-for-aaq-test-1772566081-4850352 network interfaces
12:41:18  FAILED

and in collected VMI status we don't have guest-agent as a infoSource and don't have "interfaceName":

  interfaces:
  - infoSource: domain
    ipAddress: 10.128.0.99
    ipAddresses:
    - 10.128.0.99
    - fd02:0:0:1::63
    linkState: up
    mac: 02:7e:eb:3c:67:42
    name: default
    podInterfaceName: eth0
    queueCount: 1

I agree - adding something meaninfgul like echo interfaces and active_interfaces to error message can be very useful, but may be the existing logic works fine and you are hunting for the same flaky issue..

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at https://github.com/kubevirt/kubevirt/blob/main/tests/libnet/hotplug.go#L51-L54 and for dynamic changes it looked for vmi.spec.
this code change introduces similar spec comparison approach as VerifyDynamicInterfaceChange which will make checks more reliable.from the testing i did this change behaves as expected in all scenarios

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against of updating logic, just trying to understand if we really need it and how safe is it.

You are updating the global function which is used in almost all tests in the repo.
However, from my understanding, we can get absent state only when we hot-unplug the interface. If we have any "absent" interface in all other regular scenarios it would sound as a bug which your changes potentially could hide.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dshchedr thanks for reviewing this, I really appreciate it. These are very valid questions and I agree we should be thoughtful about the impact since this helper is used across many scenarios. It makes sense to have clear discussion around the behavior change
I looked in code about state: absent ,based on upstream code validation check it can happen only on non default interfaces and never on default interface. admission webhooks checks and rejects for default.
there is a test coverage in that area as well .

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sharing the links I came across while digging into this concept in case they are useful for discussion and can be referred later if needed

Signed-off-by: Geetika Kapoor <gkapoor@redhat.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@utilities/virt.py`:
- Line 150: The condition using reported_names == expected_names in the
interface-checking logic (the if that also inspects interfaces and
interface.get("interfaceName")) enforces an exact-set match and will fail when
the guest reports extra interfaces; decide whether you want subset behavior and
if so replace the equality check with expected_names <= reported_names,
otherwise make the exact-match semantics explicit by adding a short comment near
reported_names and expected_names explaining that extra reported interfaces are
treated as a mismatch on purpose; update the condition or comment in the
function containing the interfaces check accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7f4d8786-6af5-4a83-be6c-71c554227058

📥 Commits

Reviewing files that changed from the base of the PR and between 929f98d and 9c9c6e3.

📒 Files selected for processing (1)
  • utilities/virt.py

Comment thread utilities/virt.py
@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #3755.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4159.

Overlapping files

utilities/virt.py

Comment thread utilities/virt.py
if len(active_interfaces) == len(interfaces):
return True
reported_names = set()
try:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4404.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4463.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4567.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4746.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4756.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4804.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4788.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4934.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #4916.

Overlapping files

utilities/virt.py

@openshift-virtualization-qe-bot-3
Copy link
Copy Markdown
Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #5103.

Overlapping files

utilities/virt.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants