Enhance wait_for_vm_interfaces to detect and log missing interfaces#4113
Enhance wait_for_vm_interfaces to detect and log missing interfaces#4113geetikakay wants to merge 1 commit into
Conversation
📝 WalkthroughWalkthroughwait_for_vm_interfaces in utilities/virt.py now collects expected interface names from non-absent spec interfaces and enforces exact name-based matching against reported interfaces, adding TimeoutExpiredError handling that logs mismatches and returning True only on exact match. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Report bugs in Issues Welcome! 🎉This pull request will be automatically processed with the following features: 🔄 Automatic Actions
📋 Available CommandsPR Status Management
Review & Approval
Testing & Validation
Container Operations
Cherry-pick Operations
Label Management
✅ Merge RequirementsThis PR will be automatically approved when the following conditions are met:
📊 Review ProcessApprovers and ReviewersApprovers:
Reviewers:
Available Labels
💡 Tips
For more information, please refer to the project documentation or contact the maintainers. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@utilities/virt.py`:
- Around line 148-151: The current check uses all(iface.get("interfaceName") for
iface in interfaces) which requires every reported interface to have
interfaceName; narrow this to only verify interfaceName presence for interfaces
that are expected by changing the predicate to only iterate interfaces whose
"name" is in expected_names (e.g., all(iface.get("interfaceName") for iface in
interfaces if iface.get("name") in expected_names)), keeping the reported_names
== expected_names check and using the existing variables/interfaces to locate
the logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6bf9287e-ffdc-4746-bc74-53305c9421d2
📒 Files selected for processing (1)
utilities/virt.py
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4113 +/- ##
==========================================
+ Coverage 98.56% 98.60% +0.04%
==========================================
Files 25 25
Lines 2297 2371 +74
==========================================
+ Hits 2264 2338 +74
Misses 33 33
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4041. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #3847. Overlapping filesutilities/virt.py |
| ) | ||
| LOGGER.info(f"Wait for {vmi.name} network interfaces") | ||
| expected_names = { | ||
| iface.name for iface in vmi.instance.spec.domain.devices.interfaces if iface.get("state") != "absent" |
There was a problem hiding this comment.
have you ever seen the different in number of interfaces between vmi.instance.spec.domain.devices.interfaces and .status.interfaces`?
wrapping to try/except and print clear error message is a good initiative, but may be it would be enough to keep old logic and print interfaces and active_interfaces
There was a problem hiding this comment.
I have done these tests with this PR:
| Scenario | Spec Interfaces | Status Interfaces | expected_names | Result | Log Output | old logic |
|---|---|---|---|---|---|---|
| All interfaces up | default, nic-1, nic-2 | default(enp1s0), nic-1(enp2s0), nic-2(enp3s0) | {'default','nic-1','nic-2'} | works | ||
| Hot-plug nic-3 | default, nic-1, nic-2, nic-3 | default(enp1s0), nic-1(enp2s0), nic-2(enp3s0), nic-3(enp4s0) | {'default','nic-1','nic-2','nic-3'} | works | ||
| Hot-unplug nic-3 (state: absent) | default, nic-1, nic-2, nic-3(absent) | default(enp1s0), nic-1(enp2s0), nic-2(enp3s0) | {'default','nic-1','nic-2'} | works | ||
| nic-test missing after migration | default, nic-1, nic-2, nic-test | default(enp1s0), nic-1(enp2s0), nic-2(enp3s0) | {'default','nic-1','nic-2','nic-test'} | timeout | Missing (not reported by guest agent): {'nic-test'} | silently returns True (3 == 3) |
-
scenario 4 tracks it,, missing interface.. old logic would silently pass it.
-
by comparing reported_names == expected_names we now detecting when a spec interface doesn't show up in status and report
There was a problem hiding this comment.
scenario 4 is interesting, but the guest-agent output already checked by if interface.get("interfaceName")]
We have observed some flaky issue with guest-agent interfaces after migration, and it was caught by existing automation:
12:29:12 2026-03-03T19:29:12.058304 timeout_sampler INFO Waiting for 720 seconds [0:12:00], retry every 1 seconds. (Function: utilities.virt.<locals>.lambda: vmi.instance)
12:29:12 2026-03-03T19:29:12.058163 utilities.virt INFO Wait for first-vm-for-aaq-test-1772566081-4850352 network interfaces
12:41:18 FAILED
and in collected VMI status we don't have guest-agent as a infoSource and don't have "interfaceName":
interfaces:
- infoSource: domain
ipAddress: 10.128.0.99
ipAddresses:
- 10.128.0.99
- fd02:0:0:1::63
linkState: up
mac: 02:7e:eb:3c:67:42
name: default
podInterfaceName: eth0
queueCount: 1
I agree - adding something meaninfgul like echo interfaces and active_interfaces to error message can be very useful, but may be the existing logic works fine and you are hunting for the same flaky issue..
There was a problem hiding this comment.
I was looking at https://github.com/kubevirt/kubevirt/blob/main/tests/libnet/hotplug.go#L51-L54 and for dynamic changes it looked for vmi.spec.
this code change introduces similar spec comparison approach as VerifyDynamicInterfaceChange which will make checks more reliable.from the testing i did this change behaves as expected in all scenarios
There was a problem hiding this comment.
I'm not against of updating logic, just trying to understand if we really need it and how safe is it.
You are updating the global function which is used in almost all tests in the repo.
However, from my understanding, we can get absent state only when we hot-unplug the interface. If we have any "absent" interface in all other regular scenarios it would sound as a bug which your changes potentially could hide.
There was a problem hiding this comment.
@dshchedr thanks for reviewing this, I really appreciate it. These are very valid questions and I agree we should be thoughtful about the impact since this helper is used across many scenarios. It makes sense to have clear discussion around the behavior change
I looked in code about state: absent ,based on upstream code validation check it can happen only on non default interfaces and never on default interface. admission webhooks checks and rejects for default.
there is a test coverage in that area as well .
There was a problem hiding this comment.
sharing the links I came across while digging into this concept in case they are useful for discussion and can be referred later if needed
Signed-off-by: Geetika Kapoor <gkapoor@redhat.com>
929f98d to
9c9c6e3
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@utilities/virt.py`:
- Line 150: The condition using reported_names == expected_names in the
interface-checking logic (the if that also inspects interfaces and
interface.get("interfaceName")) enforces an exact-set match and will fail when
the guest reports extra interfaces; decide whether you want subset behavior and
if so replace the equality check with expected_names <= reported_names,
otherwise make the exact-match semantics explicit by adding a short comment near
reported_names and expected_names explaining that extra reported interfaces are
treated as a mismatch on purpose; update the condition or comment in the
function containing the interfaces check accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7f4d8786-6af5-4a83-be6c-71c554227058
📒 Files selected for processing (1)
utilities/virt.py
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #3755. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4159. Overlapping filesutilities/virt.py |
| if len(active_interfaces) == len(interfaces): | ||
| return True | ||
| reported_names = set() | ||
| try: |
There was a problem hiding this comment.
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4404. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4463. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4567. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4746. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4756. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4804. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4788. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4934. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #4916. Overlapping filesutilities/virt.py |
|
/retest all Auto-triggered: Files in this PR were modified by merged PR #5103. Overlapping filesutilities/virt.py |
Short description:
Previously the function compared interface counts, making it hard to tell which interface failed to come up after migration.
More details:
with this change , it's easy to identify which interface was missing. Missing interfaces can happen due to hotplug failures , missing nic maybe during migrations.
What this PR does / why we need it:
Which issue(s) this PR fixes:
Special notes for reviewer:
jira-ticket:
Summary by CodeRabbit