Increase VolumeSnapshot wait timeout from 240s to 6 minutes#3972
Increase VolumeSnapshot wait timeout from 240s to 6 minutes#3972Ahmad-Hafe wants to merge 1 commit intoRedHatQE:mainfrom
Conversation
54ed2cb to
3cffa8b
Compare
3cffa8b to
452034c
Compare
452034c to
f0f1ab4
Compare
|
Report bugs in Issues Welcome! 🎉This pull request will be automatically processed with the following features: 🔄 Automatic Actions
📋 Available CommandsPR Status Management
Review & Approval
Testing & Validation
Container Operations
Cherry-pick Operations
Label Management
✅ Merge RequirementsThis PR will be automatically approved when the following conditions are met:
📊 Review ProcessApprovers and ReviewersApprovers:
Reviewers:
Available Labels
💡 Tips
For more information, please refer to the project documentation or contact the maintainers. |
|
/build-and-push-container |
|
/lgtm |
📝 WalkthroughWalkthroughTwo distinct changes address test reliability and timeout configuration: removal of an expected-failure marker from a data import test, and introduction of an explicit six-minute timeout for volume snapshot operations. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@utilities/storage.py`:
- Line 1034: Replace the use of the timeout constant with an explicit numeric
value: change the call to volume_snapshot.wait(timeout=TIMEOUT_6MIN) so it
passes the literal integer 360 instead of TIMEOUT_6MIN; update the invocation of
volume_snapshot.wait to timeout=360 to follow the repo convention and keep the
rest of the call intact.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
tests/storage/test_data_import_cron.pyutilities/storage.py
💤 Files with no reviewable changes (1)
- tests/storage/test_data_import_cron.py
|
New container for quay.io/openshift-cnv/openshift-virtualization-tests:pr-3972 published |
|
/verified re-run the test 100 times consecutively using --repeat-scope=session --count=100 |
Test execution plan request detailsCRITICAL: You MUST post an inline review comment on the first changed line of the first file. As an expert software testing engineer, analyze all modified files in this PR and create a targeted test execution plan. Analysis Requirements:
Your deliverable: Test Execution Plan
Guidelines:
Hardware-Related Checks (SR-IOV, GPU, DPDK): When PR modifies fixtures for hardware-specific resources:
CRITICAL WORKFLOW COMPLETION RULES: When responding to this test execution plan request, you MUST follow these rules EXACTLY:
ABSOLUTE PROHIBITIONS (violating these creates empty/meaningless reviews):
Remember: The pull request review is visible to users. Additional comments are redundant noise. CRITICAL: Before setting "Run smoke tests: True", you MUST use shell scripts to: |
Ahmad-Hafe
left a comment
There was a problem hiding this comment.
we have already TIMEOUT_6MIN
in the file being used in 'wait_for_succeeded_dv' function
using TIMEOUT_6MIN is better idea
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 4175 🏁 Script executed: Length of output: 1813 🏁 Script executed: Length of output: 373 🏁 Script executed: Length of output: 1797 🧠 Learnings used |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3972 +/- ##
==========================================
+ Coverage 98.56% 98.59% +0.03%
==========================================
Files 25 25
Lines 2297 2357 +60
==========================================
+ Hits 2264 2324 +60
Misses 33 33
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| LOGGER.info(f"Wait for VolumeSnapshot '{name}' in '{namespace}' to be '{ready_to_use_status}'") | ||
| volume_snapshot = VolumeSnapshot(namespace=namespace, name=name) | ||
| volume_snapshot.wait() | ||
| volume_snapshot.wait(timeout=TIMEOUT_6MIN) |
There was a problem hiding this comment.
increasing the timeout by 2 minutes on a test which failed a month ago and the failure could not be reproduced is odd to me
@jpeimer @dalia-frank wdyt?
There was a problem hiding this comment.
@rnetser I agree
my first approach was to simply just unquarantined the test after hunders of tries to reproduce.
I added 2 min just because in the must-gather I saw the the PVC took it around 5 min to provisin, so I considered that and raised the time up to 6 min in total (+1 plus to volumesnapshot)
@jpeimer @dalia-frank
please tell wdyt?
|
/lgtm |
Short description:
Increase VolumeSnapshot wait timeout from 240s to 6 minutes
More details:
The wait_for_volume_snapshot_ready_to_use() function in uses the default 240-second timeout from This timeout is occasionally insufficient, there is scanrio where pvc took 5M to Provisioning
here is example from must-gther:
Normal Provisioning 4m55s openshift-storage.rbd.csi.ceph.com_openshift-storage.rbd.csi.ceph.com-ctrlplugin-8c7b79947-qlnlf_bc7ddaa7-a128-4f50-a36f-4d8473bb12dd External provisioner is provisioning volume for claim "test-data-import-cron/prime-bb044725-2764-40fd-b6d9-3c918131fefa"
Normal ProvisioningSucceeded 4m55s openshift-storage.rbd.csi.ceph.com_openshift-storage.rbd.csi.ceph.com-ctrlplugin-8c7b79947-qlnlf_bc7ddaa7-a128-4f50-a36f-4d8473bb12dd Successfully
What this PR does / why we need it:
Changes
volume_snapshot.wait()tovolume_snapshot.wait(timeout=TIMEOUT_6MIN)Fixes flaky test
test_data_import_cron_garbage_collectionwhich fails ~1-2 times in the last monthWhich issue(s) this PR fixes:
Fixes flaky test failure in
test_data_import_cron_garbage_collectionwhereTimeoutExpiredError: Timed Out: 240occurs during VolumeSnapshot creation.Special notes for reviewer:
please see the must-gather in jira card (attached)
jira-ticket:
https://issues.redhat.com/browse/CNV-75955
Summary by CodeRabbit
Bug Fixes
Tests