Increase timeout for skipped_ut job and add first slow test case. #3252
chuanqi129 merged 2 commits into main
Pull request overview
This PR introduces a dedicated CI lane to run extremely slow XPU unit tests (e.g., >10 minutes) separately from the default op_ut run, so normal PR pipelines aren’t blocked by long-running cases.
Changes:
- Refactors `test/xpu/skip_list_common.py` to separate “tests to launch” (`launch_list`) from “tests to skip” (`just_skip_dict`), while keeping `skip_dict` in the legacy shape for compatibility.
- Adds `slow_dict` + launcher support (`--test-cases slow`) to run only very slow test cases in a dedicated job while excluding them from default selected runs.
- Updates CI workflows/actions to add an `op_ut_long` scope and include it in the nightly scheduled UT matrix with an increased pytest timeout.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `test/xpu/skip_list_common.py` | Introduces `launch_list`, `just_skip_dict`, and `slow_dict`, and builds compatibility `skip_dict` + a combined exclusion dict. |
| `test/xpu/run_test_with_skip.py` | Adds `slow` selection mode and updates `selected` behavior to exclude slow tests via the combined exclusion dict. |
| `.github/workflows/nightly_ondemand.yml` | Adds `op_ut_long` to the scheduled UT matrix. |
| `.github/workflows/_linux_ut.yml` | Documents `op_ut_long` as a valid UT scope input. |
| `.github/scripts/ut_result_check.sh` | Adds `op_ut_long` to the suite dispatcher / available suites list. |
| `.github/actions/linux-uttest/action.yml` | Adds an `op_ut_long` step that runs `--test-cases slow` and increases pytest timeout for that step. |
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Hello @mengfei25, @RUIJIEZHONG66166,
Signed-off-by: Benedykt Bela <benedykt.bela@intel.com>
718ede4 to 6098aa8
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
mengfei25 left a comment
LGTM. Please launch a test to check whether your changes work.
I confirm that in the
I also confirm that in this PR's pull workflow, the test case
/merge
✅ PR has been successfully merged by @BBBela.
Fix for: #3106
Overview
This PR increases the timeout for the `skipped_ut` job. It also adds the first very long test case to the `skip_dict`. This way it is not run in every `pull` workflow, but is launched in every `Nightly` workflow with an increased timeout.
Problem description
The problem was that the test case `op_ut,third_party.torch-xpu-ops.test.xpu.test_decomp_xpu.TestDecompXPU,test_quick_core_backward_baddbmm_xpu_float64` was failing in CI because the timeout mechanism stopped the worker before the test case succeeded. The timeout for this test case was set to 10 minutes, while the test needs much longer than that (about 24 minutes) to succeed.
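The failure mode described above can be reproduced in miniature. The sketch below is not the CI's actual timeout mechanism: it uses Python's standard `subprocess` timeout as a stand-in, the `run_with_timeout` helper is hypothetical, and the durations are scaled down from minutes to seconds so that it runs quickly.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; report "timeout" if it is killed before it finishes."""
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, much like a
        # CI timeout stopping a worker mid-test.
        return "timeout"
    return "passed" if result.returncode == 0 else "failed"


# Stand-in for the slow test: sleeps 1 s (think "about 24 minutes").
slow_test = [sys.executable, "-c", "import time; time.sleep(1.0)"]

short_budget = run_with_timeout(slow_test, timeout_s=0.3)  # think "10 minutes"
long_budget = run_with_timeout(slow_test, timeout_s=30.0)  # think "60 minutes"
print(short_budget, long_budget)
```

With the short budget the test is reported as timed out even though nothing is wrong with it, which is why raising the budget above the test's real runtime is enough to make it pass.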
Changes
- Increased the timeout for the `skipped_ut` job from 10 minutes to 60 minutes.