Increase timeout for skipped_ut job and add first slow test case. #3252

Merged
chuanqi129 merged 2 commits into main from bbela/introduce-slow-tests on Apr 15, 2026

Conversation

Contributor

@BBBela BBBela commented Apr 1, 2026

Fix for: #3106

Overview

This PR increases the timeout for the skipped_ut job.
It also adds the first very long-running test case to skip_dict. As a result, the test no longer runs in every pull workflow, but it is launched in every Nightly workflow with the increased timeout.

Problem description

The test case
op_ut,third_party.torch-xpu-ops.test.xpu.test_decomp_xpu.TestDecompXPU,test_quick_core_backward_baddbmm_xpu_float64
was failing in CI because the timeout mechanism stopped the worker before the test could finish. The timeout for this test case was set to 10 minutes, while the test needs much longer than that (about 24 minutes) to succeed.
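The failure mode described here can be reproduced in miniature. The following is a minimal sketch, not the CI's actual timeout mechanism (which this thread does not show): it assumes a worker modeled as a child process and uses scaled-down durations in the same 24 : 10 : 60 ratio. The function name `run_with_timeout` is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(work_seconds: float, timeout_seconds: float) -> str:
    """Run a child process that sleeps for work_seconds under a hard time limit."""
    cmd = [sys.executable, "-c", f"import time; time.sleep({work_seconds})"]
    try:
        subprocess.run(cmd, check=True, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # The child is killed before it finishes, so the work never completes.
        return "timeout"
    return "passed"


if __name__ == "__main__":
    # Scaled-down stand-ins: 1.2 s of work under a 0.5 s and a 3.0 s limit.
    print(run_with_timeout(work_seconds=1.2, timeout_seconds=0.5))
    print(run_with_timeout(work_seconds=1.2, timeout_seconds=3.0))
```

With the short limit the child is stopped before it can succeed; with the longer limit the same work completes, which is the effect the timeout increase is meant to have.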

Changes

  1. Increase the timeout for the skipped_ut job from 10 minutes to 60 minutes.
  2. Add the first discovered very long-running test case to skip_dict.
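The two changes can be illustrated with a small sketch. It assumes that skip_dict maps a test file name to a tuple of test case names, that the pull workflow deselects those cases, and that the skipped_ut job selects only them; the real layout of skip_dict and the real command line are not shown in this thread. The helper names are hypothetical, and `--timeout` assumes the pytest-timeout plugin is in use.

```python
# Assumed layout (not copied from the repository): test file -> names of
# test cases that the pull workflow must not run.
skip_dict = {
    "test_decomp_xpu.py": (
        "test_quick_core_backward_baddbmm_xpu_float64",
    ),
}

# 60 minutes, the new skipped_ut limit named in the change list.
SKIPPED_UT_TIMEOUT_SECONDS = 60 * 60


def pull_filter(test_file: str) -> str:
    """Build a pytest -k expression that deselects every listed case."""
    cases = skip_dict.get(test_file) or ()
    return " and ".join(f"not {name}" for name in cases)


def skipped_ut_filter(test_file: str) -> str:
    """Build a pytest -k expression that selects only the listed cases."""
    cases = skip_dict.get(test_file) or ()
    return " or ".join(cases)


def skipped_ut_command(test_file: str) -> list[str]:
    """Assemble a hypothetical nightly command line for one test file."""
    return [
        "pytest",
        test_file,
        "-k",
        skipped_ut_filter(test_file),
        # --timeout comes from the pytest-timeout plugin (assumed to be installed).
        f"--timeout={SKIPPED_UT_TIMEOUT_SECONDS}",
    ]
```

Keeping both filters derived from one dictionary means a case added to skip_dict is excluded from the pull run and picked up by the nightly run in a single edit.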

@BBBela BBBela added and removed labels on Apr 1, 2026:
  • added disable_all (Disable all ci test jobs for the PR, just keep basic lint check)
  • added disable_e2e (Disable all e2e test jobs for the PR)
  • added disable_distributed (Disable distributed UT test jobs for the PR)
  • added disable_accelerate (Disable accelerate test job in PR CI testing)
  • added disable_transformers (Disable transformers UT test in PR CI)
  • added disable_build (Disable source code build for CI test, use nightly wheel)
  • removed disable_all (Disable all ci test jobs for the PR, just keep basic lint check)
@BBBela BBBela changed the title from "[WIP] Introduce op_ut_slow job for very slow tests." to "Introduce op_ut_slow job for very slow tests." on Apr 1, 2026
@BBBela BBBela marked this pull request as ready for review April 1, 2026 14:17
Copilot AI review requested due to automatic review settings April 1, 2026 14:17
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a dedicated CI lane to run extremely slow XPU unit tests (e.g., >10 minutes) separately from the default op_ut run, so normal PR pipelines aren’t blocked by long-running cases.

Changes:

  • Refactors test/xpu/skip_list_common.py to separate “tests to launch” (launch_list) from “tests to skip” (just_skip_dict), while keeping skip_dict in the legacy shape for compatibility.
  • Adds slow_dict + launcher support (--test-cases slow) to run only very slow test cases in a dedicated job while excluding them from default selected runs.
  • Updates CI workflows/actions to add an op_ut_long scope and include it in the nightly scheduled UT matrix with an increased pytest timeout.
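The split described in this review summary can be sketched as follows. This is only an illustration of the idea, based on the names in the summary (launch_list, just_skip_dict, slow_dict, skip_dict); the file names other than test_decomp_xpu.py, the skipped case name, the `exclude_dict` name, and the assumed legacy shape (file name mapped to a tuple of cases, or None when nothing is skipped) are hypothetical and not taken from the repository.

```python
# Test files to launch (hypothetical entries).
launch_list = ["test_decomp_xpu.py", "test_example_xpu.py"]

# Cases that are skipped outright (hypothetical entry).
just_skip_dict = {
    "test_example_xpu.py": ("test_hypothetical_known_failure",),
}

# Cases that are too slow for the default run.
slow_dict = {
    "test_decomp_xpu.py": ("test_quick_core_backward_baddbmm_xpu_float64",),
}

# Compatibility view in the assumed legacy shape: every launched file appears,
# mapped to its skipped cases or to None when it has none.
skip_dict = {name: just_skip_dict.get(name) for name in launch_list}


def merge_exclusions(*dicts):
    """Combine several file -> cases mappings into one exclusion mapping."""
    merged = {}
    for mapping in dicts:
        for test_file, cases in mapping.items():
            merged[test_file] = merged.get(test_file, ()) + tuple(cases or ())
    return merged


# Everything the default run must leave out: skipped cases plus slow cases.
exclude_dict = merge_exclusions(just_skip_dict, slow_dict)
```

Deriving skip_dict from the two new structures keeps existing consumers working while the slow cases live in their own mapping.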

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
| File | Description |
| --- | --- |
| test/xpu/skip_list_common.py | Introduces launch_list, just_skip_dict, and slow_dict, and builds compatibility skip_dict + a combined exclusion dict. |
| test/xpu/run_test_with_skip.py | Adds slow selection mode and updates selected behavior to exclude slow tests via the combined exclusion dict. |
| .github/workflows/nightly_ondemand.yml | Adds op_ut_long to the scheduled UT matrix. |
| .github/workflows/_linux_ut.yml | Documents op_ut_long as a valid UT scope input. |
| .github/scripts/ut_result_check.sh | Adds op_ut_long to the suite dispatcher / available suites list. |
| .github/actions/linux-uttest/action.yml | Adds an op_ut_long step that runs --test-cases slow and increases pytest timeout for that step. |
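The launcher behavior summarized for run_test_with_skip.py can be sketched like this. It is a hedged illustration, not the script's real code: only the `--test-cases` flag and the `selected` and `slow` modes come from the review summary, while the function names, the default mode, the skipped case name, and the set-based bookkeeping are assumptions.

```python
import argparse

# Cases reserved for the slow run.
SLOW_CASES = {"test_quick_core_backward_baddbmm_xpu_float64"}
# Cases skipped in every run (hypothetical entry).
SKIPPED_CASES = {"test_hypothetical_known_failure"}


def parse_args(argv=None):
    """Parse the launcher's command line; the choices mirror the two modes named in the review."""
    parser = argparse.ArgumentParser(description="Sketch of a test launcher")
    parser.add_argument(
        "--test-cases",
        choices=["selected", "slow"],
        default="selected",
        help="which group of test cases to run",
    )
    return parser.parse_args(argv)


def select_cases(mode, discovered):
    """Return the discovered cases that the given mode should run, in order."""
    if mode == "slow":
        # The slow run launches only the slow cases.
        return [case for case in discovered if case in SLOW_CASES]
    # The selected run leaves out both skipped and slow cases.
    excluded = SLOW_CASES | SKIPPED_CASES
    return [case for case in discovered if case not in excluded]
```

For example, `select_cases(parse_args(["--test-cases", "slow"]).test_cases, discovered)` would keep only the slow cases from `discovered`, while the default mode drops them.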

Comment thread .github/scripts/ut_result_check.sh
Comment thread test/xpu/run_test_with_skip.py
Comment thread test/xpu/skip_list_common.py Outdated
Comment thread .github/workflows/nightly_ondemand.yml
Comment thread .github/actions/linux-uttest/action.yml Outdated
@BBBela BBBela changed the title from "Introduce op_ut_slow job for very slow tests." to "Introduce op_ut_long job for very slow tests." on Apr 1, 2026
@BBBela BBBela added the disable_all label (Disable all ci test jobs for the PR, just keep basic lint check) on Apr 1, 2026
@BBBela BBBela requested a review from Copilot April 1, 2026 15:17
@BBBela BBBela changed the title from "Introduce op_ut_long job for very slow tests." to "Introduce op_ut_slow job for very slow tests." on Apr 1, 2026
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.


Comment thread test/xpu/run_test_with_skip.py Outdated
Comment thread test/xpu/skip_list_common.py Outdated
Comment thread .github/workflows/_linux_ut.yml
Copilot AI review requested due to automatic review settings April 1, 2026 15:41
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.


Comment thread test/xpu/run_test_with_skip.py Outdated
Comment thread .github/actions/linux-uttest/action.yml Outdated
@BBBela BBBela removed the disable_all label (Disable all ci test jobs for the PR, just keep basic lint check) on Apr 1, 2026
@BBBela BBBela removed the disable_build label (Disable source code build for CI test, use nightly wheel) on Apr 2, 2026
Contributor Author

BBBela commented Apr 7, 2026

Hello @mengfei25, @RUIJIEZHONG66166,
could you please review this PR when you have time? Thanks!
This is just a proposal to resolve the problem described in the linked issue #3106.

Signed-off-by: Benedykt Bela <benedykt.bela@intel.com>
@BBBela BBBela force-pushed the bbela/introduce-slow-tests branch from 718ede4 to 6098aa8 on April 13, 2026 08:01
Copilot AI review requested due to automatic review settings April 13, 2026 08:01
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


Comment thread test/xpu/skip_list_common.py
Comment thread .github/actions/linux-uttest/action.yml
@BBBela BBBela changed the title from "Introduce op_ut_slow job for very slow tests." to "Increase timeout for skipped_ut job and add first slow test case." on Apr 13, 2026
Contributor

@mengfei25 mengfei25 left a comment

LGTM. Please launch a test run to check whether your changes work.

Comment thread .github/actions/linux-uttest/action.yml
Contributor Author

BBBela commented Apr 14, 2026

I confirm that in the On-demand workflow run for this PR, the test case in question is launched and passes:
https://github.com/intel/torch-xpu-ops/actions/runs/24334802146/job/71235318294#step:4:87

Contributor Author

BBBela commented Apr 15, 2026

I also confirm that in this PR's pull workflow, the test case
test_quick_core_backward_baddbmm_xpu_float64
is correctly skipped and not run.
To sum up: the behavior is correct and as expected. The added test case is skipped in the pull workflow but launched in the Nightly skipped_ut job with the increased timeout, and it passes.

Contributor Author

BBBela commented Apr 15, 2026

/merge

@chuanqi129 chuanqi129 merged commit fe9cfc0 into main Apr 15, 2026
37 of 38 checks passed
@chuanqi129
Contributor

✅ PR has been successfully merged by @BBBela.

@chuanqi129 chuanqi129 deleted the bbela/introduce-slow-tests branch April 15, 2026 07:37

Labels

  • disable_accelerate: Disable accelerate test job in PR CI testing
  • disable_distributed: Disable distributed UT test jobs for the PR
  • disable_e2e: Disable all e2e test jobs for the PR
  • disable_transformers: Disable transformers UT test in PR CI

Development

Successfully merging this pull request may close these issues.

Worker crashes when running TestDecompXPU,test_quick_core_backward_baddbmm_xpu_float64 in CI.

5 participants