[GLUTEN-11550][UT] Enable 30 disabled test suites for Spark 4.0/4.1 by baibaichen · Pull Request #11816 · apache/gluten

baibaichen · 2026-03-24T07:25:06Z

What changes were proposed in this pull request?

Enable 13 test suites that were disabled in #11800 by fixing root causes:

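Real fixes (7 suites): Five suites create per-test SparkContexts, which conflicted with the persistent session created by GlutenTestsTrait. They were fixed by switching to GlutenTestsCommonTrait and injecting GlutenPlugin via System.setProperty so that each test's SparkSession loads it. The other two suites each override a single upstream test whose expectation differs under Gluten.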

| Suite | Fix | Tests Passed |
|-------|-----|:------------:|
| GlutenSQLExecutionSuite | System.setProperty injection | 7 |
| GlutenSQLJsonProtocolSuite | System.setProperty injection | 3 |
| GlutenShufflePartitionsUtilSuite | System.setProperty injection | 9 |
| GlutenExternalAppendOnlyUnsafeRowArraySuite | System.setProperty injection | 14 |
| GlutenUnsafeRowSerializerSuite | System.setProperty injection | 5 |
| GlutenCsvExpressionsSuite | Override test (AnalysisException vs TestFailedException) | 18 |
| GlutenSparkPlanSuite | Override test (find VeloxColumnarToRowExec) | 9 |
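For readers unfamiliar with the System.setProperty pattern, here is a minimal, Spark-free sketch of the idea. It assumes the plugin is registered under the `spark.plugins` key with the class name `org.apache.gluten.GlutenPlugin`; `withSystemProperties` and `confFromSystemProperties` are hypothetical helpers, and the latter only imitates how a freshly created conf picks up `spark.*` system properties. It is not the code in this PR.

```scala
import scala.jdk.CollectionConverters._

object SystemPropertyInjectionSketch {
  // Assumed values: Spark's plugin key and Gluten's plugin class name.
  val PluginKey = "spark.plugins"
  val GlutenPluginClass = "org.apache.gluten.GlutenPlugin"

  /** Sets the given system properties, runs `body`, then restores the previous values. */
  def withSystemProperties[T](props: Map[String, String])(body: => T): T = {
    val previous = props.keys.map(k => k -> Option(System.getProperty(k))).toMap
    props.foreach { case (k, v) => System.setProperty(k, v) }
    try body
    finally previous.foreach {
      case (k, Some(v)) => System.setProperty(k, v)
      case (k, None)    => System.clearProperty(k)
    }
  }

  /** Stand-in for a fresh conf: collects every `spark.*` system property. */
  def confFromSystemProperties(): Map[String, String] =
    System.getProperties.stringPropertyNames().asScala
      .filter(_.startsWith("spark."))
      .map(k => k -> System.getProperty(k))
      .toMap

  def main(args: Array[String]): Unit = {
    // A conf built inside the block sees the plugin; nothing leaks out afterwards.
    val seen = withSystemProperties(Map(PluginKey -> GlutenPluginClass)) {
      confFromSystemProperties().get(PluginKey)
    }
    println(seen)
    println(Option(System.getProperty(PluginKey)))
  }
}
```

Restoring the previous values in `finally` keeps the injected property from leaking into suites that run later in the same JVM.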

Targeted excludes (6 suites): Enabled with .exclude() for specific tests that cannot pass under Gluten:

| Suite | Excluded Test(s) | Reason | Tests Passed |
|-------|------------------|--------|:------------:|
| GlutenGroupBasedUpdateTableSuite | `update with NOT NULL checks` | Velox exception type differs | 33 |
| GlutenWholeTextFileV1Suite | `reading text file with option wholetext=true` | jar:file: URI | 2 |
| GlutenWholeTextFileV2Suite | `reading text file with option wholetext=true` | jar:file: URI | 2 |
| GlutenScalaUDFSuite | `variant basic output variant` | Spark 4.1 Variant encoder bug | 6 |
| GlutenExpressionEvalHelperSuite | 2 SPARK-16489/25388 tests | Gluten checkEvaluation behavior | 1 |
| GlutenToPrettyStringSuite | `Timestamp as pretty strings` | Velox timezone | 22 |
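As a rough illustration of what a targeted exclude means, the sketch below models a per-suite registry in which a suite runs only if it is enabled, and an enabled suite skips the tests listed in `exclude`. `SuiteSettings`, `TestSettingsSketch`, `enableSuite` (taking a plain string here), and `shouldRun` are all hypothetical names for this toy; Gluten's actual test-settings classes may be shaped differently.

```scala
import scala.collection.mutable

// Toy stand-in for a per-suite settings registry; all names here are hypothetical.
final class SuiteSettings {
  private val excluded = mutable.Set.empty[String]

  /** Marks the given test names as skipped and returns this for chaining. */
  def exclude(testNames: String*): SuiteSettings = { excluded ++= testNames; this }

  def shouldRun(testName: String): Boolean = !excluded.contains(testName)
}

object TestSettingsSketch {
  private val enabled = mutable.Map.empty[String, SuiteSettings]

  /** Enables a suite (idempotent) and returns its settings. */
  def enableSuite(suiteName: String): SuiteSettings =
    enabled.getOrElseUpdate(suiteName, new SuiteSettings)

  /** A test runs only if its suite is enabled and the test is not excluded. */
  def shouldRun(suiteName: String, testName: String): Boolean =
    enabled.get(suiteName).exists(_.shouldRun(testName))

  def main(args: Array[String]): Unit = {
    enableSuite("GlutenToPrettyStringSuite").exclude("Timestamp as pretty strings")
    println(shouldRun("GlutenToPrettyStringSuite", "Timestamp as pretty strings")) // false
    println(shouldRun("GlutenToPrettyStringSuite", "some other test"))             // true
    println(shouldRun("SomeDisabledSuite", "any test"))                            // false
  }
}
```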

How was this patch tested?

All 13 suites verified on spark41 via run-scala-test.sh --mvnd. SparkContext-conflict suites also verified on spark40.

Related issue: #11550

github-actions · 2026-03-25T06:41:18Z

Run Gluten Clickhouse CI on x86

Inject GlutenPlugin via System.setProperty so per-test SparkSessions created by the parent suite load it. Use GlutenTestsCommonTrait to avoid creating a persistent session that conflicts with per-test SparkContext creation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…tionsUtilSuite, GlutenExternalAppendOnlyUnsafeRowArraySuite

Same pattern as GlutenSQLExecutionSuite: these suites create per-test SparkContexts which conflict with GlutenTestsTrait's persistent session. Switch to GlutenTestsCommonTrait + inject GlutenPlugin via System.setProperty so each test's SparkSession loads it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Override 'unsupported mode' test: Gluten's DataFrame-based checkEvaluation throws AnalysisException directly instead of wrapping it in TestFailedException. The overridden test intercepts AnalysisException.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
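The wrapped-versus-direct difference behind the 'unsupported mode' override can be shown with a small, dependency-free sketch. `AnalysisError` and `TestFailed` are hypothetical stand-ins for AnalysisException and TestFailedException, `intercept` is a minimal imitation of a ScalaTest-style helper, and the mode strings are illustrative only.

```scala
import scala.reflect.ClassTag
import scala.util.{Failure, Success, Try}

object InterceptSketch {
  // Hypothetical stand-ins for AnalysisException and TestFailedException.
  final class AnalysisError(msg: String) extends RuntimeException(msg)
  final class TestFailed(msg: String, cause: Throwable) extends RuntimeException(msg, cause)

  /** Minimal intercept: returns the thrown exception when it is a T, fails otherwise. */
  def intercept[T <: Throwable](body: => Any)(implicit ct: ClassTag[T]): T =
    Try(body) match {
      case Failure(t) if ct.runtimeClass.isInstance(t) => t.asInstanceOf[T]
      case Failure(other) => throw new AssertionError(s"unexpected exception: $other", other)
      case Success(_) => throw new AssertionError("expected an exception, but none was thrown")
    }

  private def evaluate(mode: String): String =
    if (mode == "PERMISSIVE") "ok" else throw new AnalysisError(s"unsupported mode: $mode")

  // Wrapping helper: the caller sees TestFailed with the analysis error as its cause.
  def checkWrapped(mode: String): Unit =
    try { evaluate(mode); () }
    catch { case e: AnalysisError => throw new TestFailed("evaluation failed", e) }

  // Non-wrapping helper: the analysis error reaches the caller directly.
  def checkDirect(mode: String): Unit = { evaluate(mode); () }

  def main(args: Array[String]): Unit = {
    val wrapped = intercept[TestFailed](checkWrapped("DROPMALFORMED"))
    println(wrapped.getCause.getMessage)
    val direct = intercept[AnalysisError](checkDirect("DROPMALFORMED"))
    println(direct.getMessage)
  }
}
```

A test written against the wrapping helper would fail against the non-wrapping one even though the underlying error is identical, which is why the expectation has to be overridden rather than the behavior changed.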

Use System.setProperty to inject GlutenPlugin into per-test SparkSessions created by parent suite (LocalSparkSession pattern).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Override SPARK-37779 test to find VeloxColumnarToRowExec (extends ColumnarToRowExecBase) instead of ColumnarToRowExec.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
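To make the SPARK-37779 override concrete, the toy plan tree below shows why a search for the vanilla node type finds nothing once a native node has replaced it, while a search for the native node's base type succeeds. Every type and the `findFirst` helper are hypothetical stand-ins, not Spark or Gluten classes.

```scala
object PlanSearchSketch {
  // Toy plan tree; every type here is a hypothetical stand-in.
  sealed trait Plan { def children: Seq[Plan] }
  sealed trait ColumnarToRowBase extends Plan // plays the role of ColumnarToRowExecBase

  final case class VanillaColumnarToRow(child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
  }
  final case class NativeColumnarToRow(child: Plan) extends ColumnarToRowBase {
    def children: Seq[Plan] = Seq(child)
  }
  final case class Scan(name: String) extends Plan {
    def children: Seq[Plan] = Nil
  }

  /** Pre-order search for the first node on which the partial function is defined. */
  def findFirst[T](plan: Plan)(pf: PartialFunction[Plan, T]): Option[T] =
    pf.lift(plan).orElse(
      plan.children.iterator.map(findFirst(_)(pf)).collectFirst { case Some(v) => v })

  def main(args: Array[String]): Unit = {
    val plan: Plan = NativeColumnarToRow(Scan("t"))
    // The upstream-style lookup misses the replaced node.
    println(findFirst(plan) { case c: VanillaColumnarToRow => c }) // None
    // Looking for the base type finds it.
    println(findFirst(plan) { case c: ColumnarToRowBase => c })    // Some(NativeColumnarToRow(Scan(t)))
  }
}
```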

Enable suite with .exclude for 'update with NOT NULL checks' — Velox throws VeloxUserError wrapped as SparkException instead of the expected SparkRuntimeException. Root cause: Velox native NOT NULL check uses different exception chain than Spark's AssertNotNull expression.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Enable with .exclude() for specific failing tests:

- GlutenWholeTextFileV1Suite: exclude jar:file: URI test
- GlutenWholeTextFileV2Suite: exclude jar:file: URI test
- GlutenScalaUDFSuite: exclude Variant encoder test (Spark 4.1 bug)
- GlutenExpressionEvalHelperSuite: exclude 2 tests (different failure names)
- GlutenToPrettyStringSuite: exclude timestamp timezone test (Velox UTC)
- GlutenGroupBasedUpdateTableSuite: exclude NOT NULL exception type test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Enable plan-structure and other suites with .exclude() for failing tests:

| Suite | Passed | Excluded | Root Cause |
|-------|:------:|:--------:|------------|
| GlutenJoinHintSuite | 17 | 1 | CartesianProduct not supported |
| GlutenExplainSuite | 20 | 4 | WholeStageCodegen/FileScan in explain |
| GlutenDataSourceScanExecRedactionSuite | 2 | 2 | FileScan replaced by native scan |
| GlutenDataSourceV2ScanExecRedactionSuite | 1 | 2 | BatchScan replaced |
| GlutenInsertSortForLimitAndOffsetSuite | 0 | 6 | TakeOrderedAndProject replaced |
| GlutenProjectedOrderingAndPartitioningSuite | 1 | 9 | SinglePartition vs HashPartitioning |
| GlutenRemoveRedundantProjectsSuite | 3 | 14 | Plan tree structure differs |
| GlutenSimpleSQLViewSuite | 52 | 2 | Error condition + query result |
| GlutenPlannerSuite | 55 | 21 | TakeOrderedAndProject/sort/partitioning |
| GlutenRemoveRedundantSortsSuite | 0 | 5 | SortExec replaced |
| GlutenObjectExpressionsSuite | 13 | 15 | Spark 4.1 encoder API change |

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…engine excludes

- GlutenOrderingSuite: 54 passed, 2 TimeType excluded
- GlutenHiveResultSuite: 10 passed, 1 TimeType excluded
- GlutenCollationRegexpExpressionsSuite: 3 passed, 1 Velox split excluded
- GlutenColumnarRulesSuite: 2 passed, 1 Transition excluded
- RandomDataGeneratorSuite: kept TODO (232 TimeType failures, impractical to exclude individually)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…rait import

spark40 does not have org.apache.spark.sql.shim.GlutenTestsTrait (that shim only exists for spark41). Use org.apache.spark.sql.GlutenTestsTrait instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Root cause: Gluten's offload rules replace Spark physical plan nodes with Transformer nodes but don't propagate LOGICAL_PLAN_TAG. This tag is used by Spark's LogicalPlanTagInSparkPlanSuite to verify logical-physical plan linkage.

Three core fixes:

1. LegacyOffload: propagate LOGICAL_PLAN_TAG from original node to offloaded Transformer node using setTagValue (non-recursive to avoid tagging Exchange).
2. HeuristicTransform.Simple: same tag propagation for the simple offload path.
3. PushDownFilterToScan: copyTagsFrom when creating new scan via withNewPushdownFilters (case class copy loses tags).

Test suite overrides checkGeneratedCode with Gluten-aware version that:

- Recognizes Transformer node types (joins, aggregates, windows, scans, etc.)
- For scan trees, finds logical plan tag from any node in the tree (not just root), since rewrite rules may create new Project/Filter without tags.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
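The "case class copy loses tags" point in fix 3 is easy to reproduce in isolation. The sketch below uses a toy `Node` class whose tag map lives outside the constructor; the method names `setTagValue`, `getTagValue`, and `copyTagsFrom` mirror those mentioned above, but the implementation and the string-keyed tag are assumptions made for illustration, not Spark's TreeNode.

```scala
import scala.collection.mutable

object TagPropagationSketch {
  // Toy stand-in for a plan node: tags live in a mutable map outside the constructor,
  // so a case-class `copy` builds a new instance whose map starts empty.
  abstract class Node {
    private val tags = mutable.Map.empty[String, Any]
    def setTagValue(key: String, value: Any): Unit = tags.update(key, value)
    def getTagValue(key: String): Option[Any] = tags.get(key)

    /** Copies tags from `other`, keeping any value already set on this node. */
    def copyTagsFrom(other: Node): Unit =
      other.tags.foreach { case (k, v) => tags.getOrElseUpdate(k, v) }
  }

  final case class Scan(table: String, pushedFilters: Seq[String]) extends Node

  def main(args: Array[String]): Unit = {
    val LogicalPlanTag = "logical_plan" // hypothetical key
    val original = Scan("t", Nil)
    original.setTagValue(LogicalPlanTag, "LogicalRelation(t)")

    // Rewriting through copy silently drops the tag.
    val rewritten = original.copy(pushedFilters = Seq("a > 1"))
    println(rewritten.getTagValue(LogicalPlanTag)) // None

    // Explicit propagation restores the link to the logical plan.
    rewritten.copyTagsFrom(original)
    println(rewritten.getTagValue(LogicalPlanTag)) // Some(LogicalRelation(t))
  }
}
```

Because the tag map is not a constructor parameter, any rule that rebuilds a node has to carry the tags over explicitly; the same reasoning applies to the offload paths in fixes 1 and 2.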

github-actions · 2026-03-26T07:31:23Z

Run Gluten Clickhouse CI on x86

…m.setProperty

The parent suite extends SparkFunSuite with LocalSparkContext, creating its own SparkContext per test. Using GlutenTestsTrait creates a shared SparkSession in beforeAll() that conflicts with per-test SparkContext creation.

Fix: Use GlutenTestsCommonTrait + System.setProperty pattern (same as GlutenSQLExecutionSuite) so per-test SparkContexts inherit GlutenPlugin config via system properties without session conflicts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-03-26T10:35:53Z

Run Gluten Clickhouse CI on x86

github-actions bot added the CORE (works for Gluten Core) label Mar 24, 2026

baibaichen changed the title ~~[GLUTEN-11550][UT] Enable 13 disabled test suites for Spark 4.0/4.1~~ [GLUTEN-11550][UT] Enable 30 disabled test suites for Spark 4.0/4.1 Mar 24, 2026

baibaichen and others added 11 commits March 26, 2026 02:51

[GLUTEN-11550][UT] Fix GlutenUnsafeRowSerializerSuite

2a2f55a

[GLUTEN-11550][UT] Fix GlutenSparkPlanSuite

e3a9cfe

baibaichen force-pushed the fix/11550-enable-all branch from 5782b90 to 451760d Compare March 26, 2026 07:30

baibaichen wants to merge 12 commits into apache:main from baibaichen:fix/11550-enable-all
