[GLUTEN-11550][UT] Enable 30 disabled test suites for Spark 4.0/4.1 #11816

Open
baibaichen wants to merge 12 commits into apache:main from baibaichen:fix/11550-enable-all

@baibaichen (Contributor) commented Mar 24, 2026

What changes were proposed in this pull request?

Enable 13 test suites that were disabled in #11800 by fixing root causes:

Real fixes (7 suites): Five suites create a SparkContext per test, which conflicted with the persistent session created by GlutenTestsTrait. They now use GlutenTestsCommonTrait and inject GlutenPlugin via System.setProperty so that each test's SparkSession loads it. The remaining two suites each override a single test.

| Suite | Fix | Tests Passed |
|-------|-----|:------------:|
| GlutenSQLExecutionSuite | System.setProperty injection | 7 |
| GlutenSQLJsonProtocolSuite | System.setProperty injection | 3 |
| GlutenShufflePartitionsUtilSuite | System.setProperty injection | 9 |
| GlutenExternalAppendOnlyUnsafeRowArraySuite | System.setProperty injection | 14 |
| GlutenUnsafeRowSerializerSuite | System.setProperty injection | 5 |
| GlutenCsvExpressionsSuite | Override test (AnalysisException vs TestFailedException) | 18 |
| GlutenSparkPlanSuite | Override test (find VeloxColumnarToRowExec) | 9 |
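
The System.setProperty injection works because a `SparkConf` built with defaults picks up JVM system properties whose names start with `spark.`. Below is a minimal sketch of a set-then-restore helper, assuming the plugin is registered under the `spark.plugins` key with the class name `org.apache.gluten.GlutenPlugin`. The helper name `withSystemProperty` and both constants are illustrative assumptions, not code taken from this PR.

```scala
object PluginPropertyInjection {
  // Assumed values for illustration; the PR description does not spell them out.
  val PluginKey = "spark.plugins"
  val PluginClass = "org.apache.gluten.GlutenPlugin"

  /** Sets a JVM system property while `body` runs, then restores the previous state. */
  def withSystemProperty[T](key: String, value: String)(body: => T): T = {
    val previous = Option(System.getProperty(key))
    System.setProperty(key, value)
    try {
      body
    } finally {
      previous match {
        case Some(old) => System.setProperty(key, old) // put the old value back
        case None => System.clearProperty(key)         // key was unset before
      }
    }
  }
}
```

A suite could wrap its per-test session creation in `withSystemProperty(PluginKey, PluginClass) { ... }`. Restoring the property afterwards keeps the setting from leaking into unrelated suites that share the JVM.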

Targeted excludes (6 suites): Enabled with .exclude() for specific tests that cannot pass under Gluten:

| Suite | Excluded Test(s) | Reason | Tests Passed |
|-------|------------------|--------|:------------:|
| GlutenGroupBasedUpdateTableSuite | update with NOT NULL checks | Velox exception type differs | 33 |
| GlutenWholeTextFileV1Suite | reading text file with option wholetext=true | jar:file: URI | 2 |
| GlutenWholeTextFileV2Suite | reading text file with option wholetext=true | jar:file: URI | 2 |
| GlutenScalaUDFSuite | variant basic output variant | Spark 4.1 Variant encoder bug | 6 |
| GlutenExpressionEvalHelperSuite | 2 SPARK-16489/25388 tests | Gluten checkEvaluation behavior | 1 |
| GlutenToPrettyStringSuite | Timestamp as pretty strings | Velox timezone | 22 |
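
To illustrate the enable-then-exclude idea, here is a small self-contained model of a settings registry. The classes `SuiteSettings` and `TestSettings` are hypothetical stand-ins written for this sketch and are not Gluten's actual test-settings classes. Only the suite and test names come from the table above.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a per-suite settings entry; not Gluten's actual class.
final class SuiteSettings(val suiteName: String) {
  private val excluded = mutable.Set.empty[String]

  // Records test names to skip and returns this entry so calls can be chained.
  def exclude(testNames: String*): SuiteSettings = {
    excluded ++= testNames
    this
  }

  def shouldRun(testName: String): Boolean = !excluded.contains(testName)
}

// Hypothetical registry: a test runs only if its suite is enabled and the test is not excluded.
final class TestSettings {
  private val enabled = mutable.Map.empty[String, SuiteSettings]

  def enableSuite(suiteName: String): SuiteSettings =
    enabled.getOrElseUpdate(suiteName, new SuiteSettings(suiteName))

  def shouldRun(suiteName: String, testName: String): Boolean =
    enabled.get(suiteName).exists(_.shouldRun(testName))
}

object ExcludeDemo {
  def build(): TestSettings = {
    val settings = new TestSettings
    settings.enableSuite("GlutenToPrettyStringSuite").exclude("Timestamp as pretty strings")
    settings.enableSuite("GlutenGroupBasedUpdateTableSuite").exclude("update with NOT NULL checks")
    settings
  }
}
```

Excluding by test name keeps the rest of each suite running, so only the tests that cannot pass are skipped.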

How was this patch tested?

All 13 suites verified on spark41 via run-scala-test.sh --mvnd. SparkContext-conflict suites also verified on spark40.

Related issue: #11550

@github-actions github-actions bot added the CORE works for Gluten Core label Mar 24, 2026
@baibaichen baibaichen changed the title from "[GLUTEN-11550][UT] Enable 13 disabled test suites for Spark 4.0/4.1" to "[GLUTEN-11550][UT] Enable 30 disabled test suites for Spark 4.0/4.1" on Mar 24, 2026
@github-actions

Run Gluten Clickhouse CI on x86

baibaichen and others added 11 commits March 26, 2026 02:51
Inject GlutenPlugin via System.setProperty so per-test SparkSessions
created by the parent suite load it. Use GlutenTestsCommonTrait to
avoid creating a persistent session that conflicts with per-test
SparkContext creation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tionsUtilSuite, GlutenExternalAppendOnlyUnsafeRowArraySuite

Same pattern as GlutenSQLExecutionSuite: these suites create per-test
SparkContexts which conflict with GlutenTestsTrait's persistent session.
Switch to GlutenTestsCommonTrait + inject GlutenPlugin via System.setProperty
so each test's SparkSession loads it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Override 'unsupported mode' test: Gluten's DataFrame-based checkEvaluation
throws AnalysisException directly instead of wrapping it in
TestFailedException. The overridden test intercepts AnalysisException.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use System.setProperty to inject GlutenPlugin into per-test SparkSessions
created by parent suite (LocalSparkSession pattern).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Override SPARK-37779 test to find VeloxColumnarToRowExec (extends
ColumnarToRowExecBase) instead of ColumnarToRowExec.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
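
The SPARK-37779 override above searches for a node by its base type because the plan contains a different class than the one the original test looks for. The toy plan tree below sketches why the original search finds nothing. All class names here (`PlanNode`, `VanillaColumnarToRow`, `NativeColumnarToRowBase`, `BackendColumnarToRow`) are hypothetical stand-ins, not Spark or Gluten classes.

```scala
// Minimal plan-tree model with a pre-order `collect`, loosely modeled on a tree-node API.
trait PlanNode {
  def children: Seq[PlanNode]

  def collect[T](pf: PartialFunction[PlanNode, T]): Seq[T] =
    pf.lift(this).toList ++ children.flatMap(_.collect(pf))
}

final case class Scan(table: String) extends PlanNode {
  def children: Seq[PlanNode] = Nil
}

// Stand-in for the row-transition node the original test searches for.
final case class VanillaColumnarToRow(child: PlanNode) extends PlanNode {
  def children: Seq[PlanNode] = Seq(child)
}

// Stand-in for a backend-neutral base class that native transitions extend.
abstract class NativeColumnarToRowBase extends PlanNode

final case class BackendColumnarToRow(child: PlanNode) extends NativeColumnarToRowBase {
  def children: Seq[PlanNode] = Seq(child)
}

object FindTransitionDemo {
  val offloadedPlan: PlanNode = BackendColumnarToRow(Scan("t"))

  // The original-style search misses the native node entirely.
  val vanillaHits: Seq[PlanNode] = offloadedPlan.collect { case n: VanillaColumnarToRow => n }

  // Matching on the base type finds it regardless of the concrete backend class.
  val baseHits: Seq[PlanNode] = offloadedPlan.collect { case n: NativeColumnarToRowBase => n }
}
```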
Enable suite with .exclude for 'update with NOT NULL checks': Velox
throws VeloxUserError wrapped as SparkException instead of the expected
SparkRuntimeException. Root cause: Velox's native NOT NULL check uses a
different exception chain than Spark's AssertNotNull expression.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
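
A test that expects a specific exception type fails when that type appears only deeper in the cause chain, or not at all. Here is a minimal sketch of how such a chain could be inspected. `NativeUserError` and `WrapperException` are hypothetical stand-ins for the native error and its wrapper, and `CauseChain` is a helper invented for this example.

```scala
import scala.annotation.tailrec

// Hypothetical stand-ins for a native error and the exception that wraps it.
final class NativeUserError(message: String) extends RuntimeException(message)
final class WrapperException(message: String, cause: Throwable) extends Exception(message, cause)

object CauseChain {
  /** Lists the exception and its causes, outermost first; stops at null or on a cycle. */
  def of(error: Throwable): List[Throwable] = {
    @tailrec
    def loop(current: Throwable, seen: List[Throwable]): List[Throwable] =
      if (current == null || seen.exists(_ eq current)) seen.reverse
      else loop(current.getCause, current :: seen)
    loop(error, Nil)
  }

  /** True when any link in the chain is an instance of `expected`. */
  def contains(error: Throwable, expected: Class[_ <: Throwable]): Boolean =
    of(error).exists(t => expected.isInstance(t))
}
```

A check on the outermost type alone sees only the wrapper, which is the kind of mismatch the excluded test runs into.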
Enable with .exclude() for specific failing tests:
- GlutenWholeTextFileV1Suite: exclude jar:file: URI test
- GlutenWholeTextFileV2Suite: exclude jar:file: URI test
- GlutenScalaUDFSuite: exclude Variant encoder test (Spark 4.1 bug)
- GlutenExpressionEvalHelperSuite: exclude 2 tests (different failure names)
- GlutenToPrettyStringSuite: exclude timestamp timezone test (Velox UTC)
- GlutenGroupBasedUpdateTableSuite: exclude NOT NULL exception type test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Enable plan-structure and other suites with .exclude() for failing tests:

| Suite | Passed | Excluded | Root Cause |
|-------|:------:|:--------:|------------|
| GlutenJoinHintSuite | 17 | 1 | CartesianProduct not supported |
| GlutenExplainSuite | 20 | 4 | WholeStageCodegen/FileScan in explain |
| GlutenDataSourceScanExecRedactionSuite | 2 | 2 | FileScan replaced by native scan |
| GlutenDataSourceV2ScanExecRedactionSuite | 1 | 2 | BatchScan replaced |
| GlutenInsertSortForLimitAndOffsetSuite | 0 | 6 | TakeOrderedAndProject replaced |
| GlutenProjectedOrderingAndPartitioningSuite | 1 | 9 | SinglePartition vs HashPartitioning |
| GlutenRemoveRedundantProjectsSuite | 3 | 14 | Plan tree structure differs |
| GlutenSimpleSQLViewSuite | 52 | 2 | Error condition + query result |
| GlutenPlannerSuite | 55 | 21 | TakeOrderedAndProject/sort/partitioning |
| GlutenRemoveRedundantSortsSuite | 0 | 5 | SortExec replaced |
| GlutenObjectExpressionsSuite | 13 | 15 | Spark 4.1 encoder API change |

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…engine excludes

- GlutenOrderingSuite: 54 passed, 2 TimeType excluded
- GlutenHiveResultSuite: 10 passed, 1 TimeType excluded
- GlutenCollationRegexpExpressionsSuite: 3 passed, 1 Velox split excluded
- GlutenColumnarRulesSuite: 2 passed, 1 Transition excluded
- RandomDataGeneratorSuite: kept TODO (232 TimeType failures, impractical to exclude individually)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rait import

spark40 does not have org.apache.spark.sql.shim.GlutenTestsTrait (that shim
only exists for spark41). Use org.apache.spark.sql.GlutenTestsTrait instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Root cause: Gluten's offload rules replace Spark physical plan nodes with
Transformer nodes but don't propagate LOGICAL_PLAN_TAG. This tag is used by
Spark's LogicalPlanTagInSparkPlanSuite to verify logical-physical plan linkage.

Three core fixes:
1. LegacyOffload: propagate LOGICAL_PLAN_TAG from original node to offloaded
   Transformer node using setTagValue (non-recursive to avoid tagging Exchange).
2. HeuristicTransform.Simple: same tag propagation for the simple offload path.
3. PushDownFilterToScan: copyTagsFrom when creating new scan via
   withNewPushdownFilters (case class copy loses tags).

Test suite overrides checkGeneratedCode with Gluten-aware version that:
- Recognizes Transformer node types (joins, aggregates, windows, scans, etc.)
- For scan trees, finds logical plan tag from any node in the tree (not just
  root), since rewrite rules may create new Project/Filter without tags.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
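
The remark that a case class copy loses tags can be reproduced in miniature. In the sketch below, tags live in a mutable map outside the constructor parameters, so `copy` creates a node with an empty map unless the tags are carried over explicitly. `Tagged`, `ScanNode`, and `withNewFilters` are hypothetical names for this illustration. Only `copyTagsFrom` mirrors a method named in the commit message, and it is reimplemented here rather than taken from Spark.

```scala
import scala.collection.mutable

// Hypothetical miniature of a node with side-channel tags; not Spark's TreeNode.
trait Tagged {
  private val tags = mutable.Map.empty[String, Any]

  def setTag(key: String, value: Any): Unit = {
    tags(key) = value
  }

  def getTag(key: String): Option[Any] = tags.get(key)

  // Copies every tag from `other` onto this node.
  def copyTagsFrom(other: Tagged): Unit = {
    tags ++= other.tags
  }
}

final case class ScanNode(table: String, filters: Seq[String]) extends Tagged {
  // `copy` builds a new instance whose tag map starts empty, so tags are re-attached here.
  def withNewFilters(newFilters: Seq[String]): ScanNode = {
    val updated = copy(filters = newFilters)
    updated.copyTagsFrom(this)
    updated
  }
}

object TagCopyDemo {
  val original: ScanNode = ScanNode("t", Nil)
  original.setTag("logical_plan", "Relation(t)")

  val plainCopy: ScanNode = original.copy(filters = Seq("a > 1"))  // tag is lost
  val taggedCopy: ScanNode = original.withNewFilters(Seq("a > 1")) // tag is kept
}
```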
@baibaichen baibaichen force-pushed the fix/11550-enable-all branch from 5782b90 to 451760d Compare March 26, 2026 07:30
…m.setProperty

The parent suite extends SparkFunSuite with LocalSparkContext, creating its own
SparkContext per test. Using GlutenTestsTrait creates a shared SparkSession in
beforeAll() that conflicts with per-test SparkContext creation.

Fix: Use GlutenTestsCommonTrait + System.setProperty pattern (same as
GlutenSQLExecutionSuite) so per-test SparkContexts inherit GlutenPlugin config
via system properties without session conflicts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>