fix: remove unnecessary IgnoreCometNativeDataFusion tags from 3.5.8 diff #3831
Open
Remove IgnoreCometNativeDataFusion from tests that pass with native_datafusion scan in the 3.5.8 Spark SQL test diff. Also fix ExtractPythonUDFsSuite to match CometNativeScanExec in plan checks, and update DPP test issue reference from #3313 to #3442 for consistency with other diffs. Tests that still need the tag (bucketed read/scan suites) are kept as they require helper method updates to support CometNativeScanExec.
Add CometNativeScanExec to plan node pattern matches in BucketedReadSuite and DisableUnnecessaryBucketedScanSuite helper methods, allowing all #3319 tests to pass with native_datafusion scan.
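The helper-method change described above amounts to adding one more case to a plan-node pattern match. A minimal sketch of that idea follows, assuming simplified stand-in types: the `PlanNode` trait, the `bucketed` field, and the `BucketScanCount` object are hypothetical and exist only for illustration. The real `FileSourceScanExec` and `CometNativeScanExec` classes live in Spark and Comet and have different constructors, and the actual helpers in the diff may be written differently.

```scala
// Stand-in plan node types (hypothetical). They only mimic the shape of a
// physical plan tree so the pattern match can be shown in isolation.
sealed trait PlanNode { def children: Seq[PlanNode] }

case class FileSourceScanExec(bucketed: Boolean) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}

case class CometNativeScanExec(bucketed: Boolean) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}

case class ProjectExec(child: PlanNode) extends PlanNode {
  val children: Seq[PlanNode] = Seq(child)
}

object BucketScanCount {
  // Pre-order traversal that keeps every node the partial function accepts.
  def collect[T](plan: PlanNode)(pf: PartialFunction[PlanNode, T]): Seq[T] =
    pf.lift(plan).toSeq ++ plan.children.flatMap(c => collect(c)(pf))

  // Counts bucketed scans. Without the second case, a plan that uses the
  // native scan node would be counted as having zero bucketed scans.
  def numBucketedScans(plan: PlanNode): Int =
    collect(plan) {
      case s: FileSourceScanExec if s.bucketed  => s
      case s: CometNativeScanExec if s.bucketed => s
    }.size
}
```

With these stand-ins, `BucketScanCount.numBucketedScans(ProjectExec(CometNativeScanExec(bucketed = true)))` counts the native scan node, which a match on `FileSourceScanExec` alone would miss.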
Which issue does this PR close?
Closes #3312, closes #3313, closes #3314, closes #3315, closes #3319, closes #3320, closes #3401.
Rationale for this change
Several tests in the 3.5.8 Spark SQL test diff were tagged with `IgnoreCometNativeDataFusion` but actually pass when run with `COMET_PARQUET_SCAN_IMPL=native_datafusion`. These tags were only present in the 3.5.8 diff and not in the 3.4.3 or 4.0.1 diffs. In some cases the tests just needed their plan node pattern matches updated to also handle `CometNativeScanExec`.
What changes are included in this PR?
- Removed `IgnoreCometNativeDataFusion` from tests that pass as-is (verified with `COMET_PARQUET_SCAN_IMPL=native_datafusion`):
  - `ColumnExpressionSuite`: `input_file_name, input_file_block_start, input_file_block_length - FileScanRDD` (#3312)
  - `UDFSuite`: `SPARK-8005 input_file_name` (#3312)
  - `HiveUDFSuite`: `SPARK-11522 select input_file_name from non-parquet table` (#3312)
  - `ExplainSuite`: `explain formatted - check presence of subquery in case of DPP` (#3313)
  - `SQLViewSuite`: `alter temporary view should follow current storeAnalyzedPlanForView config` (#3314)
  - `FileDataSourceV2FallBackSuite`: `Fallback Parquet V2 to V1` (#3315)
  - `StreamingQuerySuite`: `SPARK-41198` and `SPARK-41199` (#3315)
  - `ParquetFilterSuite`: `SPARK-31026` and `Filters should be pushed down for Parquet readers at row group level` (#3320)
  - `StreamingSelfUnionSuite`: 2 self-union DSv1 tests (#3401)
- Fixed `ExtractPythonUDFsSuite` (#3312) to match `CometNativeScanExec` in plan node pattern matches for column pruning and filter pushdown checks.
- Fixed `BucketedReadSuite` (#3319) by adding `CometNativeScanExec` to `getFileScan()`, `getBucketScan()`, and the coalesced bucket scan pattern match.
- Fixed `DisableUnnecessaryBucketedScanSuite` (#3319) by adding `CometNativeScanExec` to `checkNumBucketedScan()`.
- Updated `DynamicPartitionPruningSuite` issue reference from #3313 to #3442 for consistency with the 3.4.3 and 4.0.1 diffs.
How are these changes tested?
Each test was run individually with `ENABLE_COMET=true ENABLE_COMET_ONHEAP=true COMET_PARQUET_SCAN_IMPL=native_datafusion` against Apache Spark 3.5.8 with the updated diff applied. All tests passed.