Migrate PhysicalExprAdapter to unified CastExpr and remove CastColumnExpr usage#21493
Open
kosiew wants to merge 5 commits intoapache:mainfrom
Open
Migrate PhysicalExprAdapter to unified CastExpr and remove CastColumnExpr usage#21493kosiew wants to merge 5 commits intoapache:mainfrom
kosiew wants to merge 5 commits intoapache:mainfrom
Conversation
Update adapter rewriter to emit field-aware CastExpr instead of CastColumnExpr in schema_rewriter.rs. Rename helper from create_cast_column_expr to create_cast_expr. Adjust adapter tests to validate unified CastExpr behavior and maintain target_field() metadata. Modify one test to check nullability via return_field() rather than the old wrapper’s nullability approach.
Change create_cast_expr to accept physical DataType instead of FieldRef to align with the unified CastExpr adapter. Replace the outdated helper-only regression test with test_rewrite_resolves_physical_column_by_name_before_casting. This new test verifies that the name-based resolution still correctly addresses stale column indices prior to building the cast expression.
Inline cast helper in rewrite_column and streamline the column/field matching logic for clarity. Simplify the construction of results in resolve_physical_column. Reduce test duplication with local helpers for CastExpr and inner Column assertions. Utilize a helper adapter factory to reuse stale-index test setup. Simplify metadata/nullability test by asserting the return_field() contract, and replace complex string-based expectations with precise structural assertions.
Simplify the control flow in the rewrite_column fast path with straightforward if checks around fields_match. Replace make_stale_index_cast_adapter() with a reusable stale_index_cast_schemas() fixture, allowing tests to choose between schemas or an adapter as needed.
- Refactored the allocation of `physical_field` for better clarity. - Enhanced the readability of the metadata assertions in tests for improved code understanding.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
The current adapter emits
CastColumnExpr, duplicating functionality already provided byCastExpr. Maintaining two cast representations introduces unnecessary complexity, branching logic, and potential inconsistencies in behavior depending on where casts are constructed.With recent improvements to
CastExpr(field-aware casting), it is now capable of preserving logical field metadata, nullability, and type semantics. This enables the adapter to emit a single, unified cast representation.This change simplifies the expression layer, reduces maintenance overhead, and ensures consistent casting behavior across the execution pipeline.
What changes are included in this PR?
Replace all usages of
CastColumnExprinschema_rewriter.rswithCastExpr.Remove the
create_cast_column_exprhelper and inline its logic usingCastExpr::new_with_target_field.Add validation via
validate_data_type_compatibilitybefore constructing cast expressions.Improve rewrite logic:
Ensure physical column resolution is based on column name rather than index.
Update tests to:
CastExprinstead ofCastColumnExpr.return_field.Add helper assertions for validating cast expressions in tests.
Are these changes tested?
Yes.
Existing adapter and schema evolution tests have been updated to use
CastExpr.New assertions validate:
CastExprwhen required.Regression coverage added for stale column index scenarios.
Are there any user-facing changes?
No direct user-facing changes.
This is an internal refactor that unifies cast expression handling. However, it improves consistency and correctness of schema evolution and expression rewriting, which may indirectly benefit users.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.