…#21460)

## Which issue does this PR close?

- Closes #21459

## Rationale for this change

When a `ProjectionExec` sits on top of a `FilterExec` that already carries an explicit projection, the `ProjectionPushdown` optimizer attempts to swap them via `try_swapping_with_projection`. The swap replaces the `FilterExec`'s input with the narrower `ProjectionExec`, but `FilterExecBuilder::from(self)` carried over the old projection indices (e.g. `[0, 1, 2]`). After the swap, the new input only has the columns selected by the `ProjectionExec` (e.g. 2 columns), so `.build()` tries to validate the stale projection against the narrower schema and panics with "project index 2 out of bounds, max field 2".

## What changes are included in this PR?

In `FilterExec::try_swapping_with_projection`, after replacing the input with the narrower `ProjectionExec`, clear the `FilterExec`'s own projection via `.apply_projection(None)`. The `ProjectionExec` that is now the input already handles column selection, so the `FilterExec` no longer needs its own projection.

## Are these changes tested?

Yes, a test case is added.

## Are there any user-facing changes?
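The failure mode described in the rationale for this change can be reproduced in a few lines. The following is a minimal sketch, not DataFusion code: `FilterSketch`, `validate`, and `swap_with_narrower_input` are hypothetical names that model only the input width and the optional projection, and the error text is illustrative rather than the real panic message.

```rust
/// Hypothetical stand-in for a filter node: it records only how many columns
/// its input produces and an optional projection over those columns.
#[derive(Debug, Clone, PartialEq)]
struct FilterSketch {
    input_width: usize,
    projection: Option<Vec<usize>>,
}

/// Reject any projection index that does not exist in the input.
fn validate(input_width: usize, projection: &Option<Vec<usize>>) -> Result<(), String> {
    if let Some(indices) = projection {
        for &i in indices {
            if i >= input_width {
                return Err(format!(
                    "index {i} out of bounds for {input_width} input columns"
                ));
            }
        }
    }
    Ok(())
}

/// Rebuild the filter on top of a narrower input.
/// `clear_projection = false` models the old behaviour (stale indices kept);
/// `clear_projection = true` models the fix (the filter's own projection is dropped).
fn swap_with_narrower_input(
    filter: &FilterSketch,
    new_input_width: usize,
    clear_projection: bool,
) -> Result<FilterSketch, String> {
    let projection = if clear_projection {
        None
    } else {
        filter.projection.clone()
    };
    validate(new_input_width, &projection)?;
    Ok(FilterSketch {
        input_width: new_input_width,
        projection,
    })
}
```

With a filter that projects `[0, 1, 2]` over three columns, rebuilding on a two-column input fails when the projection is carried over and succeeds once it is cleared, which mirrors the effect of `.apply_projection(None)` described above.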
## Which issue does this PR close?

- Closes #21316.

## Rationale for this change

`GROUPING SETS` with duplicate grouping lists were incorrectly collapsed during execution. The internal grouping id only encoded the semantic null mask, so repeated grouping sets shared the same execution key and were merged, which caused rows to be lost compared with PostgreSQL behavior.

For example, with:

```sql
create table duplicate_grouping_sets(deptno int, job varchar, sal int, comm int);
insert into duplicate_grouping_sets values
  (10, 'CLERK', 1300, null),
  (20, 'MANAGER', 3000, null);

select deptno, job, sal, sum(comm), grouping(deptno), grouping(job), grouping(sal)
from duplicate_grouping_sets
group by grouping sets ((deptno, job), (deptno, sal), (deptno, job))
order by deptno, job, sal, grouping(deptno), grouping(job), grouping(sal);
```

PostgreSQL preserves the duplicate grouping set and returns:

```text
 deptno |   job   | sal  | sum | grouping | grouping | grouping
--------+---------+------+-----+----------+----------+----------
     10 | CLERK   |      |     |        0 |        0 |        1
     10 | CLERK   |      |     |        0 |        0 |        1
     10 |         | 1300 |     |        0 |        1 |        0
     20 | MANAGER |      |     |        0 |        0 |        1
     20 | MANAGER |      |     |        0 |        0 |        1
     20 |         | 3000 |     |        0 |        1 |        0
(6 rows)
```

Before this fix, DataFusion collapsed the duplicate `(deptno, job)` grouping set and returned only 4 rows for the same query shape.
```text
+--------+---------+------+-----------------------------------+------------------------------------------+---------------------------------------+---------------------------------------+
| deptno | job     | sal  | sum(duplicate_grouping_sets.comm) | grouping(duplicate_grouping_sets.deptno) | grouping(duplicate_grouping_sets.job) | grouping(duplicate_grouping_sets.sal) |
+--------+---------+------+-----------------------------------+------------------------------------------+---------------------------------------+---------------------------------------+
| 10     | CLERK   | NULL | NULL                              | 0                                        | 0                                     | 1                                     |
| 10     | NULL    | 1300 | NULL                              | 0                                        | 1                                     | 0                                     |
| 20     | MANAGER | NULL | NULL                              | 0                                        | 0                                     | 1                                     |
| 20     | NULL    | 3000 | NULL                              | 0                                        | 1                                     | 0                                     |
+--------+---------+------+-----------------------------------+------------------------------------------+---------------------------------------+---------------------------------------+
```

## What changes are included in this PR?

- Preserve duplicate grouping sets by packing a duplicate ordinal into the high bits of `__grouping_id`, so repeated occurrences of the same grouping set pattern produce distinct execution keys.
- `GROUPING()` now reads the actual `__grouping_id` column type directly from the schema (via `Aggregate::grouping_id_type`) rather than inferring bit width from the count of grouping expressions alone. This ensures bitmask literals are correctly sized when duplicate-ordinal bits widen the column type beyond what the expression count would imply.
- `GROUPING()` masks off the ordinal bits before returning the result, so the duplicate-ordinal encoding is invisible to user-facing SQL and semantics remain unchanged.
- Add regression coverage for the duplicate `GROUPING SETS` case in:
  - `datafusion/core/tests/sql/aggregates/basic.rs`
  - `datafusion/sqllogictest/test_files/group_by.slt`

## Are these changes tested?
- `cargo fmt --all`
- `cargo test -p datafusion duplicate_grouping_sets_are_preserved`
- `cargo test -p datafusion-physical-plan grouping_sets_preserve_duplicate_groups`
- `cargo test -p datafusion-physical-plan evaluate_group_by_supports_duplicate_grouping_sets_with_eight_columns`
- PostgreSQL validation against the same query/result shape

## Are there any user-facing changes?

- Yes. Queries that contain duplicate `GROUPING SETS` entries now return the correct duplicated result rows, matching PostgreSQL behavior.

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
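The duplicate-ordinal encoding described in this change can be illustrated with plain integers. This is a rough sketch under stated assumptions, not the DataFusion implementation: the function names are hypothetical, the id is fixed at `u64` (the real column type varies with the number of bits needed), and the bit layout (first grouping expression in the most significant mask bit, a set bit meaning the column is not part of the grouping set) is assumed for the example.

```rust
use std::collections::HashMap;

/// Pack a duplicate ordinal above the `num_exprs` low bits that hold the null mask.
fn pack_grouping_id(null_mask: u64, ordinal: u64, num_exprs: u32) -> u64 {
    debug_assert!(num_exprs < 64);
    (ordinal << num_exprs) | null_mask
}

/// Recover the user-visible mask by dropping the ordinal bits.
fn semantic_mask(grouping_id: u64, num_exprs: u32) -> u64 {
    grouping_id & ((1u64 << num_exprs) - 1)
}

/// Assign one id per grouping set; repeated masks receive increasing ordinals,
/// so every occurrence ends up with a distinct execution key.
fn assign_grouping_ids(masks: &[u64], num_exprs: u32) -> Vec<u64> {
    let mut seen: HashMap<u64, u64> = HashMap::new();
    masks
        .iter()
        .map(|&mask| {
            let ordinal = seen.entry(mask).or_insert(0);
            let id = pack_grouping_id(mask, *ordinal, num_exprs);
            *ordinal += 1;
            id
        })
        .collect()
}
```

For the grouping sets `((deptno, job), (deptno, sal), (deptno, job))` over three expressions, the assumed masks are `0b001`, `0b010`, and `0b001`. The two `(deptno, job)` occurrences get different ids because their ordinals differ, while `semantic_mask` returns `0b001` for both, which is the value a `GROUPING()` call would be derived from.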
…#21321)

## Which issue does this PR close?

- Closes #21320

### Rationale for this change

When one side of a LEFT/RIGHT/FULL outer join is an `EmptyRelation`, the current `PropagateEmptyRelation` optimizer rule leaves the join untouched. This means the engine still builds a hash table for the empty side, probes every row from the non-empty side, finds zero matches, and pads NULLs, all of which is wasted work.

The TODO at lines 76-80 of `propagate_empty_relation.rs` explicitly called out this gap:

```
// TODO: For LeftOut/Full Join, if the right side is empty, the Join can be eliminated
// with a Projection with left side columns + right side columns replaced with null values.
// For RightOut/Full Join, if the left side is empty, the Join can be eliminated
// with a Projection with right side columns + left side columns replaced with null values.
```

### What changes are included in this PR?

Extends the `PropagateEmptyRelation` rule to handle 4 previously unoptimized cases by replacing the join with a `Projection` that null-pads the empty side's columns.

### Are these changes tested?

Yes. 4 new unit tests added.

### Are there any user-facing changes?

No API changes.

---------

Co-authored-by: Subham Singhal <subhamsinghal@Subhams-MacBook-Air.local>
Co-authored-by: Dmitrii Blaginin <dmitrii@blaginin.me>
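The equivalence behind this rewrite can be checked on toy data. The sketch below is an illustration only and does not use DataFusion types: rows are modelled as `Vec<Option<i64>>`, the join condition is assumed to be equality on column 0, and both function names are hypothetical. It covers the left outer join with an empty right side; the other cases are symmetric.

```rust
type Row = Vec<Option<i64>>;

/// Reference semantics: LEFT OUTER JOIN on equality of column 0.
/// NULL keys never match, as in SQL.
fn left_outer_join(left: &[Row], right: &[Row], right_width: usize) -> Vec<Row> {
    let mut out: Vec<Row> = Vec::new();
    for l in left {
        let mut matched = false;
        for r in right {
            if l[0].is_some() && l[0] == r[0] {
                out.push(l.iter().chain(r.iter()).cloned().collect());
                matched = true;
            }
        }
        if !matched {
            // Unmatched left rows are kept and padded with NULLs.
            let mut row = l.clone();
            row.extend(std::iter::repeat(None).take(right_width));
            out.push(row);
        }
    }
    out
}

/// The rewrite for an empty right side: keep the left columns and append
/// one NULL per right-side column, with no join work at all.
fn null_padded_projection(left: &[Row], right_width: usize) -> Vec<Row> {
    left.iter()
        .map(|l| {
            let mut row = l.clone();
            row.extend(std::iter::repeat(None).take(right_width));
            row
        })
        .collect()
}
```

When `right` is empty, every left row is unmatched, so `left_outer_join` and `null_padded_projection` produce the same rows. That is why the join can be dropped in favour of a projection.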
Bumps [cryptography](https://github.com/pyca/cryptography) from 46.0.6 to 46.0.7.

Changelog (sourced from [cryptography's changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)):

46.0.7 - 2026-04-07

- **SECURITY ISSUE**: Fixed an issue where non-contiguous buffers could be passed to APIs that accept Python buffers, which could lead to buffer overflow. **CVE-2026-39892**
- Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL 3.5.6.

Commits:

- [`622d672`](https://github.com/pyca/cryptography/commit/622d672e429a7cff836a23c5903683dbec1901f5) 46.0.7 release ([#14602](https://redirect.github.com/pyca/cryptography/issues/14602))
- See full diff in [compare view](https://github.com/pyca/cryptography/compare/46.0.6...46.0.7)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)