Skip to content

[subquery] COUNT-containing scalar subquery rewrite can return wrong results in fallback PostgreSQL optimizer path #378

@Excaliiiibur

Description

@Excaliiiibur

Greenplum version or build

open-gpdb OPENGPDB_STABLE (GP6/PG9.4-based line)

Expected behavior

Correlated scalar aggregate subqueries should preserve empty-input semantics after pull-up rewrite.

If target expression contains COUNT, unmatched rows must evaluate with aggregate empty-input defaults (e.g. COUNT -> 0).

Actual behavior

In fallback PostgreSQL optimizer rewrite path (cdbsubselect.c, convert_EXPR_to_join), COUNT-containing scalar-subquery expressions can return incorrect results on unmatched rows.

This is not ORCA-specific. It affects execution paths that fall back to PostgreSQL planner.

Step to reproduce the behavior

CREATE TABLE t_outer(a int, b int) DISTRIBUTED BY (a);
CREATE TABLE t_inner(b int) DISTRIBUTED BY (b);
INSERT INTO t_outer VALUES (0,1),(0,2);
INSERT INTO t_inner VALUES (1);

set optimizer=off;
SELECT a,b
FROM t_outer o
WHERE (o.a + 1) > (
  SELECT count(*) + 1
  FROM t_inner i
  WHERE i.b = o.b
)
ORDER BY b;

Additional impact scope

Besides optimizer=off, the same risk exists for other query shapes that trigger ORCA fallback to PostgreSQL planner (for example unsupported features/GUC combinations where planning switches to Postgres optimizer).

So the issue scope is: fallback PostgreSQL planner rewrite path, not only manual ORCA-off sessions.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions