Skip to content

fix: enforce nullable types for null literals in test cases#989

Merged
benbellick merged 10 commits intomainfrom
benbellick/nullable-null-literals
Mar 20, 2026
Merged

fix: enforce nullable types for null literals in test cases#989
benbellick merged 10 commits intomainfrom
benbellick/nullable-null-literals

Conversation

@benbellick
Copy link
Copy Markdown
Member

@benbellick benbellick commented Mar 4, 2026

Test cases now respect the nullability rules defined by each function's nullability handling in the extension YAML. MIRROR functions require the output to be marked nullable (?) iff any argument is nullable, and DECLARED_OUTPUT functions require the output nullability to match the declared return type.

The Python test infrastructure now enforces these rules automatically via test_substrait_nullability_consistency. This drove the ~105 test case line changes to add missing ? markers on return types. The visitor was also fixed to stop silently dropping nullable markers for several types, and to distinguish outer nullability (list?<i32>) from inner element nullability (list<i32?>).

DISCRETE nullability handling only validates output type for now — argument nullability matching is not yet implemented. This is fine since no extensions currently use DISCRETE.


Note: These changes were made with Claude's assistance. All code has been reviewed by me.

@benbellick benbellick force-pushed the benbellick/nullable-null-literals branch 2 times, most recently from a7c8cad to 3a1ba99 Compare March 4, 2026 19:27
@staticmethod
def is_same_type(func_arg_type, arg_type):
arg_type_base = arg_type.split("<")[0]
arg_type_base = arg_type.split("<")[0].rstrip("?")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with this part of repo... but this looks like comparing only the base type. is_same_type is the proper name for this function? also, shouldn't it consider DISCRETE nullability https://substrait.io/expressions/scalar_functions/#nullability-handling?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so familiar with this part of it either. At one point in history I was looking into rewriting quite a bit of it to use a more typed representation, but ended up focusing on other things.

Yes, I think it is only comparing the base type. Though that was the behavior before this PR 🤷. I agree it should consider DISCRETE nullability. Is it possible to leave those sort of validations for another PR?

I think there is a lot of room for improvement in these test cases, but wanted to keep changes manageable.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mean to hold back this PR, just want to understand whether I am missing something. 😄

But overall, it appears that this needs to go under some careful review and clean up. Yes, we should do it incrementally -- otherwise, we are going nowhere. 😆

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to this being something we can improve, but that which we should improve in the future.

Comment thread tests/cases/boolean/and.test Outdated
and(null::bool, null::bool) = null::bool
and(true::bool, null::bool?) = null::bool?
and(null::bool?, true::bool) = null::bool?
and(false::bool, null::bool?) = false::bool
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually maybe this output should be nullable, too?

@vbarua
Copy link
Copy Markdown
Member

vbarua commented Mar 6, 2026

Added a potential 3rd option for this in the original issue #988 (comment)

It's a tiny bit obnoxious to have to add ? to every usage of null. At the spec* level, we already say that null literals have to be of nullable type.

* specifically the proto

…and discrete functions

Ensure test case output types respect nullability handling rules:
- MIRROR: output is nullable when any input is nullable
- DECLARED_OUTPUT: output nullability matches declared return type
- DISCRETE: same as DECLARED_OUTPUT for output nullability
Add automated validation that test case return type nullability is
consistent with the function's declared nullability handling (MIRROR,
DECLARED_OUTPUT, DISCRETE). Fix visitor to preserve nullable markers
on bool, string, timestamp, and other types that were previously
hardcoded. Revert 3 incorrect list test case changes from prior commit
where inner element nullability was confused with outer type nullability.
@benbellick benbellick force-pushed the benbellick/nullable-null-literals branch from f8f1b95 to 6534543 Compare March 6, 2026 19:37
Comment thread tests/test_extensions.py
print(json.dumps(asdict(actual_baseline), indent=2))


def test_substrait_nullability_consistency():
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really the core driver of this PR. I realize there are a lot of cases changed, but all of the changes were driven by making this test pass. This test is just checking that the test cases have nullability which is consistent with the nullability handling of the functions as they are defined in the YAML files.

@github-actions
Copy link
Copy Markdown

ACTION NEEDED

Substrait follows the Conventional Commits
specification
for
release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

Comment thread tests/coverage/nodes.py
from typing import List


def type_str_is_outer_nullable(type_str):
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interest of keeping this diff small, I used this hack. I would love to have a better solution, but I think that is worth its own dedicated PR.

@benbellick benbellick marked this pull request as ready for review March 13, 2026 21:31
@vbarua
Copy link
Copy Markdown
Member

vbarua commented Mar 14, 2026

It's a tiny bit obnoxious to have to add ? to every usage of null. At the spec* level, we already say that null literals have to be of nullable type.

This was a weakly held opinion, and I have been easily convinced the explicit is better than implicit in this case.

Copy link
Copy Markdown
Member

@vbarua vbarua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review, got nerdsniped into doing cursed ANTLR things.

Comment thread tests/coverage/visitor.py
if not type_str_is_outer_nullable(datatype):
raise ParseError(
f"Null literal must have a nullable type (add '?'), "
f"got: null::{datatype}"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this will make it easy for other folks to fix this issue when adding new tests.

Comment thread tests/coverage/visitor.py
value=column,
type=column_type,
nullable=type_str_is_outer_nullable(column_type),
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You were asking if you could do something better*. You can technically just reach down through the nodes and look for the QMark() directly.

        column = self.visitColumnValues(ctx.columnValues())
        column_type = ctx.dataType()
        is_nullable = column_type.children[0].children[0].QMark() == '?'

* not sure if this is better tbh, but it was fun to figure out

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talking with @/benbellick about this, in some ways this is more fragile than his approach of producing the type string and parsing that. I'm happy to go with his approach, and while there's always room for improvement, I don't think it needs to happen in this PR.

Comment thread tests/coverage/extensions.py Outdated
Comment thread tests/coverage/extensions.py Outdated
Comment thread tests/coverage/nodes.py
bracket_pos = type_str.find("<")
if bracket_pos == -1:
return type_str.endswith("?")
return "?" in type_str[:bracket_pos]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is reasonable enough, and the type string shape can be considered stable.

Copy link
Copy Markdown
Member

@vbarua vbarua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good overall. Left some minor comments, but nothing I consider blocking.

@vbarua vbarua added the PMC Ready PRs ready for review by PMCs label Mar 18, 2026
@benbellick benbellick requested a review from yongchul March 18, 2026 22:05
Address PR review feedback:
- Rename return_nullable to is_return_nullable for better readability
- Add is_return_nullable to FunctionOverload.__str__ method
@benbellick benbellick merged commit ba9b0ff into main Mar 20, 2026
13 checks passed
@benbellick benbellick deleted the benbellick/nullable-null-literals branch March 20, 2026 15:29
bvolpato-dd pushed a commit to bvolpato/substrait that referenced this pull request Apr 14, 2026
…sults

The coverage checker's `is_same_type` compared only the base type name
and stripped all parameters, so decimal precision/scale, varchar length,
and list/map element types were never actually validated against the
extension YAML return formulas. Two wrong test cases had been sitting
in-tree because of this:

- `sum((2.5, 0, 5.0, -2.5, -7.5)::dec<2, 1>) = -2.5::dec<38, 2>` in
  `tests/cases/arithmetic_decimal/sum_decimal.test`. `sum` returns
  `DECIMAL?<38,S>`, so with input scale 1 the output must be
  `dec?<38, 1>`, not `dec?<38, 2>`.
- `nullif(null::dec?<38, 0>, null::dec?<38, 0>) = null::bool?` in
  `tests/cases/comparison/nullif.test`. `nullif` is `any1, any1 -> any1?`,
  so with both args `dec<38, 0>` the result must be `dec?<38, 0>`, not
  `bool?`. Looks like a copy-paste from the bool cases above.

Both wrong results date back to the original BFT port (substrait-io#738). They are
not undoing prior work: substrait-io#913 only touched the `basic` i8/i16 block at
the top of `nullif.test`, and substrait-io#989 rewrote both lines only to add `?`
nullability markers, preserving the underlying wrong `38,2` and wrong
`bool` base type. This change keeps every `?` marker from substrait-io#989 and
only fixes the parameter substrait-io#989 wasn't looking at.

To prevent regressions, this adds `tests/coverage/type_checker.py` — a
symbolic unifier plus evaluator for the YAML return-formula mini-language
(assignments, `min`/`max`, `cond ? a : b` ternary). The new module:

- parses type strings like `decimal<P1,S1>`, `list<any2>`, `STRUCT<...>`,
  `func<any1 -> boolean?>` into tagged tuples;
- unifies an impl-declared type against a concrete test type, binding
  variables like `P1`, `S1`, and polymorphic `any1`/`any2`;
- evaluates multi-line return formulas (add/sub/mul/div/mod, min/max,
  sum, `any1?`, etc.) with the bindings and compares structurally to
  the test's declared result type.

`FunctionOverload`/`FunctionVariant` now carry the raw YAML arg types
and return formula alongside the existing short-form fingerprint.
`FunctionRegistry.get_function` runs the strict check after the legacy
loose match, so the wider test suite's base-type fallback still applies
when the caller hasn't supplied full parameterized types. When the
formula cannot be evaluated (unbound variable, unusual syntax), the
strict check falls back to success, preserving compatibility with the
current extensions and leaving room to tighten further.

`test_type_checker.py` covers parsing, unification, formula evaluation
for add/divide, the two specific bugs above, and the tolerant behavior
for tests that omit optional decimal parameters (e.g. `power(dec, dec<38,0>)`).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

PMC Ready PRs ready for review by PMCs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants