Releases: getyourguide/dataframe-expectations

v0.6.0

18 Mar 14:19
0fb3d21

0.6.0 (2026-03-18)

Features

  • PySpark is now an optional dependency (27e864b)

    Users running PySpark in managed environments (Databricks, EMR, etc.) typically have PySpark
    pre-installed and cannot or do not want the library to reinstall it. PySpark is now optional
    and must be explicitly requested:

    pip install dataframe-expectations           # pandas only
    pip install dataframe-expectations[pyspark]  # includes pyspark
    

    pandas, pydantic, and tabulate remain hard dependencies. Importing dataframe_expectations
    no longer touches PySpark at all when it isn't installed — all PySpark imports are deferred
    behind @lru_cache helpers that return a proxy raising a clear ImportError only when a
    PySpark code path is actually executed.

  • PySpark tests isolated by marker (ef847ed)

    All PySpark test cases are decorated with @pytest.mark.pyspark and separated into their own
    parametrize blocks. --strict-markers is now enforced so unregistered markers cause an
    immediate failure. Tests can be run without PySpark present:

    pytest -m "not pyspark"   # no PySpark required
    pytest -m pyspark          # requires PySpark
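
    Because --strict-markers rejects unregistered markers, the pyspark marker must be declared in the pytest configuration. A typical registration (an assumed snippet, not necessarily this repository's exact config) looks like:

```toml
[tool.pytest.ini_options]
addopts = "--strict-markers"
markers = [
    "pyspark: tests that require a PySpark installation",
]
```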
    
  • CI updated to cover three install scenarios

    Job                           How PySpark is present     Tests run
    tests-without-pyspark         Not installed              -m "not pyspark"
    tests-with-pyspark-extra      pip install .[pyspark]     All
    tests-with-external-pyspark   Pre-installed externally   All

    The tests-with-external-pyspark job specifically validates that the library works correctly
    when PySpark is already present in the environment and was not installed by this package.

v0.5.2

16 Mar 15:05
33a969e

What's Changed

Improvements & Fixes

  • PySpark is now treated as an optional dependency at runtime. Users who only need pandas DataFrame validation will no longer see import errors if PySpark is not installed. (15d38ca)

Other Changes

  • Added workflow_dispatch trigger to the release-please workflow for manual triggering via GitHub Actions UI.
  • Dependency updates: ruff bumped to 0.15.6, tabulate bumped to 0.10.0.

Full Changelog: v0.5.1...v0.5.2

v0.5.1

28 Jan 11:38
639061b

0.5.1 (2026-01-28)

Features

  • adding new numeric expectations (2b07c7a)
  • adding new numeric expectations (90f8cb7)

Documentation

  • improved the API docs website (df9f7b1)
  • improved the API docs website (966ea5a)
  • minor corrections to readme (15fa72d)
  • minor corrections to readme (3358d23)
  • partitioned readme (8cf59b1)
  • partitioned readme (30500e7)

v0.5.0

22 Nov 14:11
d62ce0b

Features

Tag-based Filtering

Add support for selective expectation execution using custom tags.

Key Features:

  • New TagMatchMode enum with ANY (OR logic) and ALL (AND logic) options
  • Tag expectations with "key:value" format (e.g., "priority:high", "env:prod")
  • Filter expectations at build time

Example:

# Tag expectations
suite = (
    DataFrameExpectationsSuite()
    .expect_value_greater_than(column_name="age", value=18, tags=["priority:high", "env:prod"])
    .expect_value_not_null(column_name="name", tags=["priority:high"])
    .expect_min_rows(min_rows=1, tags=["priority:low", "env:test"])
)

# Run only high-priority checks (OR logic)
runner = suite.build(tags=["priority:high"], tag_match_mode=TagMatchMode.ANY)

# Run production-critical checks (AND logic)
runner = suite.build(tags=["priority:high", "env:prod"], tag_match_mode=TagMatchMode.ALL)
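
The ANY/ALL semantics can be illustrated with a small standalone sketch (assumed behavior based on the notes above, not the library's internals):

```python
from enum import Enum

class TagMatchMode(Enum):
    ANY = "any"   # keep an expectation if it carries at least one requested tag
    ALL = "all"   # keep an expectation only if it carries every requested tag

def matches(expectation_tags, requested, mode):
    """Decide whether an expectation survives the tag filter."""
    tags = set(expectation_tags)
    requested = set(requested)
    if mode is TagMatchMode.ANY:
        return bool(tags & requested)
    return requested <= tags

print(matches(["priority:high", "env:prod"], ["priority:high"], TagMatchMode.ANY))   # True
print(matches(["priority:high"], ["priority:high", "env:prod"], TagMatchMode.ALL))   # False
```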

Programmatic Result Inspection

Enhanced SuiteExecutionResult for detailed validation analysis.

Key Features:

  • Use raise_on_failure=False to inspect results without raising exceptions
  • Access comprehensive metrics: total_expectations, total_passed, total_failed, pass_rate, total_duration_seconds
  • Inspect individual expectation results with status, violation counts, descriptions, and timing
  • View applied tag filters in execution results

Example:

# Get results without raising exceptions
result = runner.run(df, raise_on_failure=False)

# Inspect the results programmatically
print(f"Total expectations: {result.total_expectations}")
print(f"Passed: {result.total_passed}, Failed: {result.total_failed}")
print(f"Pass rate: {result.pass_rate:.2%}")
print(f"Applied filters: {result.applied_filters}")
print(f"Tag match mode: {result.tag_match_mode}")

# Access individual expectation results
for exp_result in result.results:
    if exp_result.status == "failed":
        print(f"Failed: {exp_result.description}")
        print(f"Violation count: {exp_result.violation_count}")

Documentation

  • Added tag-based filtering examples to README.md and getting_started.rst
  • Updated adding_expectations.rst with proper tag handling patterns for custom expectations
  • Documented programmatic result inspection with comprehensive examples
  • Reorganized documentation structure: user guide in getting_started.rst, developer notes in adding_expectations.rst

Full Changelog: v0.4.0...v0.5.0

v0.4.0

10 Nov 16:19
a110faf

0.4.0 (2025-11-10)

⚠ BREAKING CHANGES

  • ‼️ BREAKING CHANGE: Major codebase restructuring with new module organization. Most changes are confined to internal modules; the main user-facing change is the import path (see the migration guide below).

What changed:

  • All internal modules have been reorganized into a core/ package
  • Expectation registry simplified from three-dictionary to two-dictionary structure with O(1) lookups
  • Main imports updated from expectations_suite to suite

Migration guide:
Update your imports to use the new module structure:

# Before
from dataframe_expectations.expectations_suite import DataFrameExpectationsSuite

# After
from dataframe_expectations.suite import DataFrameExpectationsSuite

Features

  • restructure codebase with core/ module and explicit imports (42a233a)
  • restructure codebase, and registry refactoring (111bca1)
  • simplified registry (c182858)

Bug Fixes

  • consolidate imports (9a76467)
  • deleted duplicate dataclass and enums from registry (82bec0c)
  • deleted duplicate DataFrameExpectation code from expectations package (d47eb8b)
  • import enums from types (fa84764)
  • manually trigger CI for release-please PRs (49419e6)
  • manually trigger CI for release-please PRs (9585cf5)
  • return correct version when package is built (82ff343)

Full Changelog: v0.3.0...v0.4.0

v0.3.0

09 Nov 12:43
5567760

🎯 DataFrame Expectations v0.3.0

⚠️ Breaking Changes

This release introduces a builder pattern for the DataFrameExpectationsSuite that changes how you create and run expectation suites.

Migration Guide:

# Before (v0.2.0)
suite = DataFrameExpectationsSuite()
suite.expect_min_rows(min_rows=3)
suite.run(df)

# After (v0.3.0)
suite = DataFrameExpectationsSuite()
suite.expect_min_rows(min_rows=3)
runner = suite.build()  # New: Build a runner
runner.run(df)          # Run on the runner

✨ New Features

🏗️ Builder Pattern & Immutable Runners

  • Introduces DataFrameExpectationsSuiteRunner - an immutable runner created via .build()
  • Allows reusing the same validation logic across multiple DataFrames
  • Enables building multiple independent runners from the same suite at different stages
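
The pattern the bullets describe can be sketched in isolation (illustrative only; class and method names here are placeholders, not the library's API):

```python
from dataclasses import dataclass

class Suite:
    """Mutable builder: collects checks, then snapshots them into a runner."""
    def __init__(self):
        self._checks = []

    def expect(self, check):
        self._checks.append(check)
        return self

    def build(self):
        # Snapshot into an immutable runner; later suite edits don't affect it.
        return Runner(tuple(self._checks))

@dataclass(frozen=True)
class Runner:
    checks: tuple

    def run(self, data):
        return all(check(data) for check in self.checks)

suite = Suite().expect(lambda rows: len(rows) >= 1)
runner_a = suite.build()          # first stage: one check
suite.expect(lambda rows: all(r > 0 for r in rows))
runner_b = suite.build()          # second stage: two checks

print(runner_a.run([-1]))  # True: only the min-rows check applies
print(runner_b.run([-1]))  # False: the positivity check fails
```

Because `build()` copies the checks, each runner is an independent, reusable snapshot of the suite at the moment it was built.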

🎨 Decorator Pattern for Automatic Validation

Validate DataFrames returned by functions automatically using the @runner.validate decorator:

@runner.validate
def load_data():
    return pd.DataFrame({"col": [1, 2, 3]})

# Supports optional DataFrame returns
@runner.validate(allow_none=True)
def maybe_load_data():
    if condition:
        return pd.DataFrame(...)
    return None
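
For intuition, a decorator supporting this dual calling convention (bare and with keyword arguments) can be sketched generically; this is an illustration of the mechanism, not the library's implementation:

```python
import functools

def make_validator(run_checks):
    """Build a decorator that validates a function's return value."""
    def validate(func=None, *, allow_none=False):
        def wrap(f):
            @functools.wraps(f)
            def wrapper(*args, **kwargs):
                result = f(*args, **kwargs)
                if result is None and allow_none:
                    return None
                run_checks(result)  # raises if any expectation fails
                return result
            return wrapper
        # Support both @validate and @validate(allow_none=True)
        return wrap(func) if func is not None else wrap
    return validate

def run_checks(data):
    if not data:
        raise ValueError("validation failed: empty result")

validate = make_validator(run_checks)

@validate
def load_data():
    return [1, 2, 3]

@validate(allow_none=True)
def maybe_load_data():
    return None

print(load_data())        # [1, 2, 3]
print(maybe_load_data())  # None
```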

🔍 Expectation Inspection

  • Added expectation_count property to check the number of expectations
  • Added list_expectations() method to view all expectations in a runner

📚 Documentation

  • Added Spark session initialization to PySpark examples in README and documentation
  • Improved example code to be immediately runnable

🔧 Maintenance

  • Updated release configuration for simpler tag generation
  • Dependency updates: pytest 9.0.0, ruff 0.14.4, pre-commit 4.4.0

📦 What's Changed

  • fix: update release please config to generate simple tags by @ryanseq-gyg in #13
  • feat!: implement builder pattern for expectation suite runner by @ryanseq-gyg in #18
  • build(deps): bump pre-commit from 4.3.0 to 4.4.0 by @dependabot in #17
  • build(deps): bump ruff from 0.14.3 to 0.14.4 by @dependabot in #16
  • build(deps): bump pytest from 8.4.2 to 9.0.0 in the 01_major-updates group by @dependabot in #15

Full Changelog: v0.2.0...v0.3.0

v0.2.0

08 Nov 20:16
0170ac5

This release introduces a major refactoring of the expectation registration system, replacing 800+ lines of boilerplate with dynamic method generation from a central registry. The refactoring maintains full IDE type-ahead support through auto-generated stub files while significantly improving maintainability.

Features

  • Dynamic Expectation Registration: Implement dynamic method generation with centralized registry system
    • Replaces manual method definitions in DataFrameExpectationsSuite
    • Maintains IDE type hints through auto-generated .pyi stub files
    • Reduces boilerplate and improves maintainability
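
As a rough illustration of registry-driven method generation (the registry contents and helper names here are invented for the sketch; the real registry differs):

```python
# Each registry entry becomes a suite method at import time,
# replacing hand-written boilerplate definitions.
REGISTRY = {
    "expect_min_rows": lambda df, min_rows: len(df) >= min_rows,
    "expect_not_empty": lambda df: len(df) > 0,
}

class Suite:
    def __init__(self):
        self.expectations = []

def _make_method(name, check):
    def method(self, **params):
        self.expectations.append((name, check, params))
        return self
    method.__name__ = name
    return method

for name, check in REGISTRY.items():
    setattr(Suite, name, _make_method(name, check))

suite = Suite().expect_min_rows(min_rows=3).expect_not_empty()
print([n for n, _, _ in suite.expectations])
# ['expect_min_rows', 'expect_not_empty']
```

Since the generated methods are invisible to static analysis, a `.pyi` stub file with the concrete signatures is what preserves IDE type-ahead.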

Bug Fixes

  • Handle pandas DataFrame.map() compatibility for older versions
  • Convert expectation category to str while generating stubs

Documentation

  • Update documentation for new registration system
  • Remove API reference button on expectation cards
  • Update README with additional badges

Chores

  • Add publishing and release workflows
  • Pin action commit hashes and update PR template
  • Update sanity checks script for dynamic expectation calls
  • Update release-please to approved version

Full Changelog: v0.1.1...dataframe-expectations-v0.2.0

v0.1.1

31 Oct 15:25
3f89e95

Full Changelog: https://github.com/getyourguide/dataframe-expectations/commits/v0.1.1