fix(py): custom reader tests for new execution pipeline#131
Merged
cpsievert merged 2 commits intoposit-dev:mainfrom Feb 18, 2026
Merged
fix(py): custom reader tests for new execution pipeline#131cpsievert merged 2 commits intoposit-dev:mainfrom
cpsievert merged 2 commits intoposit-dev:mainfrom
Conversation
The "New Scale syntax and implementation" PR (posit-dev#82, 7a5ed62) introduced a column renaming pipeline where the execution engine generates SQL like `SELECT "x" AS "__ggsql_aes_x__"` and passes it to readers. This requires custom readers to actually execute the SQL they receive, since the renamed columns are expected downstream during pruning. The custom reader tests were written with static readers that returned hardcoded DataFrames, ignoring the SQL parameter entirely. This worked with the old pipeline but fails with the new one because the returned DataFrames lack the `__ggsql_aes_*` prefixed columns. These failures were never caught on main because the Python CI workflow only triggers on changes to `ggsql-python/**`, and posit-dev#82 only changed `src/` files. Changes: - Update 4 custom reader tests to use in-memory DuckDB connections that properly execute the SQL they receive - Add duckdb and pyarrow as test dependencies - Fix CI workflow to install the locally-built wheel instead of downloading a stale version from PyPI (pip install with glob instead of --find-links) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The glob pattern in `pip install target/wheels/ggsql-*.whl'[test]'` was not expanding because the shell treated `[test]` as a character class, preventing any file from matching. Use `ls` to expand the glob into a variable first. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborator
Author
|
Merging this without review since the only changes are Python tests and It fixes a problem already on main (several other PRs want this change to get free check marks). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR fixes 4 failing Python tests (for custom readers) that broke after #82 (7a5ed62).
These failures weren't caught at the time since the Python tests only run when there are changes to the Python package.
I'd be open to running the Python tests on every code change, but I'm not sure that's noise that the rest of the team wants to adopt?
Context
PR #82 ("New Scale syntax and implementation") introduced a column renaming pipeline where the execution engine generates SQL like:
This SQL is passed to readers via
execute_sql(), and the returned DataFrame is expected to contain the prefixed__ggsql_aes_*columns for downstream processing (pruning, scale resolution, etc.).The custom reader tests were written with static readers that returned hardcoded DataFrames, ignoring the
sqlparameter:This worked with the old pipeline (which didn't rename columns via SQL), but the new pipeline expects the reader to execute the SQL it receives.
These failures were never caught on
mainbecause the Python CI workflow (python.yml) only triggers on changes toggsql-python/**, and #82 only modifiedsrc/andtree-sitter-ggsql/files.Changes
Test fixes: Updated 4 custom reader tests to use in-memory
duckdb.connect()instances that actually execute the SQL they receive. This is more representative of real custom readers (like the existingIbisReadertest) and properly exercises the bridge path.Test dependencies: Added
duckdbandpyarrowto both[project.optional-dependencies] testand[dependency-groups] dev. Thepyarrowdependency is needed forduckdb's.pl()method to convert results to Polars DataFrames.CI workflow fix: Changed
pip install --find-links target/wheels/ ggsql[test]topip install target/wheels/ggsql-*.whl'[test]'. This ensures that the locally built wheel will be installed and used, even if there is some more recent version available somewhere else.🤖 Generated with Claude Code