Better handling for which producers can correctly encode which queries #9

@richtia

I think there are some bits here that are being handled manually that should be handled in a more automated fashion.

With the hand-written tuples for which producer can correctly encode which query, we have the means to track regressions (DuckDB suddenly fails to run logb, e.g.) but not improvements (Isthmus can now encode logb).

This is a highly non-trivial problem, because the outcomes of the producer tests are essentially the test fixtures for the consumers.

We've been using pytest-snapshot to test that Ibis produces "good" or "golden" SQL for various expressions (https://pypi.org/project/pytest-snapshot/) and I wonder if that would be of help here.

Testing producers would mean generating Substrait blobs, then comparing them to known good / valid snapshots of those blobs.

Testing consumers would consist of loading the snapshot blobs and attempting to execute them.
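To make the producer/consumer split concrete, here is a rough stdlib-only sketch of the idea. It is not pytest-snapshot's API; `check_producer_snapshot`, `run_consumer_on_snapshots`, and the `.bin` file layout are all made-up names for illustration, and it assumes a producer serializes the same plan to the same bytes on every run (which may not hold for every producer).

```python
from pathlib import Path
from typing import Callable


def check_producer_snapshot(
    snapshot_dir: Path, name: str, produced: bytes, update: bool = False
) -> bool:
    """Compare a producer's output with the stored snapshot called `name`.

    Returns True when the output matches the snapshot. A snapshot is only
    written when `update` is set; a missing snapshot counts as a mismatch.
    """
    path = snapshot_dir / f"{name}.bin"  # hypothetical on-disk layout
    if update:
        path.write_bytes(produced)
        return True
    if not path.exists():
        return False
    return path.read_bytes() == produced


def run_consumer_on_snapshots(
    snapshot_dir: Path, consume: Callable[[bytes], object]
) -> dict[str, bool]:
    """Feed every stored snapshot to `consume` and record whether it ran."""
    results: dict[str, bool] = {}
    for path in sorted(snapshot_dir.glob("*.bin")):
        try:
            consume(path.read_bytes())
        except Exception:
            # Any exception from the consumer is recorded as a failure.
            results[path.stem] = False
        else:
            results[path.stem] = True
    return results
```

The point of the split is that consumers never call producers directly: they only ever see blobs that were already accepted as snapshots.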

I know I'm not covering everything that needs covering in the test matrix here, but I think it would be a very good idea to start sketching out more sustainable patterns.

Having said all of ^^^^that^^^^, I don't think that should block this PR.

I do think that we should be attempting to run all producer tests on all SQL snippets, and not manually filtering them down pre-test. If Isthmus is going to fail one of those tests because it uses a different SQL dialect, so be it -- we can get creative in the xfail markers and distinguish between "tests that fail that should pass in the future" and "tests that fail that will always fail".
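As a sketch of the bookkeeping behind that distinction (plain Python, not pytest itself), something like the following would surface improvements as well as regressions. The `Expect` enum, `reconcile`, and the (producer, query) keys are hypothetical names. In pytest, the "should pass in the future" case maps roughly onto `pytest.mark.xfail(strict=True)`, where an unexpected pass is reported as a failure.

```python
from enum import Enum


class Expect(Enum):
    PASS = "pass"
    FAILS_FOR_NOW = "fails for now"  # should pass once the producer catches up
    FAILS_ALWAYS = "fails always"    # e.g. a dialect the producer will never accept


Key = tuple[str, str]  # (producer, query), both illustrative identifiers


def reconcile(
    expected: dict[Key, Expect], observed: dict[Key, bool]
) -> dict[str, list[Key]]:
    """Compare observed pass/fail results with the recorded expectations.

    Combinations missing from `expected` are assumed to be Expect.PASS.
    """
    report: dict[str, list[Key]] = {
        "regressions": [],   # expected to pass, but failed
        "improvements": [],  # expected to fail for now, but passed
        "unexpected": [],    # expected to fail forever, but passed
    }
    for key, passed in sorted(observed.items()):
        expectation = expected.get(key, Expect.PASS)
        if expectation is Expect.PASS and not passed:
            report["regressions"].append(key)
        elif expectation is Expect.FAILS_FOR_NOW and passed:
            report["improvements"].append(key)
        elif expectation is Expect.FAILS_ALWAYS and passed:
            report["unexpected"].append(key)
    return report


expected = {("isthmus", "logb"): Expect.FAILS_FOR_NOW}
observed = {("duckdb", "logb"): False, ("isthmus", "logb"): True}
report = reconcile(expected, observed)
```

With these made-up inputs, the DuckDB failure lands under `regressions` and the Isthmus pass lands under `improvements`, which is the case the hand-written tuples can't catch today.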

Alternatively, we might make use of sqlglot to translate SQL strings between dialects -- it's very good at that.
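For reference, a minimal sketch of what that could look like with `sqlglot.transpile`. The wrapper `translate_sql` is a hypothetical helper, and the dialect names in the usage note are only examples; which sqlglot dialect (if any) is the right target for a given producer is an open question here.

```python
def translate_sql(sql: str, source_dialect: str, target_dialect: str) -> str:
    """Translate a single SQL statement from one dialect to another."""
    # Third-party dependency, imported lazily so that this module can be
    # loaded even where sqlglot is not installed.
    import sqlglot

    statements = sqlglot.transpile(sql, read=source_dialect, write=target_dialect)
    if len(statements) != 1:
        raise ValueError(f"expected one statement, got {len(statements)}")
    return statements[0]
```

A call such as `translate_sql(snippet, "duckdb", "postgres")` would then give each producer a version of the snippet in a dialect it is more likely to accept, before any xfail bookkeeping is needed.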

Originally posted by @gforsyth in #6 (review)
