
docs: add example BigQuery E2E#1072

Open
geoHeil wants to merge 1 commit intomainfrom
feat/example-bigquery



@geoHeil geoHeil commented Mar 18, 2026

Summary

  • add a docs example showing how to use BigQuery with Alembic
  • add a workaround example for smooth DDL generation

Changelog

  • add a docs example showing how to use BigQuery with Alembic

Test Plan

  • tested with my own BQ account and the included tests


geoHeil commented Mar 18, 2026


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • graphite:merge - adds this PR to the back of the merge queue
  • graphite:hotfix - for urgent changes, fast-track this PR to the front of the merge queue

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

Comment thread docs/examples/bigquery.md Outdated
@geoHeil geoHeil force-pushed the feat/example-bigquery branch 3 times, most recently from 6de22cd to db132eb Compare March 18, 2026 10:29
@geoHeil geoHeil changed the title feat: add example BigQuery E2E docs: add example BigQuery E2E Mar 18, 2026
@geoHeil geoHeil force-pushed the feat/example-bigquery branch 2 times, most recently from 1854508 to 7066e38 Compare March 18, 2026 12:21

codecov Bot commented Mar 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.


@geoHeil geoHeil force-pushed the feat/example-bigquery branch 2 times, most recently from 9895258 to 6ddab6c Compare March 18, 2026 13:18
Comment on lines +46 to +47
BIGQUERY_TEST_PROJECT_ID: ${{ secrets.BIGQUERY_TEST_PROJECT_ID }}
BIGQUERY_TEST_CREDENTIALS_JSON: ${{ secrets.BIGQUERY_TEST_CREDENTIALS_JSON }}

We need some BQ creds for the CI

Comment thread examples/example-bigquery/metaxy.toml Outdated
@geoHeil geoHeil force-pushed the feat/example-bigquery branch 5 times, most recently from ff4df4a to f6259ff Compare March 18, 2026 14:17
@geoHeil geoHeil force-pushed the feat/example-bigquery branch 6 times, most recently from b4b2bf4 to 852176a Compare March 18, 2026 16:24

geoHeil commented Mar 18, 2026

Both table creation modes work end-to-end against my own BQ project.

| Mode | How | Result |
| --- | --- | --- |
| auto_create_tables | Ibis creates tables on first write | Works |
| Alembic migrations | alembic revision --autogenerate + upgrade head | Works |

What this PR adds

Example (examples/example-bigquery/):

  • SQLModel features (BaseSQLModelFeature with table=True)
  • Pipeline with resolve_update + write
  • Dual-track Alembic setup (system tables + feature tables)
  • env.py workarounds for BigQuery's OLAP limitations (see below)

Core (src/metaxy/ext/metadata_stores/bigquery.py):

  • sqlalchemy_url property returning bigquery://project/dataset for Alembic
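
As a rough illustration of the URL shape described above, the sketch below builds a `bigquery://project/dataset` string from two identifiers. The function name `build_bigquery_sqlalchemy_url` is hypothetical and not taken from the PR; only the URL format comes from the description.

```python
def build_bigquery_sqlalchemy_url(project_id: str, dataset_id: str) -> str:
    """Return a SQLAlchemy-style URL of the form bigquery://project/dataset.

    Hypothetical helper: the real property lives on the BigQuery store and
    may be implemented differently.
    """
    if not project_id or not dataset_id:
        # Both parts are needed to form an unambiguous URL.
        raise ValueError("both project_id and dataset_id are required")
    return f"bigquery://{project_id}/{dataset_id}"
```

A URL in this form is what an Alembic `env.py` would typically hand to SQLAlchemy's `create_engine`.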

CI (tests/metadata_stores/):

  • BigQuery integration test fixtures (BIGQUERY_TEST_PROJECT_ID, BIGQUERY_TEST_CREDENTIALS_PATH)
  • Two live integration tests (write+read, idempotent resolve_update)
  • Secrets wired through all CI workflows

BigQuery Alembic workarounds in env.py

Four issues with sqlalchemy-bigquery + Alembic required workarounds in the example's env.py files:

| Issue | Workaround |
| --- | --- |
| CREATE INDEX not supported | include_object callback skips index/constraint generation |
| sa.JSON() → JSON column but Ibis writes STRUCT | _patch_bigquery_types() rewrites JSON→STRUCT using feature field names |
| DateTime(timezone=True) → DATETIME but Ibis writes TIMESTAMP | Same helper rewrites to sa.TIMESTAMP |
| Reflection crash on system tables | include_name scopes reflection to feature tables only |

These workarounds live entirely in the example, not in Metaxy core. Upstream fix for sa.JSON() compilation submitted: googleapis/google-cloud-python#16124
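
The two filtering workarounds can be sketched with Alembic's `include_object` and `include_name` hooks. This is a minimal sketch rather than the code from the example's env.py: the table names in `FEATURE_TABLES` are hypothetical, and the exact set of skipped object types is an assumption.

```python
# Hypothetical feature table names; a real env.py would derive these
# from the project's feature definitions.
FEATURE_TABLES = {"video_features", "audio_features"}

# Object types for which no DDL should be generated (assumed set).
_SKIPPED_TYPES = {"index", "unique_constraint", "foreign_key_constraint"}


def include_object(obj, name, type_, reflected, compare_to):
    """Alembic include_object hook: skip indexes and constraints."""
    return type_ not in _SKIPPED_TYPES


def include_name(name, type_, parent_names):
    """Alembic include_name hook: reflect only known feature tables."""
    if type_ == "table":
        return name in FEATURE_TABLES
    # Schemas, columns and other names are left untouched.
    return True
```

Both callables are passed to `context.configure(...)` through its `include_object` and `include_name` keyword arguments.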

@geoHeil geoHeil force-pushed the feat/example-bigquery branch 3 times, most recently from 0a0d0f9 to 1f6c23e Compare March 18, 2026 16:52

geoHeil commented Mar 20, 2026

cleaned up as discussed - secrets are needed now.

@geoHeil geoHeil force-pushed the feat/example-bigquery branch 4 times, most recently from 049099f to b65a7fc Compare March 30, 2026 18:32
@geoHeil geoHeil requested a review from Copilot March 30, 2026 18:40

Copilot AI left a comment


Pull request overview

Adds BigQuery “native” integration testing support and tightens BigQuery store configuration/initialization behavior, along with CI wiring to run BigQuery tests when credentials are available.

Changes:

  • Introduces BigQuery integration test fixtures and a shared test-suite-based BigQuery test class.
  • Makes dataset_id required (config + runtime validation) and adds a sqlalchemy_url convenience property.
  • Updates CI workflows and pytest markers to support gated BigQuery live tests.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tests/ext/bigquery/test_native.py | Adds shared integration test suite coverage for BigQuery; refactors/extends existing unit tests. |
| tests/ext/bigquery/conftest.py | Adds environment-gated BigQuery fixtures to create/delete ephemeral datasets. |
| src/metaxy/ext/metadata_stores/bigquery.py | Requires dataset_id, tightens init validation, adds sqlalchemy_url, adjusts docs and display output. |
| pyproject.toml | Adds bigquery pytest marker description. |
| devenv.lock | Updates Nix inputs/lock entries (git-hooks, flake-compat, etc.). |
| .github/workflows/_qa.yml | Adds optional BQ secrets and writes SA JSON to a temp credentials file for tests. |
| .github/workflows/main.yml | Passes BigQuery secrets through to the reusable QA workflow. |
| .github/workflows/release.yml | Passes BigQuery secrets through to the reusable QA workflow. |
| .github/workflows/Pull Request.yml | Passes BigQuery secrets through to the reusable QA workflow. |
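
The environment gating mentioned for conftest.py could look roughly like the helper below. It is a sketch under assumptions: the variable names are the ones quoted earlier in this thread, while the helper name `missing_bigquery_env` and the idea of skipping on a non-empty result are hypothetical.

```python
import os

# Environment variables named earlier in this PR discussion.
REQUIRED_ENV_VARS = (
    "BIGQUERY_TEST_PROJECT_ID",
    "BIGQUERY_TEST_CREDENTIALS_PATH",
)


def missing_bigquery_env(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

A fixture could call `pytest.skip(...)` when the returned list is non-empty, so live tests only run where credentials are configured.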
Comments suppressed due to low confidence (1)

src/metaxy/ext/metadata_stores/bigquery.py:133

  • When connection_params is provided alongside explicit project_id/dataset_id, self.project_id/self.dataset_id are derived from the explicit args (due to the or), but the actual backend connection uses the unmodified connection_params. If they differ, display()/sqlalchemy_url can report a different project/dataset than the one actually used for connections. Fix by either (a) enforcing consistency (raise ValueError if both are provided and mismatch) or (b) overriding connection_params['project_id'] / connection_params['dataset_id'] with the resolved values before calling super().__init__.
        super().__init__(
            backend="bigquery",
            connection_params=connection_params,
            fallback_stores=fallback_stores,
            **kwargs,
        )
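
Option (a) from the comment above, enforcing consistency, might be sketched as follows. The helper name `resolve_bigquery_id` is hypothetical and this is not the change that landed in the PR; it only illustrates raising on a mismatch before the values reach `super().__init__`.

```python
def resolve_bigquery_id(key, explicit, connection_params):
    """Resolve one identifier, rejecting conflicting sources.

    `explicit` is the keyword argument (e.g. project_id); the fallback
    is the entry under `key` in connection_params, if any.
    """
    from_params = (connection_params or {}).get(key)
    if explicit and from_params and explicit != from_params:
        raise ValueError(
            f"{key} mismatch: explicit {explicit!r} "
            f"vs connection_params {from_params!r}"
        )
    # Same precedence as the `or` described in the review comment.
    return explicit or from_params
```

Called once for `project_id` and once for `dataset_id`, this would keep the reported values and the ones used for the connection in agreement.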


Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated
Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated
Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated
Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated
Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated
Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated
@geoHeil geoHeil force-pushed the feat/example-bigquery branch from b65a7fc to 633beec Compare March 30, 2026 19:31
@geoHeil geoHeil requested a review from Copilot March 30, 2026 19:31

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.



Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated
Comment thread tests/ext/bigquery/conftest.py Outdated
Comment thread tests/ext/bigquery/conftest.py Outdated
Comment thread .github/workflows/_qa.yml
Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.



Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated
Comment thread tests/ext/bigquery/conftest.py
Comment thread src/metaxy/ext/metadata_stores/bigquery.py Outdated
@geoHeil geoHeil force-pushed the feat/example-bigquery branch 2 times, most recently from d592795 to 1387fb6 Compare March 30, 2026 22:08
@geoHeil geoHeil changed the base branch from main to graphite-base/1072 April 4, 2026 15:49
@geoHeil geoHeil force-pushed the feat/example-bigquery branch from 1387fb6 to c65bb76 Compare April 4, 2026 15:49
@geoHeil geoHeil changed the base branch from graphite-base/1072 to 04-04-fix_update_devenv April 4, 2026 15:49
@geoHeil geoHeil force-pushed the feat/example-bigquery branch from c65bb76 to 6933f01 Compare April 5, 2026 07:50
@graphite-app graphite-app Bot changed the base branch from 04-04-fix_update_devenv to graphite-base/1072 April 8, 2026 10:32
@graphite-app graphite-app Bot force-pushed the feat/example-bigquery branch from 6933f01 to bcc7220 Compare April 8, 2026 10:39
@graphite-app graphite-app Bot force-pushed the graphite-base/1072 branch from 42491d9 to 77b70bf Compare April 8, 2026 10:39
@graphite-app graphite-app Bot changed the base branch from graphite-base/1072 to main April 8, 2026 10:39
@graphite-app graphite-app Bot force-pushed the feat/example-bigquery branch from bcc7220 to 43bc777 Compare April 8, 2026 10:39
Production-grade BigQuery metadata store example with SQLModel
features and dual-track Alembic migrations (system + feature tables).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@geoHeil geoHeil force-pushed the feat/example-bigquery branch from 43bc777 to 3f037cf Compare April 10, 2026 11:31
3 participants