8 changes: 8 additions & 0 deletions .circleci/wait-for-db.sh
@@ -59,6 +59,14 @@ risingwave_ready() {
probe_port 4566
}

gizmosql_ready() {
    # GizmoSQL listens on port 31337 for Flight SQL connections
    probe_port 31337
    # The port can open before the server has finished initializing,
    # so wait a few more seconds after it becomes available
    sleep 3
}

echo "Waiting for $ENGINE to be ready..."

READINESS_FUNC="${ENGINE}_ready"
3 changes: 3 additions & 0 deletions Makefile
@@ -208,6 +208,9 @@ trino-test: engine-trino-up
risingwave-test: engine-risingwave-up
	pytest -n auto -m "risingwave" --reruns 3 --junitxml=test-results/junit-risingwave.xml

gizmosql-test: engine-gizmosql-up
	pytest -n auto -m "gizmosql" --reruns 3 --junitxml=test-results/junit-gizmosql.xml

#################
# Cloud Engines #
#################
144 changes: 144 additions & 0 deletions docs/integrations/engines/gizmosql.md
@@ -0,0 +1,144 @@
# GizmoSQL

This page provides information about how to use SQLMesh with the [GizmoSQL](https://github.com/gizmodata/gizmosql) database server.

!!! info
    The GizmoSQL engine adapter is a community contribution. As a result, only limited community support is available.

## Overview

GizmoSQL is a database server that uses [DuckDB](./duckdb.md) as its execution engine and exposes an [Apache Arrow Flight SQL](https://arrow.apache.org/docs/format/FlightSql.html) interface for remote connections. This allows you to connect to a GizmoSQL server from anywhere on your network while still benefiting from DuckDB's fast analytical query processing.

The SQLMesh GizmoSQL adapter uses [ADBC (Arrow Database Connectivity)](https://arrow.apache.org/docs/format/ADBC.html) with the Flight SQL driver to communicate with GizmoSQL servers. Data is transferred using the efficient Apache Arrow columnar format.

!!! note
    This adapter only supports the DuckDB backend for GizmoSQL. Attempting to connect to a GizmoSQL server running a different backend will result in an error.
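
The backend check behind that error can be sketched roughly as follows. This is a minimal illustration, assuming the server reports a vendor version string that begins with `duckdb ` when DuckDB is the backend (the prefix test used in this PR's connection factory). The helper name `is_duckdb_backend` and the sample version strings are hypothetical.

```python
import re


def is_duckdb_backend(vendor_version: str) -> bool:
    """Return True when the reported vendor version starts with 'duckdb '.

    Hypothetical helper: it mirrors the prefix test applied to the
    `vendor_version` value returned by the ADBC connection info.
    """
    return re.search(r"^duckdb ", vendor_version) is not None


# Illustrative inputs only; real servers may format the string differently.
print(is_duckdb_backend("duckdb v1.0.0"))  # DuckDB backend
print(is_duckdb_backend("sqlite 3.45.0"))  # any other backend
```

An empty or missing version string fails the check as well, so a server that reports no vendor version is treated as unsupported.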

## Local/Built-in Scheduler

**Engine Adapter Type**: `gizmosql`

### Installation

```bash
pip install "sqlmesh[gizmosql]"
```

This will install the required dependencies:

- `adbc-driver-flightsql` - The ADBC driver for Arrow Flight SQL
- `pyarrow` - Apache Arrow Python bindings
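
One way to confirm that the extra installed correctly is to check that both packages are importable. The snippet below is a small sketch, not part of SQLMesh: `missing_modules` is a hypothetical helper, and it assumes the import names are `adbc_driver_flightsql` and `pyarrow`.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Import names assumed for the two packages listed above.
REQUIRED_MODULES = ("adbc_driver_flightsql", "pyarrow")


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Example usage:
#   missing = missing_modules(REQUIRED_MODULES)
#   if missing:
#       print("Missing modules:", ", ".join(missing))
```

If the returned list is not empty, re-running the `pip install` command above in the active environment is the usual fix.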

## Connection options

| Option | Description | Type | Required |
|------------------------------------|-------------------------------------------------------------------------------|:-------:|:--------:|
| `type` | Engine type name - must be `gizmosql` | string | Y |
| `host` | The hostname of the GizmoSQL server (default: `localhost`) | string | N |
| `port` | The port number of the GizmoSQL server (default: `31337`) | int | N |
| `username` | The username for authentication with the GizmoSQL server | string | Y |
| `password` | The password for authentication with the GizmoSQL server | string | Y |
| `use_encryption` | Whether to use TLS encryption for the connection (default: `true`) | bool | N |
| `disable_certificate_verification`| Skip TLS certificate verification - useful for self-signed certs (default: `false`) | bool | N |
| `database` | The default database/catalog to use | string | N |
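
As a rough illustration of how these options combine, the sketch below builds the Flight SQL URI and driver options in the same way this PR's connection factory does. `build_flight_sql_target` is a hypothetical helper, and the option key string is assumed to match `adbc_driver_flightsql.DatabaseOptions.TLS_SKIP_VERIFY.value`; verify it against your installed driver version.

```python
from typing import Dict, Tuple

# Assumption: this string equals DatabaseOptions.TLS_SKIP_VERIFY.value
# in the adbc-driver-flightsql package.
TLS_SKIP_VERIFY_KEY = "adbc.flight.sql.client_option.tls_skip_verify"


def build_flight_sql_target(
    username: str,
    password: str,
    host: str = "localhost",
    port: int = 31337,
    use_encryption: bool = True,
    disable_certificate_verification: bool = False,
) -> Tuple[str, Dict[str, str]]:
    """Return the (uri, db_kwargs) pair implied by the connection options."""
    # TLS selects the grpc+tls scheme; plain gRPC is used otherwise.
    protocol = "grpc+tls" if use_encryption else "grpc"
    uri = f"{protocol}://{host}:{port}"

    db_kwargs = {"username": username, "password": password}
    # Skipping verification only makes sense when TLS is enabled.
    if use_encryption and disable_certificate_verification:
        db_kwargs[TLS_SKIP_VERIFY_KEY] = "true"
    return uri, db_kwargs


uri, db_kwargs = build_flight_sql_target("my_user", "my_password", host="gizmosql.example.com")
print(uri)
```

Note that `disable_certificate_verification` has no effect when `use_encryption` is `false`, because there is no certificate to verify on an unencrypted connection.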

### Example configuration

=== "YAML"

    ```yaml linenums="1"
    gateways:
      gizmosql:
        connection:
          type: gizmosql
          host: gizmosql.example.com
          port: 31337
          username: my_user
          password: my_password
          use_encryption: true
          disable_certificate_verification: false
    ```

=== "Python"

    ```python linenums="1"
    from sqlmesh.core.config import (
        Config,
        GatewayConfig,
        ModelDefaultsConfig,
    )
    from sqlmesh.core.config.connection import GizmoSQLConnectionConfig

    config = Config(
        model_defaults=ModelDefaultsConfig(dialect="duckdb"),
        gateways={
            "gizmosql": GatewayConfig(
                connection=GizmoSQLConnectionConfig(
                    host="gizmosql.example.com",
                    port=31337,
                    username="my_user",
                    password="my_password",
                    use_encryption=True,
                    disable_certificate_verification=False,
                ),
            ),
        },
    )
    ```

## SQL Dialect

GizmoSQL uses the DuckDB SQL dialect. When writing models for GizmoSQL, set your model dialect to `duckdb`:

```yaml
model_defaults:
  dialect: duckdb
```

Or specify the dialect in individual model definitions:

```sql
MODEL (
  name my_schema.my_model,
  dialect duckdb
);

SELECT * FROM my_table;
```

## Docker Setup

For local development and testing, you can run GizmoSQL using Docker:

```bash
docker run -d \
  --name gizmosql \
  -p 31337:31337 \
  -e GIZMOSQL_USERNAME=gizmosql_user \
  -e GIZMOSQL_PASSWORD=gizmosql_password \
  -e TLS_ENABLED=1 \
  gizmodata/gizmosql:latest
```

Then connect with:

```yaml
gateways:
  gizmosql:
    connection:
      type: gizmosql
      host: localhost
      port: 31337
      username: gizmosql_user
      password: gizmosql_password
      use_encryption: true
      disable_certificate_verification: true  # For self-signed certs
```
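
The container may take a moment to start accepting connections. A minimal sketch of a readiness probe is shown below; it assumes only that the server listens on the mapped port, and `wait_for_port` is a hypothetical helper rather than part of SQLMesh or GizmoSQL.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); pause before retrying.
            time.sleep(interval)
    return False


# Example usage against the container started above:
#   if not wait_for_port("localhost", 31337):
#       raise SystemExit("GizmoSQL did not open port 31337 in time")
```

An open port does not guarantee that the server has finished initializing, so a short extra delay or a retried first query may still be needed.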

## Related Integrations

GizmoSQL has adapters available for other popular data tools:

- [Ibis GizmoSQL](https://pypi.org/project/ibis-gizmosql/) - Ibis backend for GizmoSQL
- [dbt-gizmosql](https://pypi.org/search/?q=dbt-gizmosql) - dbt adapter for GizmoSQL
- [SQLFrame GizmoSQL](https://github.com/gizmodata/sqlframe) - SQLFrame (PySpark-like API) support for GizmoSQL
1 change: 1 addition & 0 deletions docs/integrations/overview.md
@@ -23,6 +23,7 @@ SQLMesh supports the following execution engines for running SQLMesh projects (e
* [MySQL](./engines/mysql.md) (mysql)
* [Postgres](./engines/postgres.md) (postgres)
* [GCP Postgres](./engines/gcp-postgres.md) (gcppostgres)
* [GizmoSQL](./engines/gizmosql.md) (gizmosql)
* [Redshift](./engines/redshift.md) (redshift)
* [Snowflake](./engines/snowflake.md) (snowflake)
* [Spark](./engines/spark.md) (spark)
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -89,6 +89,7 @@ nav:
- integrations/engines/mysql.md
- integrations/engines/postgres.md
- integrations/engines/gcp-postgres.md
- integrations/engines/gizmosql.md
- integrations/engines/redshift.md
- integrations/engines/risingwave.md
- integrations/engines/snowflake.md
2 changes: 2 additions & 0 deletions pyproject.toml
@@ -144,6 +144,7 @@ lsp = [
"lsprotocol",
]
risingwave = ["psycopg2"]
gizmosql = ["adbc-driver-flightsql", "pyarrow"]

[project.scripts]
sqlmesh = "sqlmesh.cli.main:cli"
@@ -271,6 +272,7 @@ markers = [
"pyspark: test for PySpark that need to run separately from the other spark tests",
"trino: test for Trino (all connectors)",
"risingwave: test for Risingwave",
"gizmosql: test for GizmoSQL",

# Other
"set_default_connection",
98 changes: 98 additions & 0 deletions sqlmesh/core/config/connection.py
@@ -2326,6 +2326,104 @@ def init(cursor: t.Any) -> None:
return init


class GizmoSQLConnectionConfig(ConnectionConfig):
    """
    GizmoSQL connection configuration.

    GizmoSQL is a database server that uses DuckDB as its execution engine and
    exposes an Arrow Flight SQL interface for remote connections. This configuration
    uses ADBC (Arrow Database Connectivity) with the Flight SQL driver.

    Args:
        host: The hostname of the GizmoSQL server.
        port: The port of the GizmoSQL server (default: 31337).
        username: The username for authentication.
        password: The password for authentication.
        use_encryption: Whether to use TLS encryption (default: True).
        disable_certificate_verification: Whether to skip TLS certificate verification.
            Useful for self-signed certificates in development (default: False).
        database: The default database/catalog to use.
        concurrent_tasks: The maximum number of concurrent tasks.
        register_comments: Whether to register model comments.
        pre_ping: Whether to pre-ping the connection.
    """

    host: str = "localhost"
    port: int = 31337
    username: str
    password: str
    use_encryption: bool = True
    disable_certificate_verification: bool = False
    database: t.Optional[str] = None

    concurrent_tasks: int = 4
    register_comments: bool = True
    pre_ping: bool = False

    type_: t.Literal["gizmosql"] = Field(alias="type", default="gizmosql")
    DIALECT: t.ClassVar[t.Literal["duckdb"]] = "duckdb"
    DISPLAY_NAME: t.ClassVar[t.Literal["GizmoSQL"]] = "GizmoSQL"
    DISPLAY_ORDER: t.ClassVar[t.Literal[17]] = 17

    _engine_import_validator = _get_engine_import_validator(
        "adbc_driver_flightsql", "gizmosql", extra_name="gizmosql"
    )

    @property
    def _connection_kwargs_keys(self) -> t.Set[str]:
        # ADBC uses a different connection pattern, so we don't pass these directly
        return set()

    @property
    def _engine_adapter(self) -> t.Type[EngineAdapter]:
        return engine_adapter.GizmoSQLEngineAdapter

    @property
    def _connection_factory(self) -> t.Callable:
        """
        Create a connection factory for GizmoSQL using ADBC Flight SQL driver.

        The connection is established using the Arrow Flight SQL protocol over gRPC.
        """
        import re
        from adbc_driver_flightsql import dbapi as flightsql, DatabaseOptions

        def connect() -> t.Any:
            # Build the URI for the Flight SQL connection
            protocol = "grpc+tls" if self.use_encryption else "grpc"
            uri = f"{protocol}://{self.host}:{self.port}"

            # ADBC database-level options (passed to the driver)
            db_kwargs: t.Dict[str, str] = {
                "username": self.username,
                "password": self.password,
            }

            # Add TLS skip verify option using the proper DatabaseOptions enum
            if self.use_encryption and self.disable_certificate_verification:
                db_kwargs[DatabaseOptions.TLS_SKIP_VERIFY.value] = "true"

            # Create the connection - uri is first positional arg, db_kwargs is for driver options
            # Explicit autocommit=True since GizmoSQL doesn't support manual transaction commits
            conn = flightsql.connect(uri, db_kwargs=db_kwargs, autocommit=True)

            # Verify the backend is DuckDB - this adapter only supports the DuckDB backend
            vendor_version = conn.adbc_get_info().get("vendor_version", "")
            if not re.search(pattern=r"^duckdb ", string=vendor_version):
                conn.close()
                raise ConfigError(
                    f"Unsupported GizmoSQL server backend: '{vendor_version}'. "
                    "This adapter only supports the DuckDB backend for GizmoSQL."
                )

            return conn

        return connect

    def get_catalog(self) -> t.Optional[str]:
        return self.database


CONNECTION_CONFIG_TO_TYPE = {
# Map all subclasses of ConnectionConfig to the value of their `type_` field.
tpe.all_field_infos()["type_"].default: tpe
2 changes: 2 additions & 0 deletions sqlmesh/core/engine_adapter/__init__.py
@@ -20,6 +20,7 @@
from sqlmesh.core.engine_adapter.athena import AthenaEngineAdapter
from sqlmesh.core.engine_adapter.risingwave import RisingwaveEngineAdapter
from sqlmesh.core.engine_adapter.fabric import FabricEngineAdapter
from sqlmesh.core.engine_adapter.gizmosql import GizmoSQLEngineAdapter

DIALECT_TO_ENGINE_ADAPTER = {
"hive": SparkEngineAdapter,
@@ -37,6 +38,7 @@
"athena": AthenaEngineAdapter,
"risingwave": RisingwaveEngineAdapter,
"fabric": FabricEngineAdapter,
"gizmosql": GizmoSQLEngineAdapter,
}

DIALECT_ALIASES = {