
Commit ba33d13

Feat: Add StarRocks engine support
### What

- **Add StarRocks engine support to SQLMesh** via StarRocks' MySQL-compatible protocol.
- Ship **engine adapter + docs + real integration tests** to ensure generated SQL works on StarRocks.

### Why

- **User demand / adoption**: StarRocks is a common OLAP choice; SQLMesh users want to run the same model lifecycle (build, incremental maintenance, views/MVs) on StarRocks without bespoke SQL.
- **Engine-specific semantics**: StarRocks differs from vanilla MySQL in DDL/DML constraints (e.g., key types, delete behavior, rename caveats). An adapter is needed to produce correct and predictable SQL.
- **Confidence & maintainability**: Documenting config patterns and codifying behavior with integration tests prevents regressions and makes support "real" (not just "it parses").

### Scope (what's supported)

- **Connectivity**: Connect through the MySQL protocol (e.g., `pymysql`).
- **Table creation / DDL**:
    - Key table types via `physical_properties`: **DUPLICATE KEY (default)**, **PRIMARY KEY (recommended for incremental)**, **UNIQUE KEY**
    - **Partitioning**: simple `partitioned_by` and advanced `partition_by` (complex expression partitioning), plus optional initial `partitions`
    - **Distribution**: `distributed_by` structured form or string fallback (HASH / RANDOM; buckets required)
    - **Ordering**: `order_by` / `clustered_by`
    - **Generic PROPERTIES passthrough** (string key/value)
- **Views**:
    - Regular views
    - **Materialized views** via `kind VIEW(materialized true)`, with StarRocks-specific notes/constraints
- **DML / maintenance**:
    - Insert/select/update basics
    - Delete behavior handled with StarRocks compatibility constraints (PRIMARY KEY tables recommended for robust deletes)

### Changes

- **Engine adapter**: `sqlmesh/core/engine_adapter/starrocks.py`
- **Docs**: `docs/integrations/engines/starrocks.md`
- **Integration tests**: `tests/core/engine_adapter/integration/test_integration_starrocks.py` and `tests/core/engine_adapter/test_starrocks.py`

### Verification

- **Integration tests require a running StarRocks instance.**
- Ran:
    - set `STARROCKS_HOST/PORT/USER/PASSWORD`
    - `pytest -m "starrocks and docker" tests/core/engine_adapter/integration/test_integration_starrocks.py`

### Known limitations / caveats

- **No sync MV support (currently)**
- **No tuple IN**: `(c1, c2) IN ((v1, v2), ...)`
- **No `SELECT ... FOR UPDATE`**
- **RENAME caveat**: the rename target can't be qualified with a database name

### Notes on compatibility

- **Changes are StarRocks-scoped** (adapter/docs/tests) and should not impact other engines.

Signed-off-by: jaogoy <jaogoy@gmail.com>
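The DDL knobs listed under Scope can be combined in a single model definition. A minimal sketch follows; the model name, columns, and exact property value syntax are hypothetical — only the property names (`physical_properties`, `partitioned_by`, `kind VIEW(materialized true)`, the string-fallback `distributed_by` form) come from the commit description:

```sql
-- Sketch only: demo.* names and value syntax are illustrative, not from this commit.
MODEL (
  name demo.orders,                               -- hypothetical schema/table
  kind INCREMENTAL_BY_TIME_RANGE (time_column order_dt),
  partitioned_by order_dt,                        -- simple partitioning form
  physical_properties (
    primary_key = order_id,                       -- PRIMARY KEY table: recommended for incremental
    distributed_by = 'HASH(order_id) BUCKETS 8',  -- string fallback; buckets are required
    replication_num = '1'                         -- generic PROPERTIES passthrough (string key/value)
  )
);

SELECT order_id, order_dt, amount
FROM demo.raw_orders                              -- hypothetical source table
WHERE order_dt BETWEEN @start_ds AND @end_ds;

-- Materialized view, per the commit's `kind VIEW(materialized true)`:
MODEL (
  name demo.daily_totals,
  kind VIEW (materialized true)
);

SELECT order_dt, SUM(amount) AS total
FROM demo.orders
GROUP BY order_dt;
```

Note that per the caveats above, incremental maintenance that relies on deletes is most robust on PRIMARY KEY tables, which is why the sketch pairs the incremental kind with `primary_key`.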
1 parent 529ed00 commit ba33d13

File tree

22 files changed: +8566 / -7 lines

.circleci/continue_config.yml

Lines changed: 1 addition & 0 deletions
```diff
@@ -304,6 +304,7 @@ workflows:
             - spark
             - clickhouse
             - risingwave
+            - starrocks
       - engine_tests_cloud:
           name: cloud_engine_<< matrix.engine >>
           context:
```

.circleci/wait-for-db.sh

Lines changed: 28 additions & 0 deletions
```diff
@@ -50,6 +50,34 @@ spark_ready() {
     probe_port 15002
 }
 
+starrocks_ready() {
+    probe_port 9030
+
+    echo "Checking for 1 alive StarRocks backends..."
+    sleep 5
+
+    while true; do
+        echo "Checking StarRocks backends..."
+        ALIVE_BACKENDS=$(docker exec -i starrocks-fe mysql -h127.0.0.1 -P9030 -uroot -e "show backends \G" | grep -c "^ *Alive: true *$")
+
+        # fallback value if failed to get number
+        if ! [[ "$ALIVE_BACKENDS" =~ ^[0-9]+$ ]]; then
+            echo "WARN: Unable to parse number of alive backends, got: '$ALIVE_BACKENDS'"
+            ALIVE_BACKENDS=0
+        fi
+
+        echo "Found $ALIVE_BACKENDS alive backends"
+
+        if [ "$ALIVE_BACKENDS" -ge 1 ]; then
+            echo "StarRocks has 1 or more alive backends"
+            break
+        fi
+
+        echo "Waiting for more backends to become alive..."
+        sleep 5
+    done
+}
+
 trino_ready() {
     # Trino has a built-in healthcheck script, just call that
     docker compose -f tests/core/engine_adapter/integration/docker/compose.trino.yaml exec trino /bin/bash -c '/usr/lib/trino/bin/health-check'
```
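The numeric-sanity fallback in `starrocks_ready` above can be exercised on its own. This standalone sketch (function and sample inputs are hypothetical) uses the same regex-and-fallback pattern to normalize whatever the `docker exec ... grep -c` pipeline returns:

```shell
#!/usr/bin/env bash
# Standalone sketch of the alive-backend count validation from starrocks_ready.
# parse_alive echoes the count unchanged when it is a non-negative integer,
# and falls back to 0 (with a warning on stderr) for anything else.
parse_alive() {
    local count="$1"
    if ! [[ "$count" =~ ^[0-9]+$ ]]; then
        echo "WARN: Unable to parse number of alive backends, got: '$count'" >&2
        count=0
    fi
    echo "$count"
}

parse_alive 3           # a valid count passes through
parse_alive ""          # empty output from a failed docker exec falls back to 0
parse_alive "error"     # garbage output falls back to 0
```

The fallback matters because a transient `docker exec` failure would otherwise feed a non-numeric string into the `[ "$ALIVE_BACKENDS" -ge 1 ]` test and abort the wait loop with a shell error instead of retrying.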

.readthedocs.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -6,7 +6,7 @@ build:
   python: "3.10"
   jobs:
     pre_build:
-      - pip install -e ".[athena,azuresql,bigframes,bigquery,clickhouse,databricks,dbt,dlt,gcppostgres,github,llm,mssql,mysql,mwaa,postgres,redshift,slack,snowflake,trino,web,risingwave]"
+      - pip install -e ".[athena,azuresql,bigframes,bigquery,clickhouse,databricks,dbt,dlt,gcppostgres,github,llm,mssql,mysql,mwaa,postgres,redshift,slack,snowflake,starrocks,trino,web,risingwave]"
       - make api-docs
 
 mkdocs:
```

Makefile

Lines changed: 3 additions & 0 deletions
```diff
@@ -208,6 +208,9 @@ trino-test: engine-trino-up
 risingwave-test: engine-risingwave-up
	pytest -n auto -m "risingwave" --reruns 3 --junitxml=test-results/junit-risingwave.xml
 
+starrocks-test: engine-starrocks-up
+	pytest -n auto -m "starrocks" --reruns 3 --junitxml=test-results/junit-starrocks.xml
+
 #################
 # Cloud Engines #
 #################
```

docs/guides/configuration.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -920,6 +920,7 @@ These pages describe the connection configuration options for each execution eng
 * [GCP Postgres](../integrations/engines/gcp-postgres.md)
 * [Redshift](../integrations/engines/redshift.md)
 * [Snowflake](../integrations/engines/snowflake.md)
+* [StarRocks](../integrations/engines/starrocks.md)
 * [Spark](../integrations/engines/spark.md)
 * [Trino](../integrations/engines/trino.md)
 
@@ -952,6 +953,7 @@ Unsupported state engines, even for development:
 
 * [ClickHouse](../integrations/engines/clickhouse.md)
 * [Spark](../integrations/engines/spark.md)
+* [StarRocks](../integrations/engines/starrocks.md)
 * [Trino](../integrations/engines/trino.md)
 
 This example gateway configuration uses Snowflake for the data warehouse connection and Postgres for the state backend connection:
```

docs/guides/connections.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -90,4 +90,5 @@ default_gateway: local_db
 * [Redshift](../integrations/engines/redshift.md)
 * [Snowflake](../integrations/engines/snowflake.md)
 * [Spark](../integrations/engines/spark.md)
+* [StarRocks](../integrations/engines/starrocks.md)
 * [Trino](../integrations/engines/trino.md)
```
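The newly linked `docs/integrations/engines/starrocks.md` page carries the full connection reference; as orientation only, a gateway entry might look like the following sketch. The `type: starrocks` key and every field value here are assumptions based on the commit's MySQL-protocol description (port 9030 is the FE query port probed in `wait-for-db.sh`), not copied from the shipped docs page:

```yaml
# Hypothetical gateway entry; `type: starrocks` and all values are assumed.
gateways:
  starrocks:
    connection:
      type: starrocks
      host: localhost
      port: 9030          # FE MySQL-protocol port (the one wait-for-db.sh probes)
      user: root
      password: ""
      # Per the configuration.md change above, StarRocks is NOT a supported
      # state engine, so point state_connection at a different database.

default_gateway: starrocks
```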
