@jaogoy jaogoy commented Jan 13, 2026

What

  • Add StarRocks engine support to SQLMesh via StarRocks’ MySQL-compatible protocol.
  • Ship engine adapter + docs + real integration tests to ensure generated SQL works on StarRocks.

Why

  • User demand / adoption: StarRocks is a common OLAP choice; SQLMesh users want to run the same model lifecycle (build, incremental maintenance, views/MVs) on StarRocks without bespoke SQL.
  • Engine-specific semantics: StarRocks differs from vanilla MySQL in DDL/DML constraints (e.g., key types, delete behavior, rename caveats). An adapter is needed to produce correct and predictable SQL.
  • Confidence & maintainability: Documenting config patterns + codifying behavior with integration tests prevents regressions and makes support “real” (not just “it parses”).

Scope (what’s supported)

  • Connectivity: Connect through MySQL protocol (e.g., pymysql).
  • Table creation / DDL:
    • Key table types via physical_properties: DUPLICATE KEY (default), PRIMARY KEY (recommended for incremental), UNIQUE KEY
    • Partitioning: simple partitioned_by and advanced partition_by (complex expression partitioning) + optional initial partitions
    • Distribution: distributed_by structured form or string fallback (HASH / RANDOM; buckets required)
    • Ordering: order_by / clustered_by
    • Generic PROPERTIES passthrough (string key/value)
  • Views:
    • Regular views
    • Materialized views via kind VIEW(materialized true) with StarRocks-specific notes/constraints
  • DML / maintenance:
    • Insert/select/update basics
    • Delete behavior handled with StarRocks compatibility constraints (PRIMARY KEY tables recommended for robust deletes)
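
To make the table options above more concrete, here is a minimal sketch of how the key type, partitioning, distribution, ordering, and PROPERTIES clauses line up in a StarRocks `CREATE TABLE` statement. It is not the adapter's code: `render_create_table`, its parameters, and the `orders` table are hypothetical names used only for illustration, and the sketch does no identifier quoting or validation beyond the key type.

```python
from typing import Mapping, Optional, Sequence

# Key table types named in the scope list above.
KEY_TYPES = {"DUPLICATE KEY", "PRIMARY KEY", "UNIQUE KEY"}


def render_create_table(
    table: str,
    columns: Mapping[str, str],
    key_type: str = "DUPLICATE KEY",
    key_columns: Sequence[str] = (),
    partition_expr: Optional[str] = None,
    distributed_by: Optional[str] = None,
    order_by: Sequence[str] = (),
    properties: Optional[Mapping[str, str]] = None,
) -> str:
    """Assemble a StarRocks-style CREATE TABLE statement (illustration only)."""
    if key_type not in KEY_TYPES:
        raise ValueError(f"unsupported key type: {key_type}")

    column_defs = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    clauses = [f"CREATE TABLE {table} ({column_defs})"]
    if key_columns:
        clauses.append(f"{key_type} ({', '.join(key_columns)})")
    if partition_expr:
        clauses.append(f"PARTITION BY {partition_expr}")
    if distributed_by:
        clauses.append(f"DISTRIBUTED BY {distributed_by}")
    if order_by:
        clauses.append(f"ORDER BY ({', '.join(order_by)})")
    if properties:
        # PROPERTIES are passed through as quoted string key/value pairs.
        props = ", ".join(f'"{k}" = "{v}"' for k, v in properties.items())
        clauses.append(f"PROPERTIES ({props})")
    return "\n".join(clauses)


ddl = render_create_table(
    table="orders",
    columns={"id": "BIGINT NOT NULL", "dt": "DATE NOT NULL", "amount": "DECIMAL(10, 2)"},
    key_type="PRIMARY KEY",
    key_columns=["id", "dt"],
    partition_expr="date_trunc('day', dt)",
    distributed_by="HASH(id) BUCKETS 8",
    order_by=["dt"],
    properties={"replication_num": "1"},
)
print(ddl)
```

The clause order (key, partition, distribution, order, properties) follows StarRocks' documented `CREATE TABLE` grammar; how the adapter maps `physical_properties` onto these clauses is defined in `sqlmesh/core/engine_adapter/starrocks.py`, not here.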

Changes

  • Engine adapter: sqlmesh/core/engine_adapter/starrocks.py
  • Docs: docs/integrations/engines/starrocks.md
  • Integration tests: tests/core/engine_adapter/integration/test_integration_starrocks.py, and tests/core/engine_adapter/test_starrocks.py

Verification

  • Integration tests require a running StarRocks instance.
  • Ran:
    • set STARROCKS_HOST/PORT/USER/PASSWORD
    • pytest -m "starrocks and docker" tests/core/engine_adapter/integration/test_integration_starrocks.py
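
As a rough sketch of the environment setup above, the snippet below collects connection settings from the environment. It assumes the shorthand expands to `STARROCKS_HOST`, `STARROCKS_PORT`, `STARROCKS_USER`, and `STARROCKS_PASSWORD`; the helper name and the fallback values (StarRocks' default FE query port 9030 and the default `root` user with an empty password) are assumptions, not taken from the test suite.

```python
import os
from typing import Dict, Mapping, Union


def starrocks_connection_kwargs(
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Union[str, int]]:
    """Build keyword arguments for a MySQL-protocol client from the environment."""
    return {
        "host": env.get("STARROCKS_HOST", "127.0.0.1"),
        # 9030 is the default MySQL-protocol (query) port of a StarRocks FE node.
        "port": int(env.get("STARROCKS_PORT", "9030")),
        "user": env.get("STARROCKS_USER", "root"),
        "password": env.get("STARROCKS_PASSWORD", ""),
    }


kwargs = starrocks_connection_kwargs()
# With pymysql installed, these can be passed on as: pymysql.connect(**kwargs)
print({k: v for k, v in kwargs.items() if k != "password"})
```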

Acknowledgement

This implementation was largely inspired by #5033 — thanks to @xinge-ji for the solid groundwork.

### Known limitations / caveats

- **No sync MV support (currently)**
- **No tuple IN**: `(c1, c2) IN ((v1, v2), ...)`
- **No `SELECT ... FOR UPDATE`**
- **RENAME caveat**: rename target can’t be qualified with a database
name
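
One common way to work around the missing tuple `IN` is to expand it into an `OR` of per-row `AND` equality groups. The sketch below shows that rewrite as plain string building; it is not necessarily how the adapter handles it, the helper names are hypothetical, and real code should prefer parameter binding or a SQL AST library over hand-built literals.

```python
from typing import Sequence


def sql_literal(value: object) -> str:
    """Render a number, boolean, or string as a SQL literal (illustration only)."""
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    # Escape backslashes and single quotes for MySQL-style string literals.
    escaped = str(value).replace("\\", "\\\\").replace("'", "''")
    return f"'{escaped}'"


def expand_tuple_in(columns: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Rewrite (c1, c2) IN ((v1, v2), ...) as an OR of AND-ed equality groups."""
    if not rows:
        return "FALSE"  # an empty IN list matches nothing
    groups = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length must match the number of columns")
        pairs = " AND ".join(f"{col} = {sql_literal(val)}" for col, val in zip(columns, row))
        groups.append(f"({pairs})")
    # The outer parentheses keep the predicate safe to combine with other conditions.
    return "(" + " OR ".join(groups) + ")"


predicate = expand_tuple_in(["c1", "c2"], [(1, "a"), (2, "b")])
print(f"DELETE FROM t WHERE {predicate}")
```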

### Notes on compatibility

- **Changes are StarRocks-scoped** (adapter/docs/tests) and should not
impact other engines.

Signed-off-by: jaogoy <[email protected]>

CLAassistant commented Jan 13, 2026

CLA assistant check
All committers have signed the CLA.

jaogoy commented Jan 13, 2026

@erindru Hi Erin, would you mind reviewing this PR? It is similar to #5033, but adds StarRocks support to SQLMesh.

I'll be very glad to see your comments.

I'm working on fixing the CI failures and some test cases.
Also, a question: regarding the dependency on tobymao/sqlglot#6737, do I need to bump the required sqlglot version once that PR is merged?
