Commit f3be644

feat: spanner support
- Added integration tests for CRUD operations in Spanner, including select, insert, update, and delete functionality.
- Enhanced unit tests for SpannerSyncConfig to validate connection handling and session management.
- Expanded dialect tests for Spanner to cover SQL parsing scenarios including DDL, DML, and advanced queries.
- Introduced unit tests for SpannerSyncStore to ensure transaction usage during set and delete operations.
- Created new tests for the Spangres dialect to validate row deletion policies and proper SQL rendering.
- Updated existing tests to reflect changes in TTL and interleave syntax, ensuring compatibility with Spanner's features.
1 parent 069a248 commit f3be644

File tree

18 files changed (+1909, −150 lines)


docs/guides/adapters/spanner.md

Lines changed: 284 additions & 0 deletions
@@ -0,0 +1,284 @@
---
orphan: true
---

# Google Cloud Spanner Adapter Guide

This guide provides specific instructions for the `spanner` adapter.

## Key Information

- **Driver:** `google-cloud-spanner`
- **Parameter Style:** `named` with `@` prefix (e.g., `@name`)
- **Dialect:** `spanner` (custom dialect extending BigQuery)
- **Transactional DDL:** Not supported (DDL uses separate admin operations)

## Parameter Profile

- **Registry Key:** `"spanner"`
- **JSON Strategy:** `helper`
- **Default Style:** `NAMED_AT` (parameters prefixed with `@`)

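In practice, the `NAMED_AT` style means each `@name` placeholder in a statement pairs with a key in the parameter dictionary. A minimal standalone check of that pairing (the `extract_named_at_params` helper below is illustrative, not part of sqlspec):

```python
import re

def extract_named_at_params(sql: str) -> set[str]:
    """Collect @-prefixed parameter names from a GoogleSQL statement."""
    return set(re.findall(r"@(\w+)", sql))

sql = "SELECT * FROM users WHERE id = @id AND active = @active"
params = {"id": "user-123", "active": True}

# Every placeholder in the statement should have a binding.
missing = extract_named_at_params(sql) - params.keys()
print(sorted(missing))  # → []
```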
## Features

- **Full ACID Transactions:** Spanner provides global transactions with strong consistency
- **Interleaved Tables:** Physical co-location of parent-child rows for performance
- **Row-Level TTL:** Automatic row expiration via TTL policies
- **Session Pooling:** Built-in session pool management
- **UUID Handling:** Automatic UUID-to-bytes conversion
- **JSON Support:** Native JSON type handling

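The UUID-to-bytes conversion noted above can be pictured with the standard library (a conceptual illustration, not the adapter's internal code path):

```python
import uuid

# A UUID's canonical byte representation is 16 bytes, which maps
# naturally onto a Spanner BYTES column.
user_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
as_bytes = user_id.bytes

print(len(as_bytes))  # 16
print(uuid.UUID(bytes=as_bytes) == user_id)  # True: round-trips losslessly
```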
## Configuration

### Basic Usage

```python
from sqlspec.adapters.spanner import SpannerSyncConfig

config = SpannerSyncConfig(
    pool_config={
        "project": "my-project",
        "instance_id": "my-instance",
        "database_id": "my-database",
    }
)

# Read-only snapshot (default)
with config.provide_session() as session:
    result = session.select("SELECT * FROM users WHERE id = @id", {"id": "user-123"})

# Write-capable transaction
with config.provide_session(transaction=True) as session:
    session.execute("UPDATE users SET active = TRUE WHERE id = @id", {"id": "user-123"})
```

### With Emulator

For local development and testing, use the Spanner emulator:

```python
from google.auth.credentials import AnonymousCredentials

config = SpannerSyncConfig(
    pool_config={
        "project": "test-project",
        "instance_id": "test-instance",
        "database_id": "test-database",
        "credentials": AnonymousCredentials(),
        "client_options": {"api_endpoint": "localhost:9010"},
    }
)
```

### Session Pool Configuration

```python
from google.cloud.spanner_v1.pool import FixedSizePool, PingingPool

config = SpannerSyncConfig(
    pool_config={
        "project": "my-project",
        "instance_id": "my-instance",
        "database_id": "my-database",
        "pool_type": PingingPool,  # or FixedSizePool (default)
        "min_sessions": 5,
        "max_sessions": 20,
        "ping_interval": 300,  # seconds
    }
)
```

## Storage Bridge

The Spanner adapter supports the storage bridge for Arrow data import/export:

### Export to Storage

```python
# Export query results to Parquet
job = session.select_to_storage(
    "SELECT * FROM users WHERE active = @active",
    "gs://my-bucket/exports/users.parquet",
    {"active": True},
    format_hint="parquet",
)
print(f"Exported {job.telemetry['rows_processed']} rows")
```

### Load from Arrow

```python
import pyarrow as pa

# Create Arrow table
table = pa.table({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "score": [95, 87, 92],
})

# Load into Spanner table
job = session.load_from_arrow("scores", table, overwrite=True)
print(f"Loaded {job.telemetry['rows_processed']} rows")
```

### Load from Storage

```python
# Load from Parquet file
job = session.load_from_storage(
    "users",
    "gs://my-bucket/imports/users.parquet",
    file_format="parquet",
    overwrite=True,
)
```

## Interleaved Tables

Spanner supports interleaved tables for physically co-locating parent and child rows. The custom dialect supports this syntax:

```python
# DDL with INTERLEAVE clause (execute via database.update_ddl)
ddl = """
CREATE TABLE orders (
    customer_id STRING(36) NOT NULL,
    order_id STRING(36) NOT NULL,
    total NUMERIC,
    created_at TIMESTAMP
) PRIMARY KEY (customer_id, order_id),
INTERLEAVE IN PARENT customers ON DELETE CASCADE
"""
```

Interleaved tables provide:

- Automatic co-location of related data
- Efficient joins between parent and child tables
- Cascading deletes for data integrity

## TTL Policies (GoogleSQL)

Spanner supports row-level TTL (row deletion policy):

```python
# DDL with TTL policy
ddl = """
CREATE TABLE events (
    id STRING(36) NOT NULL,
    data JSON,
    created_at TIMESTAMP NOT NULL
) PRIMARY KEY (id),
ROW DELETION POLICY (OLDER_THAN(created_at, INTERVAL 30 DAY))
"""
```

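The `OLDER_THAN(created_at, INTERVAL 30 DAY)` policy makes a row eligible for background deletion once `created_at` is more than 30 days in the past. The equivalent eligibility check in plain Python (illustrative only — Spanner performs the actual deletion asynchronously):

```python
from datetime import datetime, timedelta, timezone

def is_expired(created_at: datetime, days: int = 30) -> bool:
    """Mirror the OLDER_THAN(created_at, INTERVAL <days> DAY) condition."""
    return datetime.now(timezone.utc) - created_at > timedelta(days=days)

now = datetime.now(timezone.utc)
print(is_expired(now - timedelta(days=31)))  # True - eligible for deletion
print(is_expired(now - timedelta(days=1)))   # False - still retained
```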
## Litestar Integration

Use the Spanner session store for Litestar applications:

```python
from litestar import Litestar
from litestar.middleware.session.server_side import ServerSideSessionConfig
from sqlspec.adapters.spanner import SpannerSyncConfig
from sqlspec.adapters.spanner.litestar import SpannerSyncStore

config = SpannerSyncConfig(
    pool_config={
        "project": "my-project",
        "instance_id": "my-instance",
        "database_id": "my-database",
    },
    extension_config={
        "litestar": {
            "table_name": "sessions",
            "shard_count": 10,  # Optional sharding for high throughput
        }
    },
)

store = SpannerSyncStore(config)

# Create session table (run once during setup)
# await store.create_table()

session_config = ServerSideSessionConfig(store="sessions")
app = Litestar(
    middleware=[session_config.middleware],
    stores={"sessions": store},
)

# Writes use transaction-backed sessions; reads use snapshots by default.
```

### Session Store Features

- **Sharding:** Distribute sessions across shards for high write throughput
- **TTL Support:** Automatic session expiration via Spanner TTL
- **Commit Timestamps:** Automatic tracking of created_at/updated_at

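The `shard_count` option spreads session rows across synthetic shard values so concurrent writes land on different key ranges. A hash-based assignment like the one below is a common scheme (a sketch of the idea, not the store's actual implementation):

```python
import hashlib

def shard_for(session_id: str, shard_count: int = 10) -> int:
    """Derive a stable shard number from a session id."""
    digest = hashlib.sha256(session_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % shard_count

# Stable: the same id always maps to the same shard,
# while distinct ids scatter across [0, shard_count).
print(shard_for("session-abc") == shard_for("session-abc"))  # True
print(0 <= shard_for("session-xyz") < 10)                    # True
```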
## ADK Integration

Use the Spanner ADK store for session and event management:

```python
from sqlspec.adapters.spanner import SpannerSyncConfig
from sqlspec.adapters.spanner.adk import SpannerADKStore

config = SpannerSyncConfig(
    pool_config={
        "project": "my-project",
        "instance_id": "my-instance",
        "database_id": "my-database",
    },
    extension_config={
        "adk": {
            "sessions_table": "adk_sessions",
            "events_table": "adk_events",
        }
    },
)

store = SpannerADKStore(config)

# Create tables (run once during setup)
# store.create_tables()

# Create session
session = store.create_session(app_name="my-app", user_id="user-123")

# Add event
store.add_event(session.id, {"type": "page_view", "path": "/home"})

# List events
events = store.list_events(session.id)
```

### ADK Store Features

- **Interleaved Events:** Events table interleaved with sessions for efficient queries
- **JSON State:** Session state stored as JSON for flexibility
- **Timestamp Tracking:** Automatic created_at/updated_at timestamps

## Common Issues

- **DDL Operations:** DDL statements (CREATE TABLE, ALTER TABLE, etc.) cannot be executed through the driver's `execute()` method. Use `database.update_ddl()` for DDL operations.

- **Mutation Limit:** Spanner has a 20,000 mutation limit per transaction. For bulk inserts, batch operations into multiple transactions.

- **Read-Only Snapshots:** The default session context uses read-only snapshots. For write operations, use `database.run_in_transaction()` or configure a transaction context.

- **Emulator Limitations:** The Spanner emulator doesn't support all features (e.g., some complex queries, backups). Test critical functionality against a real Spanner instance.

- **`google.api_core.exceptions.AlreadyExists`:** Resource already exists. Check if the table or index already exists before creating.

- **`google.api_core.exceptions.NotFound`:** Resource not found. Verify the instance, database, and table names are correct.

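For the mutation limit, a batch-size calculation like the one below keeps each transaction under the cap. Treating every written cell as one mutation is a conservative approximation, and the helper names are illustrative:

```python
MUTATION_LIMIT = 20_000

def rows_per_batch(column_count: int, limit: int = MUTATION_LIMIT) -> int:
    """Conservative rows-per-transaction bound: one mutation per cell."""
    return max(1, limit // column_count)

def batched(rows: list, column_count: int):
    """Split rows into chunks that stay under the mutation limit."""
    size = rows_per_batch(column_count)
    for start in range(0, len(rows), size):
        yield rows[start : start + size]

rows = [(i, f"name-{i}") for i in range(50_000)]
batches = list(batched(rows, column_count=2))
print(len(batches))  # 5 batches of up to 10,000 rows each
print(max(len(b) for b in batches) * 2 <= MUTATION_LIMIT)  # True
```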
## Best Practices

1. **Use Interleaved Tables:** For parent-child relationships, interleave child tables with parents for performance.

2. **Avoid Hotspots:** Use UUIDs or other distributed keys for primary keys to avoid write hotspots.

3. **Batch Writes:** Group multiple writes into single transactions when possible, staying under the 20k mutation limit.

4. **Use TTL:** For temporary data (sessions, events), configure TTL policies for automatic cleanup.

5. **Session Pooling:** Configure session pool size based on your application's concurrency needs.
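On point 2, random UUIDv4 keys scatter writes across the keyspace, whereas monotonically increasing keys concentrate inserts at the end of one key range. A quick illustrative comparison:

```python
import uuid

# Monotonic keys cluster at the end of the key range (hotspot-prone):
sequential = [f"{i:012d}" for i in range(3)]

# Random UUIDv4 keys scatter across the keyspace:
distributed = [str(uuid.uuid4()) for _ in range(3)]

print(sequential[0] < sequential[1] < sequential[2])  # True: strictly ordered
print(len(set(distributed)))                          # 3 unique, unordered keys
```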

docs/reference/adapters.rst

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ Available adapters:
 - **DuckDB**: duckdb
 - **Oracle**: oracledb
 - **BigQuery**: bigquery
+- **Spanner**: spanner
 - **Cross-Database**: ADBC (Arrow Database Connectivity)

 Each adapter implementation includes:

sqlspec/adapters/spanner/__init__.py

Lines changed: 4 additions & 2 deletions
@@ -2,21 +2,23 @@
 
 from sqlglot.dialects.dialect import Dialect
 
+from sqlspec.adapters.spanner import dialect
 from sqlspec.adapters.spanner.config import (
     SpannerConnectionParams,
     SpannerDriverFeatures,
     SpannerPoolParams,
     SpannerSyncConfig,
 )
-from sqlspec.adapters.spanner.dialect import Spanner
 from sqlspec.adapters.spanner.driver import SpannerSyncDriver
 
-Dialect.classes["spanner"] = Spanner
+Dialect.classes["spanner"] = dialect.Spanner
+Dialect.classes["spangres"] = dialect.Spangres
 
 __all__ = (
     "SpannerConnectionParams",
     "SpannerDriverFeatures",
     "SpannerPoolParams",
     "SpannerSyncConfig",
     "SpannerSyncDriver",
+    "dialect",
 )

sqlspec/adapters/spanner/config.py

Lines changed: 25 additions & 6 deletions
@@ -61,6 +61,11 @@ class SpannerSyncConfig(SyncDatabaseConfig["SpannerConnection", "AbstractSession
     driver_type: ClassVar[type["SpannerSyncDriver"]] = SpannerSyncDriver
     connection_type: ClassVar[type["SpannerConnection"]] = cast("type[SpannerConnection]", SpannerConnection)
     supports_transactional_ddl: ClassVar[bool] = False
+    supports_native_arrow_export: ClassVar[bool] = True
+    supports_native_arrow_import: ClassVar[bool] = True
+    supports_native_parquet_export: ClassVar[bool] = False
+    supports_native_parquet_import: ClassVar[bool] = False
+    requires_staging_for_load: ClassVar[bool] = False
 
     def __init__(
         self,
@@ -165,24 +170,38 @@ def _close_pool(self) -> None:
         cast("Any", self.pool_instance).close()
 
     @contextmanager
-    def provide_connection(self, *args: Any, **kwargs: Any) -> Generator[SpannerConnection, None, None]:
-        """Yield a Snapshot or Transaction from the configured pool."""
+    def provide_connection(
+        self, *args: Any, transaction: "bool" = False, **kwargs: Any
+    ) -> Generator[SpannerConnection, None, None]:
+        """Yield a Snapshot (default) or Transaction from the configured pool."""
         database = self.get_database()
-        with cast("Any", database).snapshot() as snapshot:
-            yield cast("SpannerConnection", snapshot)
+        if transaction:
+            with cast("Any", database).transaction() as txn:  # type: ignore[no-untyped-call]
+                yield cast("SpannerConnection", txn)
+        else:
+            with cast("Any", database).snapshot() as snapshot:
+                yield cast("SpannerConnection", snapshot)
 
     @contextmanager
     def provide_session(
-        self, *args: Any, statement_config: "StatementConfig | None" = None, **kwargs: Any
+        self, *args: Any, statement_config: "StatementConfig | None" = None, transaction: "bool" = False, **kwargs: Any
     ) -> Generator[SpannerSyncDriver, None, None]:
-        with self.provide_connection(*args, **kwargs) as connection:
+        with self.provide_connection(*args, transaction=transaction, **kwargs) as connection:
             driver = self.driver_type(
                 connection=connection,
                 statement_config=statement_config or self.statement_config,
                 driver_features=self.driver_features,
             )
             yield self._prepare_driver(driver)
 
+    @contextmanager
+    def provide_write_session(
+        self, *args: Any, statement_config: "StatementConfig | None" = None, **kwargs: Any
+    ) -> Generator[SpannerSyncDriver, None, None]:
+        """Convenience wrapper that always yields a write-capable transaction session."""
+        with self.provide_session(*args, statement_config=statement_config, transaction=True, **kwargs) as driver:
+            yield driver
+
     def get_signature_namespace(self) -> dict[str, Any]:
         namespace = super().get_signature_namespace()
         namespace.update({
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+"""Spanner dialect submodule."""
+
+from sqlspec.adapters.spanner.dialect._spanner import Spanner
+from sqlspec.adapters.spanner.dialect._spangres import Spangres
+
+__all__ = ("Spanner", "Spangres")
