Commit 27e7585

Merge pull request #1 from tylerbarker/duckdb
Add support for DuckDB
2 parents efe674d + 437f004 commit 27e7585

16 files changed, +2611 -47 lines changed

CHANGELOG.md

Lines changed: 25 additions & 0 deletions
@@ -1,5 +1,30 @@
# Changelog

## Unreleased

### Added

- **DuckDB support** via `duckdbex` driver
  - `SqlKit.DuckDB.connect/2` and `disconnect/1` for direct connections
  - `SqlKit.DuckDB.Pool` - NimblePool-based connection pool with supervision
  - File-based SQL support via `:backend` option (`backend: {:duckdb, pool: PoolName}`)
  - Automatic hugeint to integer conversion
  - PostgreSQL-style `$1, $2, ...` parameter placeholders

- **Prepared statement caching** for DuckDB pools
  - Automatic caching of prepared statements per connection
  - Configurable via `:cache` option (default: true)

- **Streaming support** for DuckDB large result sets
  - `SqlKit.DuckDB.stream!/3` and `stream_with_columns!/3` for direct connections
  - `SqlKit.DuckDB.Pool.with_stream!/5` and `with_stream_and_columns!/6` for pools
  - `with_stream!/3` and `with_stream_and_columns!/4` for file-based SQL modules

- **Pool tuning options**
  - `:timeout` option for checkout operations (default: 5000ms)
  - Lazy connection initialization
  - Documented pool behavior and configuration

## 0.1.0

- Initial release

CLAUDE.md

Lines changed: 263 additions & 0 deletions
@@ -0,0 +1,263 @@
# SqlKit

An Elixir library for executing raw SQL with automatic result transformation.

## Overview

SqlKit provides two ways to execute raw SQL with results automatically transformed into maps or structs:

1. **Direct SQL execution** - Standalone functions for executing SQL strings with any Ecto repo
2. **File-based SQL** - Macro-based approach for compile-time embedded SQL files

SQL files are embedded at compile-time for production (stored in module attributes) while reading from disk in dev/test for rapid iteration. Supports Postgrex, MyXQL, Exqlite, Tds, Ch (ClickHouse), and DuckDB.

## Project Structure

```
lib/
  sql_kit.ex              # Main module with `use SqlKit` macro + standalone query functions
  sql_kit/
    config.ex             # Runtime config (root_sql_dir, load_sql)
    helpers.ex            # Compile-time helpers (file_atom)
    exceptions.ex         # NoResultsError, MultipleResultsError
    query.ex              # Core query execution logic (shared by both APIs)
    duckdb.ex             # DuckDB connection management (conditional on duckdbex)
    duckdb/
      pool.ex             # NimblePool-based connection pool for DuckDB
test/
  sql_kit_test.exs        # Main tests covering all databases and both APIs
  sql_kit/
    helpers_test.exs      # Helpers module tests
    duckdb_test.exs       # DuckDB-specific tests
  support/
    repos.ex              # Test Ecto repos (Postgres, MySQL, SQLite, TDS, ClickHouse)
    sql/                  # Test SQL files per database
    data_case.ex          # Test case template
    test_setup.ex         # Database setup/teardown
    test_sql_modules.ex   # Test SqlKit modules
```

## Key Technical Decisions

- **Two APIs**: Standalone functions for direct SQL execution + macro-generated functions for file-based SQL
- **Shared execution logic**: Both APIs delegate to `SqlKit.Query` for consistent behavior
- **Compile-time embedding**: SQL files are read once at compile time and stored as module attributes with `persist: true`
- **Runtime file reads in dev/test**: Allows editing SQL without recompilation via `:load_sql` config
- **Direct driver support**: Pattern matches on result structs with `columns` and `rows` fields
- **Atom safety**: Use `String.to_existing_atom/1` for column names (requires struct fields to pre-exist)
- **Configurable**: `otp_app`, `repo`/`backend`, `dirname`, and `root_sql_dir` are configurable
- **Backend abstraction**: File-based SQL supports both Ecto repos (`:repo`) and DuckDB pools (`:backend`)
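
The compile-time embedding and runtime file-read decisions can be illustrated with a minimal sketch. This is not SqlKit's macro: `EmbedSketch`, the temp directory, and the `load!/2` signature with an explicit mode argument are all hypothetical, and the sketch bakes the SQL in with `unquote/1` rather than a persisted module attribute.

```elixir
# Hypothetical sketch, not SqlKit's implementation. Paths and names are illustrative.
sql_dir = Path.join(System.tmp_dir!(), "sql_kit_embed_sketch")
File.mkdir_p!(sql_dir)
File.write!(Path.join(sql_dir, "stats.sql"), "SELECT 1 AS total")

defmodule EmbedSketch do
  @sql_dir Path.join(System.tmp_dir!(), "sql_kit_embed_sketch")

  for file <- ["stats.sql"] do
    path = Path.join(@sql_dir, file)

    # Ask the compiler to recompile this module when the SQL file changes.
    @external_resource path

    # Read the file once, while the module body is being compiled.
    embedded = File.read!(path)

    # :compiled returns the string that was baked into the module.
    def load!(unquote(file), :compiled), do: unquote(embedded)

    # :dynamic re-reads the file on every call, so edits show up immediately.
    def load!(unquote(file), :dynamic), do: File.read!(unquote(path))
  end
end
```

After the module is compiled, editing `stats.sql` changes what the `:dynamic` clause returns but not what the `:compiled` clause returns, which is the trade-off the `:load_sql` setting is described as controlling.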

## Core API

### Standalone Functions (SqlKit module)

```elixir
# Execute SQL strings directly with any Ecto repo
SqlKit.query_all!(MyApp.Repo, "SELECT * FROM users WHERE age > $1", [21])
# => [%{id: 1, name: "Alice", age: 30}, ...]

SqlKit.query_one!(MyApp.Repo, "SELECT * FROM users WHERE id = $1", [1])
# => %{id: 1, name: "Alice"}

SqlKit.query_all!(MyApp.Repo, "SELECT * FROM users", [], as: User)
# => [%User{id: 1, name: "Alice"}, ...]

# Non-bang variants
SqlKit.query_all(repo, sql, params, opts)  # => {:ok, results} | {:error, reason}
SqlKit.query_one(repo, sql, params, opts)  # => {:ok, result | nil} | {:error, reason}

# Aliases for query_one
SqlKit.query!(repo, sql, params, opts)
SqlKit.query(repo, sql, params, opts)
```
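
The `{:ok, result | nil}` return shape of `query_one` suggests a three-way split on the number of rows. The sketch below is one way that split could look. `QueryOneSketch` is hypothetical, the `{:error, {:multiple_results, n}}` tuple is an assumed error shape, and `ArgumentError` stands in for the library's `NoResultsError` and `MultipleResultsError`, whose fields are not shown here.

```elixir
defmodule QueryOneSketch do
  # Hypothetical helper, not part of SqlKit. The error tuple shape is an assumption.

  # No rows: the non-bang variant reports success with nil.
  def one([]), do: {:ok, nil}

  # Exactly one row: return it.
  def one([row]), do: {:ok, row}

  # More than one row: treat as an error.
  def one(rows) when is_list(rows), do: {:error, {:multiple_results, length(rows)}}

  # Bang variant: raise instead of returning nil or an error tuple.
  def one!(rows) do
    case one(rows) do
      {:ok, nil} -> raise ArgumentError, "expected one row, got none"
      {:ok, row} -> row
      {:error, {:multiple_results, n}} -> raise ArgumentError, "expected one row, got #{n}"
    end
  end
end
```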

### File-Based Functions (generated by `use SqlKit`)

```elixir
# With Ecto repo
defmodule MyApp.Reports.SQL do
  use SqlKit,
    otp_app: :my_app,
    repo: MyApp.Repo,
    dirname: "reports",
    files: ["stats.sql", "activity.sql"]
end

# With DuckDB pool (use :backend instead of :repo)
defmodule MyApp.Analytics.SQL do
  use SqlKit,
    otp_app: :my_app,
    backend: {:duckdb, pool: MyApp.AnalyticsPool},
    dirname: "analytics",
    files: ["daily_summary.sql"]
end

# Usage (same API for both)
MyApp.Reports.SQL.query!("stats.sql", [id])                       # single row (alias for query_one!)
MyApp.Reports.SQL.query_one!("stats.sql", [id])                   # single row
MyApp.Reports.SQL.query_all!("activity.sql", [id], as: Activity)  # all rows as structs
MyApp.Reports.SQL.load!("stats.sql")                              # just get SQL string

# Non-bang variants return {:ok, result} | {:error, reason}
MyApp.Reports.SQL.query("stats.sql", [id])
MyApp.Reports.SQL.query_one("stats.sql", [id])
MyApp.Reports.SQL.query_all("activity.sql", [id])
MyApp.Reports.SQL.load("stats.sql")
```

### Utility Functions

```elixir
# Transform raw columns/rows into maps or structs (used internally, also public)
SqlKit.transform_rows(["id", "name"], [[1, "Alice"]], as: User)
# => [%User{id: 1, name: "Alice"}]

# Extract columns and rows from driver result
SqlKit.extract_result(result)
# => {["id", "name"], [[1, "Alice"]]}
```
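
For orientation, here is a minimal sketch of how a columns-and-rows transformation with `String.to_existing_atom/1` could work. It is an illustration under assumptions, not the body of `SqlKit.transform_rows`: `RowTransformSketch` and `SketchUser` are hypothetical names, and option handling is reduced to `:as`.

```elixir
# Hypothetical struct used only for this sketch. Defining it also creates
# the :id and :name atoms that String.to_existing_atom/1 relies on.
defmodule SketchUser do
  defstruct [:id, :name]
end

defmodule RowTransformSketch do
  # Hypothetical stand-in for a row transformer, not SqlKit's code.
  def transform_rows(columns, rows, opts \\ []) do
    # Raises ArgumentError for a column name with no existing atom,
    # which avoids creating atoms from arbitrary query output.
    keys = Enum.map(columns, &String.to_existing_atom/1)

    # Pair each row's values with the column keys.
    maps = Enum.map(rows, fn row -> keys |> Enum.zip(row) |> Map.new() end)

    case Keyword.get(opts, :as) do
      nil -> maps
      module -> Enum.map(maps, &struct!(module, &1))
    end
  end
end
```

Using `struct!/2` means a column that is not a field of the target struct raises instead of being dropped silently. Whether SqlKit makes the same choice is not stated here.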

## Supported Databases

| Database   | Ecto Adapter              | Driver   |
|------------|---------------------------|----------|
| PostgreSQL | Ecto.Adapters.Postgres    | Postgrex |
| MySQL      | Ecto.Adapters.MyXQL       | MyXQL    |
| SQLite     | Ecto.Adapters.SQLite3     | Exqlite  |
| SQL Server | Ecto.Adapters.Tds         | Tds      |
| ClickHouse | Ecto.Adapters.ClickHouse  | Ch       |
| DuckDB     | N/A (direct driver)       | Duckdbex |

### DuckDB Support

DuckDB is unique - it's not an Ecto adapter but a direct NIF driver. SqlKit provides first-class support:

```elixir
# Direct connection (BYO)
{:ok, conn} = SqlKit.DuckDB.connect(":memory:")
SqlKit.query_all!(conn, "SELECT * FROM users", [])
SqlKit.DuckDB.disconnect(conn)

# Pooled connection (recommended for production)
# Add to supervision tree:
{SqlKit.DuckDB.Pool, name: MyPool, database: "analytics.duckdb", pool_size: 4}

# With custom Duckdbex config:
{SqlKit.DuckDB.Pool, name: MyPool, database: "analytics.duckdb", pool_size: 4,
  config: %Duckdbex.Config{threads: 4}}

# Then use the pool:
{:ok, pool} = SqlKit.DuckDB.Pool.start_link(name: MyPool, database: ":memory:")
SqlKit.query_all!(pool, "SELECT * FROM events", [])

# File-based SQL with DuckDB (use :backend instead of :repo)
defmodule MyApp.Analytics.SQL do
  use SqlKit,
    otp_app: :my_app,
    backend: {:duckdb, pool: MyApp.AnalyticsPool},
    dirname: "analytics",
    files: ["daily_summary.sql"]
end

MyApp.Analytics.SQL.query_all!("daily_summary.sql", [~D[2024-01-01]])
```

Key differences from Ecto-based databases:
- Uses PostgreSQL-style `$1, $2, ...` parameter placeholders
- In-memory database: use `":memory:"` string
- Pool uses NimblePool (connections share one database instance)
- Hugeint values auto-converted to Elixir integers
- Extensions loaded via SQL: `INSTALL 'parquet'; LOAD 'parquet';`
- File-based SQL uses `:backend` option instead of `:repo`
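
As a rough illustration of the hugeint conversion mentioned above, the sketch below assumes the driver surfaces a 128-bit HUGEINT as an `{upper, lower}` tuple, with `upper` a signed 64-bit value and `lower` an unsigned 64-bit value. That representation and the `HugeintSketch` module are assumptions for illustration; the conversion SqlKit actually performs is not shown here.

```elixir
defmodule HugeintSketch do
  import Bitwise

  # Assumed representation: {upper, lower}, where upper is the signed high
  # 64 bits and lower is the unsigned low 64 bits of a 128-bit integer.
  def to_integer({upper, lower}) when is_integer(upper) and is_integer(lower) do
    # Shift the high half into place and add the low half.
    # Elixir integers are arbitrary precision, so nothing overflows.
    (upper <<< 64) + lower
  end
end
```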

### DuckDB Pool Features

**Prepared Statement Caching**: Pool queries automatically cache prepared statements per connection. Repeated queries with the same SQL skip the prepare step.

```elixir
# Caching is enabled by default
SqlKit.query_all!(pool, "SELECT * FROM events WHERE id = $1", [1])
SqlKit.query_all!(pool, "SELECT * FROM events WHERE id = $1", [2])  # uses cached statement

# Disable caching for specific queries
SqlKit.DuckDB.Pool.query!(pool, sql, params, cache: false)
```
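
A per-connection statement cache can be pictured as a map keyed by the SQL text. The sketch below is a simplified model under that assumption. `StatementCacheSketch` and the `prepare_fun` callback are hypothetical, and the pool's real bookkeeping may differ.

```elixir
defmodule StatementCacheSketch do
  # Hypothetical model of a per-connection cache: %{sql_text => prepared_statement}.
  # Returns {statement, updated_cache, :hit | :miss}.
  def fetch(cache, sql, prepare_fun) when is_map(cache) and is_function(prepare_fun, 1) do
    case cache do
      %{^sql => statement} ->
        # Same SQL text seen before on this connection: skip the prepare step.
        {statement, cache, :hit}

      _ ->
        # First time: prepare once and remember the result.
        statement = prepare_fun.(sql)
        {statement, Map.put(cache, sql, statement), :miss}
    end
  end
end
```

Because the key is the SQL string, two queries that differ only in their bound parameters share one cached statement, which matches the `$1` example above.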

**Streaming Large Results**: For memory-efficient processing of large result sets:

```elixir
# Direct connection streaming
conn
|> SqlKit.DuckDB.stream!("SELECT * FROM large_table", [])
|> Stream.flat_map(& &1)
|> Enum.take(100)

# With column names
{columns, stream} = SqlKit.DuckDB.stream_with_columns!(conn, sql, [])

# Pool streaming (callback-based to manage connection lifecycle)
SqlKit.DuckDB.Pool.with_stream!(pool, "SELECT * FROM events", [], fn stream ->
  stream |> Stream.flat_map(& &1) |> Enum.count()
end)

# File-based SQL streaming (DuckDB backends only)
MyApp.Analytics.SQL.with_stream!("large_query.sql", [], fn stream ->
  stream |> Stream.flat_map(& &1) |> Enum.take(1000)
end)
```
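
The `Stream.flat_map(& &1)` calls above imply that each stream element is a chunk (a list of rows) rather than a single row. The sketch below simulates that shape with plain lists so the flattening step can be tried without a database. `ChunkStreamSketch` and the chunk size are hypothetical.

```elixir
defmodule ChunkStreamSketch do
  # Hypothetical stand-in for a chunked result stream: lazily emits lists of
  # rows, chunk_size rows at a time, until the source list is exhausted.
  def chunked(rows, chunk_size) when is_list(rows) and chunk_size > 0 do
    Stream.unfold(rows, fn
      [] -> nil
      remaining -> Enum.split(remaining, chunk_size)
    end)
  end
end

rows = for i <- 1..5, do: [i, "event_#{i}"]

# Flatten chunks into rows, then take only what is needed. Because the
# stream is lazy, chunks after the third row's chunk are never produced.
first_three =
  rows
  |> ChunkStreamSketch.chunked(2)
  |> Stream.flat_map(& &1)
  |> Enum.take(3)
```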

**Pool Timeout**: All pool operations accept a `:timeout` option (default: 5000ms):

```elixir
SqlKit.DuckDB.Pool.query!(pool, sql, params, timeout: 10_000)
SqlKit.DuckDB.Pool.checkout!(pool, fn conn -> ... end, timeout: 10_000)
```

## Configuration

Users configure in their app's config:

```elixir
# config/config.exs
config :my_app, SqlKit,
  root_sql_dir: "priv/repo/sql"  # default

# config/dev.exs and config/test.exs
config :my_app, SqlKit,
  load_sql: :dynamic  # read from disk at runtime

# config/prod.exs (or rely on default)
config :my_app, SqlKit,
  load_sql: :compiled  # use compile-time embedded SQL (default)
```

## Commands

```bash
mix check           # Run all checks (format, compile, dialyzer, credo, sobelow, test)
mix test            # Run tests (requires all databases running via Docker)
mix format          # Format code
mix credo           # Linting
mix dialyzer        # Type checking
mix sobelow         # Security analysis
```

## Dependencies

Runtime:
- ecto_sql ~> 3.0
- nimble_pool ~> 1.1
- postgrex ~> 0.19 (optional)
- myxql ~> 0.7 (optional)
- ecto_sqlite3 ~> 0.18 (optional)
- tds ~> 2.3 (optional)
- ecto_ch ~> 0.7 (optional)
- duckdbex ~> 0.3.19 (optional)

Dev/Test:
- ex_check ~> 0.16
- credo ~> 1.7
- dialyxir ~> 1.4
- sobelow ~> 0.14
- styler ~> 1.10
