25 changes: 25 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,30 @@
# Changelog

## Unreleased

### Added

- **DuckDB support** via `duckdbex` driver
- `SqlKit.DuckDB.connect/2` and `disconnect/1` for direct connections
- `SqlKit.DuckDB.Pool` - NimblePool-based connection pool with supervision
- File-based SQL support via `:backend` option (`backend: {:duckdb, pool: PoolName}`)
- Automatic hugeint to integer conversion
- PostgreSQL-style `$1, $2, ...` parameter placeholders

- **Prepared statement caching** for DuckDB pools
- Automatic caching of prepared statements per connection
- Configurable via `:cache` option (default: true)

- **Streaming support** for DuckDB large result sets
- `SqlKit.DuckDB.stream!/3` and `stream_with_columns!/3` for direct connections
- `SqlKit.DuckDB.Pool.with_stream!/5` and `with_stream_and_columns!/6` for pools
- `with_stream!/3` and `with_stream_and_columns!/4` for file-based SQL modules

- **Pool tuning options**
- `:timeout` option for checkout operations (default: 5000ms)
- Lazy connection initialization
- Documented pool behavior and configuration

## 0.1.0

- Initial release
263 changes: 263 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,263 @@
# SqlKit

An Elixir library for executing raw SQL with automatic result transformation.

## Overview

SqlKit provides two ways to execute raw SQL with results automatically transformed into maps or structs:

1. **Direct SQL execution** - Standalone functions for executing SQL strings with any Ecto repo
2. **File-based SQL** - Macro-based approach for compile-time embedded SQL files

SQL files are embedded at compile time for production (stored in module attributes), while in dev/test they are read from disk for rapid iteration. Supports Postgrex, MyXQL, Exqlite, Tds, Ch (ClickHouse), and DuckDB.

## Project Structure

```
lib/
sql_kit.ex # Main module with `use SqlKit` macro + standalone query functions
sql_kit/
config.ex # Runtime config (root_sql_dir, load_sql)
helpers.ex # Compile-time helpers (file_atom)
exceptions.ex # NoResultsError, MultipleResultsError
query.ex # Core query execution logic (shared by both APIs)
duckdb.ex # DuckDB connection management (conditional on duckdbex)
duckdb/
pool.ex # NimblePool-based connection pool for DuckDB
test/
sql_kit_test.exs # Main tests covering all databases and both APIs
sql_kit/
helpers_test.exs # Helpers module tests
duckdb_test.exs # DuckDB-specific tests
support/
repos.ex # Test Ecto repos (Postgres, MySQL, SQLite, TDS, ClickHouse)
sql/ # Test SQL files per database
data_case.ex # Test case template
test_setup.ex # Database setup/teardown
test_sql_modules.ex # Test SqlKit modules
```

## Key Technical Decisions

- **Two APIs**: Standalone functions for direct SQL execution + macro-generated functions for file-based SQL
- **Shared execution logic**: Both APIs delegate to `SqlKit.Query` for consistent behavior
- **Compile-time embedding**: SQL files are read once at compile time and stored as module attributes with `persist: true`
- **Runtime file reads in dev/test**: Allows editing SQL without recompilation via `:load_sql` config
- **Direct driver support**: Pattern matches on result structs with `columns` and `rows` fields
- **Atom safety**: Use `String.to_existing_atom/1` for column names (requires struct fields to pre-exist)
- **Configurable**: `otp_app`, `repo`/`backend`, `dirname`, and `root_sql_dir` are configurable
- **Backend abstraction**: File-based SQL supports both Ecto repos (`:repo`) and DuckDB pools (`:backend`)

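The compile-time embedding decision above can be sketched as a small macro. This is an illustrative sketch of the general technique, not SqlKit's actual internals; `@external_resource` is the standard Elixir mechanism for recompiling a module when an external file changes:

```elixir
defmodule EmbedSketch do
  # Hypothetical sketch of compile-time SQL embedding (not SqlKit's real code).
  defmacro embed_sql(path) do
    quote bind_quoted: [path: path] do
      # Recompile this module whenever the SQL file changes on disk.
      @external_resource path

      # Read once at compile time; `persist: true` keeps the attribute
      # available in the compiled .beam at runtime.
      Module.register_attribute(__MODULE__, :embedded_sql, persist: true)
      @embedded_sql {path, File.read!(path)}
    end
  end
end
```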
## Core API

### Standalone Functions (SqlKit module)

```elixir
# Execute SQL strings directly with any Ecto repo
SqlKit.query_all!(MyApp.Repo, "SELECT * FROM users WHERE age > $1", [21])
# => [%{id: 1, name: "Alice", age: 30}, ...]

SqlKit.query_one!(MyApp.Repo, "SELECT * FROM users WHERE id = $1", [1])
# => %{id: 1, name: "Alice"}

SqlKit.query_all!(MyApp.Repo, "SELECT * FROM users", [], as: User)
# => [%User{id: 1, name: "Alice"}, ...]

# Non-bang variants
SqlKit.query_all(repo, sql, params, opts) # => {:ok, results} | {:error, reason}
SqlKit.query_one(repo, sql, params, opts) # => {:ok, result | nil} | {:error, reason}

# Aliases for query_one
SqlKit.query!(repo, sql, params, opts)
SqlKit.query(repo, sql, params, opts)
```

### File-Based Functions (generated by `use SqlKit`)

```elixir
# With Ecto repo
defmodule MyApp.Reports.SQL do
use SqlKit,
otp_app: :my_app,
repo: MyApp.Repo,
dirname: "reports",
files: ["stats.sql", "activity.sql"]
end

# With DuckDB pool (use :backend instead of :repo)
defmodule MyApp.Analytics.SQL do
use SqlKit,
otp_app: :my_app,
backend: {:duckdb, pool: MyApp.AnalyticsPool},
dirname: "analytics",
files: ["daily_summary.sql"]
end

# Usage (same API for both)
MyApp.Reports.SQL.query!("stats.sql", [id]) # single row (alias for query_one!)
MyApp.Reports.SQL.query_one!("stats.sql", [id]) # single row
MyApp.Reports.SQL.query_all!("activity.sql", [id], as: Activity) # all rows as structs
MyApp.Reports.SQL.load!("stats.sql") # just get SQL string

# Non-bang variants return {:ok, result} | {:error, reason}
MyApp.Reports.SQL.query("stats.sql", [id])
MyApp.Reports.SQL.query_one("stats.sql", [id])
MyApp.Reports.SQL.query_all("activity.sql", [id])
MyApp.Reports.SQL.load("stats.sql")
```
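The SQL files themselves are plain text with positional placeholders. A hypothetical `priv/repo/sql/reports/stats.sql` (the path, table, and columns here are illustrative, not from the test suite) might look like:

```sql
-- reports/stats.sql: per-user report statistics (illustrative schema)
SELECT user_id,
       count(*) AS report_count,
       max(inserted_at) AS last_report
FROM reports
WHERE user_id = $1
GROUP BY user_id;
```

Placeholder syntax follows the underlying driver: `$1, $2, ...` for Postgrex and DuckDB, `?` for MyXQL and Exqlite.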

### Utility Functions

```elixir
# Transform raw columns/rows into maps or structs (used internally, also public)
SqlKit.transform_rows(["id", "name"], [[1, "Alice"]], as: User)
# => [%User{id: 1, name: "Alice"}]

# Extract columns and rows from driver result
SqlKit.extract_result(result)
# => {["id", "name"], [[1, "Alice"]]}
```

## Supported Databases

| Database | Ecto Adapter | Driver |
|------------|---------------------------|----------|
| PostgreSQL | Ecto.Adapters.Postgres | Postgrex |
| MySQL | Ecto.Adapters.MyXQL | MyXQL |
| SQLite | Ecto.Adapters.SQLite3 | Exqlite |
| SQL Server | Ecto.Adapters.Tds | Tds |
| ClickHouse | Ecto.Adapters.ClickHouse | Ch |
| DuckDB | N/A (direct driver) | Duckdbex |

### DuckDB Support

DuckDB is unique among the supported databases: it is accessed through the Duckdbex NIF driver directly rather than through an Ecto adapter. SqlKit provides first-class support:

```elixir
# Direct connection (BYO)
{:ok, conn} = SqlKit.DuckDB.connect(":memory:")
SqlKit.query_all!(conn, "SELECT * FROM users", [])
SqlKit.DuckDB.disconnect(conn)

# Pooled connection (recommended for production)
# Add to supervision tree:
{SqlKit.DuckDB.Pool, name: MyPool, database: "analytics.duckdb", pool_size: 4}

# With custom Duckdbex config:
{SqlKit.DuckDB.Pool, name: MyPool, database: "analytics.duckdb", pool_size: 4,
config: %Duckdbex.Config{threads: 4}}

# Then use the pool:
{:ok, pool} = SqlKit.DuckDB.Pool.start_link(name: MyPool, database: ":memory:")
SqlKit.query_all!(pool, "SELECT * FROM events", [])

# File-based SQL with DuckDB (use :backend instead of :repo)
defmodule MyApp.Analytics.SQL do
use SqlKit,
otp_app: :my_app,
backend: {:duckdb, pool: MyApp.AnalyticsPool},
dirname: "analytics",
files: ["daily_summary.sql"]
end

MyApp.Analytics.SQL.query_all!("daily_summary.sql", [~D[2024-01-01]])
```

Key differences from Ecto-based databases:
- Uses PostgreSQL-style `$1, $2, ...` parameter placeholders
- In-memory databases: pass the `":memory:"` string as the path
- Pool uses NimblePool (connections share one database instance)
- Hugeint values auto-converted to Elixir integers
- Extensions loaded via SQL: `INSTALL 'parquet'; LOAD 'parquet';`
- File-based SQL uses `:backend` option instead of `:repo`
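Putting a few of these differences together, here is a hedged sketch of a direct-connection session; the `events.parquet` path and the `id` column are hypothetical:

```elixir
{:ok, conn} = SqlKit.DuckDB.connect(":memory:")

# Extensions are loaded via plain SQL, not driver options.
SqlKit.query_all!(conn, "INSTALL 'parquet';", [])
SqlKit.query_all!(conn, "LOAD 'parquet';", [])

# PostgreSQL-style $1 placeholder; hugeint results arrive as Elixir integers.
SqlKit.query_all!(
  conn,
  "SELECT count(*) AS n FROM read_parquet('events.parquet') WHERE id > $1",
  [100]
)

SqlKit.DuckDB.disconnect(conn)
```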

### DuckDB Pool Features

**Prepared Statement Caching**: Pool queries automatically cache prepared statements per connection. Repeated queries with the same SQL skip the prepare step.

```elixir
# Caching is enabled by default
SqlKit.query_all!(pool, "SELECT * FROM events WHERE id = $1", [1])
SqlKit.query_all!(pool, "SELECT * FROM events WHERE id = $1", [2]) # uses cached statement

# Disable caching for specific queries
SqlKit.DuckDB.Pool.query!(pool, sql, params, cache: false)
```

**Streaming Large Results**: For memory-efficient processing of large result sets:

```elixir
# Direct connection streaming
conn
|> SqlKit.DuckDB.stream!("SELECT * FROM large_table", [])
|> Stream.flat_map(& &1)
|> Enum.take(100)

# With column names
{columns, stream} = SqlKit.DuckDB.stream_with_columns!(conn, sql, [])

# Pool streaming (callback-based to manage connection lifecycle)
SqlKit.DuckDB.Pool.with_stream!(pool, "SELECT * FROM events", [], fn stream ->
stream |> Stream.flat_map(& &1) |> Enum.count()
end)

# File-based SQL streaming (DuckDB backends only)
MyApp.Analytics.SQL.with_stream!("large_query.sql", [], fn stream ->
stream |> Stream.flat_map(& &1) |> Enum.take(1000)
end)
```

**Pool Timeout**: All pool operations accept a `:timeout` option (default: 5000ms):

```elixir
SqlKit.DuckDB.Pool.query!(pool, sql, params, timeout: 10_000)
SqlKit.DuckDB.Pool.checkout!(pool, fn conn -> ... end, timeout: 10_000)
```

## Configuration

Users configure in their app's config:

```elixir
# config/config.exs
config :my_app, SqlKit,
root_sql_dir: "priv/repo/sql" # default

# config/dev.exs and config/test.exs
config :my_app, SqlKit,
load_sql: :dynamic # read from disk at runtime

# config/prod.exs (or rely on default)
config :my_app, SqlKit,
load_sql: :compiled # use compile-time embedded SQL (default)
```

## Commands

```bash
mix check # Run all checks (format, compile, dialyzer, credo, sobelow, test)
mix test # Run tests (requires all databases running via Docker)
mix format # Format code
mix credo # Linting
mix dialyzer # Type checking
mix sobelow # Security analysis
```

## Dependencies

Runtime:
- ecto_sql ~> 3.0
- nimble_pool ~> 1.1
- postgrex ~> 0.19 (optional)
- myxql ~> 0.7 (optional)
- ecto_sqlite3 ~> 0.18 (optional)
- tds ~> 2.3 (optional)
- ecto_ch ~> 0.7 (optional)
- duckdbex ~> 0.3.19 (optional)

Dev/Test:
- ex_check ~> 0.16
- credo ~> 1.7
- dialyxir ~> 1.4
- sobelow ~> 0.14
- styler ~> 1.10
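
In a consuming app, the optional drivers are declared alongside the library itself. A hypothetical `mix.exs` fragment, assuming the package is published under the name `sql_kit` (the name is an assumption, not confirmed by this document):

```elixir
defp deps do
  [
    {:sql_kit, "~> 0.1"},   # package name assumed for illustration
    {:ecto_sql, "~> 3.0"},
    # Add only the drivers you need; all are optional in SqlKit itself.
    {:postgrex, "~> 0.19"},
    {:duckdbex, "~> 0.3.19"}
  ]
end
```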