
SqlKit

An Elixir library for executing raw SQL with automatic result transformation.

Overview

SqlKit provides two ways to execute raw SQL with results automatically transformed into maps or structs:

  1. Direct SQL execution - Standalone functions for executing SQL strings with any Ecto repo
  2. File-based SQL - Macro-based approach for compile-time embedded SQL files

SQL files are embedded at compile time for production (stored in module attributes), while in dev/test they are read from disk for rapid iteration. Supports Postgrex, MyXQL, Exqlite, Tds, Ch (ClickHouse), and DuckDB.
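The compile-time embedding idea can be pictured with a minimal sketch (assumed file path, attribute names, and module name for illustration; not SqlKit's actual internals):

```elixir
# Minimal sketch of compile-time SQL embedding (hypothetical names).
defmodule Embedding.Sketch do
  @sql_path "priv/repo/sql/reports/stats.sql"

  # Mark the file as a compile-time dependency so the module
  # recompiles whenever the SQL file changes.
  @external_resource @sql_path

  # Read once at compile time; the string is baked into the .beam file.
  @sql File.read!(@sql_path)

  def sql, do: @sql
end
```

In dev/test, a `load_sql: :dynamic` setting would bypass the embedded copy and re-read the file from disk on each call.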

Project Structure

lib/
  sql_kit.ex               # Main module with `use SqlKit` macro + standalone query functions
  sql_kit/
    config.ex              # Runtime config (root_sql_dir, load_sql)
    helpers.ex             # Compile-time helpers (file_atom)
    exceptions.ex          # NoResultsError, MultipleResultsError
    query.ex               # Core query execution logic (shared by both APIs)
    duckdb.ex              # DuckDB connection management (conditional on duckdbex)
    duckdb/
      pool.ex              # NimblePool-based connection pool for DuckDB
test/
  sql_kit_test.exs         # Main tests covering all databases and both APIs
  sql_kit/
    helpers_test.exs       # Helpers module tests
    duckdb_test.exs        # DuckDB-specific tests
  support/
    repos.ex               # Test Ecto repos (Postgres, MySQL, SQLite, TDS, ClickHouse)
    sql/                   # Test SQL files per database
    data_case.ex           # Test case template
    test_setup.ex          # Database setup/teardown
    test_sql_modules.ex    # Test SqlKit modules

Key Technical Decisions

  • Two APIs: Standalone functions for direct SQL execution + macro-generated functions for file-based SQL
  • Shared execution logic: Both APIs delegate to SqlKit.Query for consistent behavior
  • Compile-time embedding: SQL files are read once at compile time and stored as module attributes with persist: true
  • Runtime file reads in dev/test: Allows editing SQL without recompilation via :load_sql config
  • Direct driver support: Pattern matches on result structs with columns and rows fields
  • Atom safety: Use String.to_existing_atom/1 for column names (requires struct fields to pre-exist)
  • Configurable: otp_app, repo/backend, dirname, and root_sql_dir are configurable
  • Backend abstraction: File-based SQL supports both Ecto repos (:repo) and DuckDB pools (:backend)
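The atom-safety decision above can be sketched as follows (hypothetical helper, not SqlKit's source; `SqlKit.transform_rows/3` is the real public entry point):

```elixir
# Hedged sketch of safe row-to-map transformation using
# String.to_existing_atom/1 (hypothetical module, for illustration only).
defmodule Transform.Sketch do
  def rows_to_maps(columns, rows) do
    # to_existing_atom/1 raises unless the atom already exists (e.g. as a
    # struct field), so arbitrary column names cannot grow the atom table.
    keys = Enum.map(columns, &String.to_existing_atom/1)

    Enum.map(rows, fn row ->
      keys |> Enum.zip(row) |> Map.new()
    end)
  end
end
```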

Core API

Standalone Functions (SqlKit module)

# Execute SQL strings directly with any Ecto repo
SqlKit.query_all(MyApp.Repo, "SELECT * FROM users WHERE age > $1", [21])
# => [%{id: 1, name: "Alice", age: 30}, ...]

SqlKit.query_one!(MyApp.Repo, "SELECT * FROM users WHERE id = $1", [1])
# => %{id: 1, name: "Alice"}

SqlKit.query_all(MyApp.Repo, "SELECT * FROM users", [], as: User)
# => [%User{id: 1, name: "Alice"}, ...]

# query_all returns list directly, raises on errors (matches Ecto.Repo.all/2)
SqlKit.query_all(repo, sql, params, opts)  # => [results]

# query_one returns result or nil, raises on errors/multiple (matches Ecto.Repo.one/2)
SqlKit.query_one(repo, sql, params, opts)  # => result | nil

# Aliases: query!/4 for query_one!/4, query/4 for query_one/4
SqlKit.query!(repo, sql, params, opts)
SqlKit.query(repo, sql, params, opts)

File-Based Functions (generated by use SqlKit)

# With Ecto repo
defmodule MyApp.Reports.SQL do
  use SqlKit,
    otp_app: :my_app,
    repo: MyApp.Repo,
    dirname: "reports",
    files: ["stats.sql", "activity.sql"]
end

# With DuckDB pool (use :backend instead of :repo)
defmodule MyApp.Analytics.SQL do
  use SqlKit,
    otp_app: :my_app,
    backend: {:duckdb, pool: MyApp.AnalyticsPool},
    dirname: "analytics",
    files: ["daily_summary.sql"]
end

# Usage (same API for both)
MyApp.Reports.SQL.query!("stats.sql", [id])                # single row (alias for query_one!)
MyApp.Reports.SQL.query_one!("stats.sql", [id])            # single row (raises if no results)
MyApp.Reports.SQL.query_one("stats.sql", [id])             # single row or nil
MyApp.Reports.SQL.query_all("activity.sql", [id], as: Activity)  # all rows as structs
MyApp.Reports.SQL.load!("stats.sql")                       # just get SQL string

# Non-bang load returns {:ok, result} | {:error, reason}
MyApp.Reports.SQL.load("stats.sql")

Utility Functions

# Transform raw columns/rows into maps or structs (used internally, also public)
SqlKit.transform_rows(["id", "name"], [[1, "Alice"]], as: User)
# => [%User{id: 1, name: "Alice"}]

# Extract columns and rows from driver result
SqlKit.extract_result(result)
# => {["id", "name"], [[1, "Alice"]]}

Supported Databases

Database     Ecto Adapter               Driver
PostgreSQL   Ecto.Adapters.Postgres     Postgrex
MySQL        Ecto.Adapters.MyXQL        MyXQL
SQLite       Ecto.Adapters.SQLite3      Exqlite
SQL Server   Ecto.Adapters.Tds          Tds
ClickHouse   Ecto.Adapters.ClickHouse   Ch
DuckDB       N/A (direct driver)        Duckdbex

DuckDB Support

DuckDB is unique: it is not accessed through an Ecto adapter but through a direct NIF driver (Duckdbex). SqlKit provides first-class support:

# Direct connection (bring your own connection management)
{:ok, conn} = SqlKit.DuckDB.connect(":memory:")
SqlKit.query_all(conn, "SELECT * FROM users", [])
SqlKit.DuckDB.disconnect(conn)

# Pooled connection (recommended for production)
# Add to supervision tree:
{SqlKit.DuckDB.Pool, name: MyPool, database: "analytics.duckdb", pool_size: 4}

# With custom Duckdbex config:
{SqlKit.DuckDB.Pool, name: MyPool, database: "analytics.duckdb", pool_size: 4,
  config: %Duckdbex.Config{threads: 4}}

# Then use the pool:
{:ok, pool} = SqlKit.DuckDB.Pool.start_link(name: MyPool, database: ":memory:")
SqlKit.query_all(pool, "SELECT * FROM events", [])

# File-based SQL with DuckDB (use :backend instead of :repo)
defmodule MyApp.Analytics.SQL do
  use SqlKit,
    otp_app: :my_app,
    backend: {:duckdb, pool: MyApp.AnalyticsPool},
    dirname: "analytics",
    files: ["daily_summary.sql"]
end

MyApp.Analytics.SQL.query_all("daily_summary.sql", [~D[2024-01-01]])

Key differences from Ecto-based databases:

  • Uses PostgreSQL-style $1, $2, ... parameter placeholders
  • In-memory database: use ":memory:" string
  • Pool uses NimblePool (connections share one database instance)
  • Hugeint values auto-converted to Elixir integers
  • Extensions loaded via SQL: INSTALL 'parquet'; LOAD 'parquet';
  • File-based SQL uses :backend option instead of :repo
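Because extensions are installed and loaded through plain SQL, this can be done with the same query functions (hypothetical pool name; assumes the query functions accept statements that return no rows, and that the parquet extension is available):

```elixir
# Loading a DuckDB extension through ordinary SQL statements
# (hypothetical pool; illustrative only).
SqlKit.query_all(MyPool, "INSTALL 'parquet'", [])
SqlKit.query_all(MyPool, "LOAD 'parquet'", [])

# The extension's functions are then usable in subsequent queries.
SqlKit.query_all(MyPool, "SELECT * FROM read_parquet('events.parquet')", [])
```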

DuckDB Pool Features

Prepared Statement Caching: Pool queries automatically cache prepared statements per connection. Repeated queries with the same SQL skip the prepare step.

# Caching is enabled by default
SqlKit.query_all(pool, "SELECT * FROM events WHERE id = $1", [1])
SqlKit.query_all(pool, "SELECT * FROM events WHERE id = $1", [2])  # uses cached statement

# Disable caching for specific queries
SqlKit.DuckDB.Pool.query!(pool, sql, params, cache: false)

Streaming Large Results: For memory-efficient processing of large result sets:

# Direct connection streaming
conn
|> SqlKit.DuckDB.stream!("SELECT * FROM large_table", [])
|> Stream.flat_map(& &1)
|> Enum.take(100)

# With column names
{columns, stream} = SqlKit.DuckDB.stream_with_columns!(conn, sql, [])

# Pool streaming (callback-based to manage connection lifecycle)
SqlKit.DuckDB.Pool.with_stream!(pool, "SELECT * FROM events", [], fn stream ->
  stream |> Stream.flat_map(& &1) |> Enum.count()
end)

# File-based SQL streaming (DuckDB backends only)
MyApp.Analytics.SQL.with_stream!("large_query.sql", [], fn stream ->
  stream |> Stream.flat_map(& &1) |> Enum.take(1000)
end)

Pool Timeout: All pool operations accept a :timeout option (default: 5000ms):

SqlKit.DuckDB.Pool.query!(pool, sql, params, timeout: 10_000)
SqlKit.DuckDB.Pool.checkout!(pool, fn conn -> ... end, timeout: 10_000)

Configuration

Users configure SqlKit in their app's config:

# config/config.exs
config :my_app, SqlKit,
  root_sql_dir: "priv/repo/sql"  # default

# config/dev.exs and config/test.exs
config :my_app, SqlKit,
  load_sql: :dynamic  # read from disk at runtime

# config/prod.exs (or rely on default)
config :my_app, SqlKit,
  load_sql: :compiled  # use compile-time embedded SQL (default)
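A library consuming this kind of config typically looks it up at runtime roughly like the following (assumed keys and defaults based on the config shown above; not necessarily SqlKit's exact implementation):

```elixir
# Hedged sketch of reading app-scoped library config at runtime.
config = Application.get_env(:my_app, SqlKit, [])
root_sql_dir = Keyword.get(config, :root_sql_dir, "priv/repo/sql")
load_sql = Keyword.get(config, :load_sql, :compiled)
```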

Commands

mix check                   # Run all checks (format, compile, dialyzer, credo, sobelow, test)
mix test                    # Run tests (requires all databases running via Docker)
mix format                  # Format code
mix credo                   # Linting
mix dialyzer                # Type checking
mix sobelow                 # Security analysis

Dependencies

Runtime:

  • ecto_sql ~> 3.0
  • nimble_pool ~> 1.1
  • postgrex ~> 0.19 (optional)
  • myxql ~> 0.7 (optional)
  • ecto_sqlite3 ~> 0.18 (optional)
  • tds ~> 2.3 (optional)
  • ecto_ch ~> 0.7 (optional)
  • duckdbex ~> 0.3.19 (optional)

Dev/Test:

  • ex_check ~> 0.16
  • credo ~> 1.7
  • dialyxir ~> 1.4
  • sobelow ~> 0.14
  • styler ~> 1.10