Databend Loki Adapter

Databend Loki Adapter exposes a minimal Loki-compatible HTTP API. It parses LogQL queries from Grafana, converts them to Databend SQL, runs the statements, and returns Loki-formatted JSON responses.

Getting Started

export DATABEND_DSN="databend://user:pass@host:port/default"
databend-loki-adapter --table nginx_logs --schema-type flat

The adapter listens on --bind (default 0.0.0.0:3100) and exposes a minimal subset of the Loki HTTP surface area.

Configuration

Flag	Env	Default	Description
`--mode`	`ADAPTER_MODE`	`standalone`	`standalone` uses a fixed DSN/table, `proxy` pulls both from HTTP headers.
`--dsn`	`DATABEND_DSN`	required	Databend DSN with credentials (`proxy` mode expects it via header).
`--table`	`LOGS_TABLE`	`logs`	Target table. Use `db.table` or rely on the DSN default database.
`--bind`	`BIND_ADDR`	`0.0.0.0:3100`	HTTP bind address.
`--schema-type`	`SCHEMA_TYPE`	`loki`	`loki` (labels as VARIANT) or `flat` (wide table).
`--timestamp-column`	`TIMESTAMP_COLUMN`	auto-detect	Override the timestamp column name.
`--line-column`	`LINE_COLUMN`	auto-detect	Override the log line column name.
`--labels-column`	`LABELS_COLUMN`	auto-detect (loki only)	Override the labels column name.
`--max-metric-buckets`	`MAX_METRIC_BUCKETS`	`240`	Maximum bucket count per metric range query before clamping `step`.

Run Modes

databend-loki-adapter runs in two modes:

standalone (default): supply --dsn, --table, and schema overrides on the CLI/env. The adapter resolves the table once at startup and caches the schema for the entire process lifetime.
proxy: launch the server with --mode proxy and omit --dsn. Each HTTP request must pass the Databend DSN and schema definition in headers so multiple tenants/tables can share one adapter instance.

Proxy headers

Proxy mode expects two headers on every request:

Header	Purpose
`X-Databend-Dsn`	Databend DSN, e.g. `databend://user:pass@host:8000/default`.
`X-Databend-Schema`	JSON document that tells the adapter which table/columns to treat as Loki timestamp/labels/line.

X-Databend-Schema accepts the fields listed below. schema_type must be loki or flat. All column names are case-insensitive and must match the physical Databend table. Set table to either db.table or just the table name (the latter uses the DSN's default database).

Loki schema example

{
  "table": "default.logs",
  "schema_type": "loki",
  "timestamp_column": "timestamp",
  "line_column": "line",
  "labels_column": "labels"
}

Flat schema example

{
  "table": "analytics.nginx_logs",
  "schema_type": "flat",
  "timestamp_column": "timestamp",
  "line_column": "request",
  "label_columns": [
    { "name": "host" },
    { "name": "status", "numeric": true },
    { "name": "client" }
  ]
}

label_columns is required for flat schemas and each entry can mark numeric: true to treat the label as a numeric column for selectors.

Schema Support

The adapter inspects the table via system.columns during startup and then maps the physical layout into Loki's timestamp/line/label model. Two schema styles are supported. The SQL snippets below are reference starting points rather than strict requirements -- feel free to rename columns, tweak indexes, or add computed fields as long as the final table exposes the required timestamp/line/label information. Use the CLI overrides (--timestamp-column, --line-column, --labels-column) if your column names differ.

Loki schema

Use this schema when you already store labels as a serialized structure (VARIANT/MAP) alongside the log body. The adapter expects a timestamp column, a VARIANT/MAP column containing a JSON object of labels, and a string column for the log line or payload. Additional helper columns (hashes, shards, etc.) are ignored.

Recommended layout (adjust column names, clustering keys, and indexes to match your workload):

CREATE TABLE logs (
  `timestamp` TIMESTAMP NOT NULL,
  `labels` VARIANT NOT NULL,
  `line` STRING NOT NULL,
  `stream_hash` UInt64 NOT NULL AS (city64withseed(labels, 0)) STORED
) CLUSTER BY (to_start_of_hour(timestamp), stream_hash);

CREATE INVERTED INDEX logs_line_idx ON logs(line);

timestamp: log event timestamp.
labels: VARIANT/MAP storing serialized Loki labels.
line: raw log line.
stream_hash: computed hash of the label set; useful for clustering or fast equality filters on a stream.
CREATE INVERTED INDEX: defined separately as required by Databend's inverted-index syntax.

Extra optimizations (optional but recommended, mix and match as needed):

Flat schema

Use this schema when logs arrive in a wide table where each attribute is already a separate column. The adapter chooses the timestamp column, picks one string column for the log line (either auto-detected or provided via --line-column), and treats every other column as a label. The examples below illustrate common shapes; substitute your own column names and indexes.

CREATE TABLE nginx_logs (
  `agent` STRING,
  `client` STRING,
  `host` STRING,
  `path` STRING,
  `protocol` STRING,
  `refer` STRING,
  `request` STRING,
  `size` INT,
  `status` INT,
  `timestamp` TIMESTAMP NOT NULL
) CLUSTER BY (to_start_of_hour(timestamp), host, status);

CREATE TABLE kubernetes_logs (
  `message` STRING,
  `log_time` TIMESTAMP NOT NULL,
  `pod_name` STRING,
  `pod_namespace` STRING,
  `cluster_name` STRING
) CLUSTER BY (to_start_of_hour(log_time), cluster_name, pod_namespace, pod_name);

CREATE INVERTED INDEX k8s_message_idx ON kubernetes_logs(message);

Guidelines:

If the table does not have an obvious log-line column, pass --line-column (e.g., --line-column request for nginx_logs, or --line-column message for kubernetes_logs). The column may be nullable; the adapter will emit empty strings when needed.
Every other column automatically becomes a LogQL label. These columns hold the actual metadata you want to query (client, host, status, pod_name, pod_namespace, cluster_name, etc.). Use Databend's SQL to rename or cast fields if you need canonical label names.
```
CREATE INVERTED INDEX nginx_request_idx ON nginx_logs(request);
CREATE INVERTED INDEX k8s_message_idx ON kubernetes_logs(message);
```

Index guidance

Databend only requires manual management for inverted indexes. See the official docs for inverted indexes and the dedicated CREATE INVERTED INDEX and REFRESH INVERTED INDEX statements. Bloom filter style pruning for MAP/VARIANT columns is built in, so you do not need to create standalone bloom filter or minmax indexes. Remember to refresh a newly created inverted index so historical data becomes searchable, e.g.:

REFRESH INVERTED INDEX logs_line_idx ON logs;

Metadata lookup

The adapter validates table shape with:

SELECT name, data_type
FROM system.columns
WHERE database = '<database>'
  AND table = '<table>'
ORDER BY name;

Ensure the table matches one of the schemas above (including indexes) so Grafana can issue LogQL queries directly against Databend through this adapter.

HTTP API

All endpoints return Loki-compatible JSON responses and reuse the same error shape that Loki expects (status:error, errorType, error). Grafana can therefore talk to the adapter using the stock Loki data source without any proxy layers or plugins. Refer to the upstream Loki HTTP API reference for the detailed contract of each endpoint.

Endpoint	Description
`GET /loki/api/v1/query`	Instant query. Supports the same LogQL used by Grafana's Explore panel. An optional `time` parameter (nanoseconds) defaults to "now", and the adapter automatically looks back 5 minutes when computing SQL bounds.
`GET /loki/api/v1/query_range`	Range query. Accepts `start`/`end` (default past hour), `since` (relative duration), `limit`, `interval`, `step`, and `direction`. Log queries stream raw lines (`interval` down-samples entries, `direction` controls scan order); metric queries return Loki matrix results and require a `step` value (the adapter may clamp it to keep bucket counts bounded, default cap 240 buckets).
`GET /loki/api/v1/labels`	Lists known label keys for the selected schema. Optional `start`/`end` parameters (nanoseconds) fence the search window; unspecified values default to the last 5 minutes, matching Grafana's Explore defaults.
`GET /loki/api/v1/label/{label}/values`	Lists distinct values for a specific label key using the same optional `start`/`end` bounds as `/labels`. Works for both `loki` and `flat` schemas and automatically filters out empty strings.
`GET /loki/api/v1/index/stats`	Returns approximate `streams`, `chunks`, `entries`, and `bytes` counters for a selector over a `[start, end]` window. `chunks` are estimated via unique stream keys because Databend does not store Loki chunks.
`GET /loki/api/v1/tail`	WebSocket tail endpoint that streams live logs for a LogQL query; compatible with Grafana Explore and `logcli --tail`.

/query and /query_range share the same LogQL parser and SQL builder. Instant queries fall back to DEFAULT_LOOKBACK_NS (5 minutes) when no explicit window is supplied, while range queries default to [now - 1h, now] and also honor Loki's since helper to derive start. /loki/api/v1/query_range log queries fully implement Loki's direction (forward/backward) and interval parameters: the adapter scans in the requested direction, emits entries in that order, and down-samples each stream so successive log lines are at least interval apart starting from start. /labels and /label/{label}/values delegate to schema-aware metadata lookups: the loki schema uses map_keys/labels['key'] expressions, whereas the flat schema issues SELECT DISTINCT on the physical column and returns values in sorted order.

Tail streaming

/loki/api/v1/tail upgrades to a WebSocket connection and sends frames that match Loki's native shape ({"streams":[...],"dropped_entries":[]}). Supported query parameters:

query: required LogQL selector.
limit: max number of entries per batch (default 100, still subject to the global MAX_LIMIT).
start: initial cursor in nanoseconds, defaults to "one hour ago".
delay_for: optional delay (seconds) that lets slow writers catch up; defaults to 0 and cannot exceed 5.

The adapter keeps a cursor and duplicate fingerprints so new rows are streamed in chronological order without repeats. Grafana Explore, logcli --tail, or any WebSocket client can connect directly.

Metric queries

The adapter currently supports a narrow LogQL metric surface area:

Range functions: count_over_time and rate. The latter reports per-second values (COUNT / window_seconds).
Optional outer aggregations: sum, avg, min, max, count, each with by (...). without or other modifiers return errorType:bad_data.
Pipelines: only drop stages are honored (labels are removed after aggregation to match Loki semantics). Any other stage still results in errorType:bad_data.
/loki/api/v1/query_range metric calls must provide step. When the requested (end - start) / step would exceed the configured bucket cap (default 240, tweak via --max-metric-buckets), the adapter automatically increases the effective step to keep the SQL result size manageable; the adapter never fans out multiple queries or aggregates in memory.
/loki/api/v1/query metric calls reuse the same expressions but evaluate them over [time - range, time].

Both schema adapters (loki VARIANT labels and flat wide tables) translate the metric expression into one SQL statement that joins generated buckets with the raw rows via generate_series, so all aggregation happens inside Databend. Non-metric queries continue to stream raw logs.

LogQL template functions

line_format and label_format now ship with a lightweight template engine that supports field interpolation ({{ .message }}) plus the full set of Grafana Loki template string functions. Supported functions are listed below:

Function	Status	Notes
`__line__`, `__timestamp__`, `now`	✅	Expose the raw line, the row timestamp, and the adapter host's current time.
`date`, `toDate`, `toDateInZone`	✅	Go-style datetime formatting and parsing (supports IANA zones).
`duration`, `duration_seconds`	✅	Parse Go duration strings into seconds (positive/negative).
`unixEpoch`, `unixEpochMillis`, `unixEpochNanos`, `unixToTime`	✅	Unix timestamp helpers.
`alignLeft`, `alignRight`	✅	Align field contents to a fixed width.
`b64enc`, `b64dec`	✅	Base64 encode/decode a field or literal.
`bytes`	✅	Parses human-readable byte strings (e.g. `2 KB` → `2000`).
`default`	✅	Provides a fallback when a field is empty or missing.
`fromJson`	⚠️	Validates and normalizes JSON strings (advanced loops like `range` remain unsupported).
`indent`, `nindent`	✅	Indent multi-line strings.
`lower`, `upper`, `title`	✅	Case conversion helpers.
`repeat`	✅	String repetition helper.
`printf`	✅	Supports `%s`, `%d`, `%f`, width/precision flags.
`replace`, `substr`, `trunc`	✅	String replacement, slicing, and truncation.
`trim`, `trimAll`, `trimPrefix`, `trimSuffix`	✅	String trimming helpers.
`urlencode`, `urldecode`	✅	URL encoding/decoding.
`contains`, `eq`, `hasPrefix`, `hasSuffix`	✅	Logical helpers for comparisons.
`int`, `float64`	✅	Cast values to integers/floats.
`add`, `addf`, `sub`, `subf`, `mul`, `mulf`, `div`, `divf`, `mod`	✅	Integer and floating-point arithmetic.
`ceil`, `floor`, `round`	✅	Floating-point rounding helpers.
`max`, `min`, `maxf`, `minf`	✅	Extremum helpers for integers/floats.
`count`	✅	Count regex matches (`{{ line count "foo" }}`).
`regexReplaceAll`, `regexReplaceAllLiteral`	✅	Regex replacement helpers (literal and capture-aware).

fromJson currently only validates and re-serializes JSON strings because the template engine has no looping constructs yet. For advanced constructs (e.g., range), preprocess data upstream or continue to rely on Grafana/Loki-native features until control flow support arrives.

Logging

By default the adapter configures env_logger with databend_loki_adapter at info level and every other module at warn. This keeps the startup flow visible without flooding the console with dependency logs. To override the levels, set RUST_LOG just like any other env_logger application, e.g.:

export RUST_LOG=databend_loki_adapter=debug,databend_driver=info

Testing

Run the Rust test suite with cargo nextest run.

Name		Name	Last commit message	Last commit date
Latest commit History 37 Commits
.github/workflows		.github/workflows
scripts		scripts
src		src
.dockerignore		.dockerignore
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Databend Loki Adapter

Getting Started

Configuration

Run Modes

Proxy headers

Loki schema example

Flat schema example

Schema Support

Loki schema

Flat schema

Index guidance

Metadata lookup

HTTP API

Tail streaming

Metric queries

LogQL template functions

Logging

Testing

About

Uh oh!

Releases 6

Packages

Uh oh!

Languages

License

databendlabs/databend-loki-adapter

Folders and files

Latest commit

History

Repository files navigation

Databend Loki Adapter

Getting Started

Configuration

Run Modes

Proxy headers

Loki schema example

Flat schema example

Schema Support

Loki schema

Flat schema

Index guidance

Metadata lookup

HTTP API

Tail streaming

Metric queries

LogQL template functions

Logging

Testing

About

Resources

License

Code of conduct

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 6

Packages 0

Uh oh!

Languages

Packages