Databend Loki Adapter exposes a minimal Loki-compatible HTTP API. It parses LogQL queries from Grafana, converts them to Databend SQL, runs the statements, and returns Loki-formatted JSON responses.
```bash
export DATABEND_DSN="databend://user:pass@host:port/default"
databend-loki-adapter --table nginx_logs --schema-type flat
```

The adapter listens on `--bind` (default `0.0.0.0:3100`) and exposes a minimal subset of the Loki HTTP surface area.
| Flag | Env | Default | Description |
|---|---|---|---|
| `--mode` | `ADAPTER_MODE` | `standalone` | `standalone` uses a fixed DSN/table, `proxy` pulls both from HTTP headers. |
| `--dsn` | `DATABEND_DSN` | required | Databend DSN with credentials (proxy mode expects it via header). |
| `--table` | `LOGS_TABLE` | `logs` | Target table. Use `db.table` or rely on the DSN default database. |
| `--bind` | `BIND_ADDR` | `0.0.0.0:3100` | HTTP bind address. |
| `--schema-type` | `SCHEMA_TYPE` | `loki` | `loki` (labels as VARIANT) or `flat` (wide table). |
| `--timestamp-column` | `TIMESTAMP_COLUMN` | auto-detect | Override the timestamp column name. |
| `--line-column` | `LINE_COLUMN` | auto-detect | Override the log line column name. |
| `--labels-column` | `LABELS_COLUMN` | auto-detect (`loki` only) | Override the labels column name. |
| `--max-metric-buckets` | `MAX_METRIC_BUCKETS` | `240` | Maximum bucket count per metric range query before clamping step. |
`databend-loki-adapter` runs in two modes:

- `standalone` (default): supply `--dsn`, `--table`, and schema overrides on the CLI/env. The adapter resolves the table once at startup and caches the schema for the entire process lifetime.
- `proxy`: launch the server with `--mode proxy` and omit `--dsn`. Each HTTP request must pass the Databend DSN and schema definition in headers so multiple tenants/tables can share one adapter instance.
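For example, a shared proxy-mode instance needs nothing beyond a bind address, since the DSN and schema arrive with each request:

```bash
# Start one shared adapter; every request must carry its own DSN/schema headers.
databend-loki-adapter --mode proxy --bind 0.0.0.0:3100
```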
Proxy mode expects two headers on every request:

| Header | Purpose |
|---|---|
| `X-Databend-Dsn` | Databend DSN, e.g. `databend://user:pass@host:8000/default`. |
| `X-Databend-Schema` | JSON document that tells the adapter which table/columns to treat as Loki timestamp/labels/line. |
`X-Databend-Schema` accepts the fields listed below. `schema_type` must be `loki` or `flat`. All column names are case-insensitive and must match the physical Databend table. Set `table` to either `db.table` or just the table name (the latter uses the DSN's default database).
Example for the `loki` schema:

```json
{
  "table": "default.logs",
  "schema_type": "loki",
  "timestamp_column": "timestamp",
  "line_column": "line",
  "labels_column": "labels"
}
```

Example for the `flat` schema:

```json
{
  "table": "analytics.nginx_logs",
  "schema_type": "flat",
  "timestamp_column": "timestamp",
  "line_column": "request",
  "label_columns": [
    { "name": "host" },
    { "name": "status", "numeric": true },
    { "name": "client" }
  ]
}
```

`label_columns` is required for `flat` schemas, and each entry can mark `numeric: true` to treat the label as a numeric column for selectors.
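As an illustration (hypothetical host, credentials, and selector; adjust to your deployment), a proxy-mode request carries both headers alongside the normal Loki query parameters:

```bash
# Query through a shared proxy-mode adapter; the DSN and schema travel per request.
curl -G 'http://localhost:3100/loki/api/v1/query_range' \
  -H 'X-Databend-Dsn: databend://user:pass@databend-host:8000/default' \
  -H 'X-Databend-Schema: {"table":"default.logs","schema_type":"loki","timestamp_column":"timestamp","line_column":"line","labels_column":"labels"}' \
  --data-urlencode 'query={app="nginx"}' \
  --data-urlencode 'since=1h'
```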
The adapter inspects the table via `system.columns` during startup and then maps the physical layout into Loki's timestamp/line/label model. Two schema styles are supported. The SQL snippets below are reference starting points rather than strict requirements -- feel free to rename columns, tweak indexes, or add computed fields as long as the final table exposes the required timestamp/line/label information. Use the CLI overrides (`--timestamp-column`, `--line-column`, `--labels-column`) if your column names differ.
Use this schema when you already store labels as a serialized structure (VARIANT/MAP) alongside the log body. The adapter expects a timestamp column, a VARIANT/MAP column containing a JSON object of labels, and a string column for the log line or payload. Additional helper columns (hashes, shards, etc.) are ignored.
Recommended layout (adjust column names, clustering keys, and indexes to match your workload):
```sql
CREATE TABLE logs (
  `timestamp` TIMESTAMP NOT NULL,
  `labels` VARIANT NOT NULL,
  `line` STRING NOT NULL,
  `stream_hash` UInt64 NOT NULL AS (city64withseed(labels, 0)) STORED
) CLUSTER BY (to_start_of_hour(timestamp), stream_hash);

CREATE INVERTED INDEX logs_line_idx ON logs(line);
```

- `timestamp`: log event timestamp.
- `labels`: VARIANT/MAP storing serialized Loki labels.
- `line`: raw log line.
- `stream_hash`: computed hash of the label set; useful for clustering or fast equality filters on a stream.
- `CREATE INVERTED INDEX`: defined separately as required by Databend's inverted-index syntax.
Extra optimizations (optional but recommended, mix and match as needed):
Use this schema when logs arrive in a wide table where each attribute is already a separate column. The adapter chooses the timestamp column, picks one string column for the log line (either auto-detected or provided via --line-column), and treats every other column as a label. The examples below illustrate common shapes; substitute your own column names and indexes.
```sql
CREATE TABLE nginx_logs (
  `agent` STRING,
  `client` STRING,
  `host` STRING,
  `path` STRING,
  `protocol` STRING,
  `refer` STRING,
  `request` STRING,
  `size` INT,
  `status` INT,
  `timestamp` TIMESTAMP NOT NULL
) CLUSTER BY (to_start_of_hour(timestamp), host, status);
```

```sql
CREATE TABLE kubernetes_logs (
  `message` STRING,
  `log_time` TIMESTAMP NOT NULL,
  `pod_name` STRING,
  `pod_namespace` STRING,
  `cluster_name` STRING
) CLUSTER BY (to_start_of_hour(log_time), cluster_name, pod_namespace, pod_name);

CREATE INVERTED INDEX k8s_message_idx ON kubernetes_logs(message);
```

Guidelines:
- If the table does not have an obvious log-line column, pass `--line-column` (e.g., `--line-column request` for `nginx_logs`, or `--line-column message` for `kubernetes_logs`); see the launch example after this list. The column may be nullable; the adapter will emit empty strings when needed.
- Every other column automatically becomes a LogQL label. These columns hold the actual metadata you want to query (`client`, `host`, `status`, `pod_name`, `pod_namespace`, `cluster_name`, etc.). Use Databend's SQL to rename or cast fields if you need canonical label names.

```sql
CREATE INVERTED INDEX nginx_request_idx ON nginx_logs(request);
CREATE INVERTED INDEX k8s_message_idx ON kubernetes_logs(message);
```
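Putting those guidelines together, a standalone launch against the `nginx_logs` table from the example above might look like this (table and column names are illustrative; adapt them to your own schema):

```bash
export DATABEND_DSN="databend://user:pass@host:port/default"
databend-loki-adapter \
  --schema-type flat \
  --table nginx_logs \
  --timestamp-column timestamp \
  --line-column request
```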
Databend only requires manual management for inverted indexes. See the official docs for inverted indexes and the dedicated `CREATE INVERTED INDEX` and `REFRESH INVERTED INDEX` statements. Bloom-filter-style pruning for MAP/VARIANT columns is built in, so you do not need to create standalone bloom filter or minmax indexes. Remember to refresh a newly created inverted index so historical data becomes searchable, e.g.:

```sql
REFRESH INVERTED INDEX logs_line_idx ON logs;
```

The adapter validates table shape with:
```sql
SELECT name, data_type
FROM system.columns
WHERE database = '<database>'
  AND table = '<table>'
ORDER BY name;
```

Ensure the table matches one of the schemas above (including indexes) so Grafana can issue LogQL queries directly against Databend through this adapter.
All endpoints return Loki-compatible JSON responses and reuse the same error shape that Loki expects (`status: error`, `errorType`, `error`). Grafana can therefore talk to the adapter using the stock Loki data source without any proxy layers or plugins. Refer to the upstream Loki HTTP API reference for the detailed contract of each endpoint.
| Endpoint | Description |
|---|---|
| `GET /loki/api/v1/query` | Instant query. Supports the same LogQL used by Grafana's Explore panel. An optional `time` parameter (nanoseconds) defaults to "now", and the adapter automatically looks back 5 minutes when computing SQL bounds. |
| `GET /loki/api/v1/query_range` | Range query. Accepts `start`/`end` (default past hour), `since` (relative duration), `limit`, `interval`, `step`, and `direction`. Log queries stream raw lines (`interval` down-samples entries, `direction` controls scan order); metric queries return Loki matrix results and require a `step` value (the adapter may clamp it to keep bucket counts bounded, default cap 240 buckets). |
| `GET /loki/api/v1/labels` | Lists known label keys for the selected schema. Optional `start`/`end` parameters (nanoseconds) fence the search window; unspecified values default to the last 5 minutes, matching Grafana's Explore defaults. |
| `GET /loki/api/v1/label/{label}/values` | Lists distinct values for a specific label key using the same optional `start`/`end` bounds as `/labels`. Works for both `loki` and `flat` schemas and automatically filters out empty strings. |
| `GET /loki/api/v1/index/stats` | Returns approximate `streams`, `chunks`, `entries`, and `bytes` counters for a selector over a `[start, end]` window. `chunks` are estimated via unique stream keys because Databend does not store Loki chunks. |
| `GET /loki/api/v1/tail` | WebSocket tail endpoint that streams live logs for a LogQL query; compatible with Grafana Explore and `logcli --tail`. |
`/query` and `/query_range` share the same LogQL parser and SQL builder. Instant queries fall back to `DEFAULT_LOOKBACK_NS` (5 minutes) when no explicit window is supplied, while range queries default to `[now - 1h, now]` and also honor Loki's `since` helper to derive `start`. `/loki/api/v1/query_range` log queries fully implement Loki's `direction` (`forward`/`backward`) and `interval` parameters: the adapter scans in the requested direction, emits entries in that order, and down-samples each stream so successive log lines are at least `interval` apart starting from `start`. `/labels` and `/label/{label}/values` delegate to schema-aware metadata lookups: the `loki` schema uses `map_keys`/`labels['key']` expressions, whereas the `flat` schema issues `SELECT DISTINCT` on the physical column and returns values in sorted order.
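For reference, a minimal range query and label lookup from the command line could look like this (standalone mode, default bind address, illustrative selector):

```bash
# Stream the last hour of matching logs, newest first.
curl -G 'http://localhost:3100/loki/api/v1/query_range' \
  --data-urlencode 'query={host="web-01"} |= "error"' \
  --data-urlencode 'since=1h' \
  --data-urlencode 'limit=100' \
  --data-urlencode 'direction=backward'

# List known label keys over the default 5-minute window.
curl 'http://localhost:3100/loki/api/v1/labels'
```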
`/loki/api/v1/tail` upgrades to a WebSocket connection and sends frames that match Loki's native shape (`{"streams":[...],"dropped_entries":[]}`). Supported query parameters:

- `query`: required LogQL selector.
- `limit`: max number of entries per batch (default 100, still subject to the global `MAX_LIMIT`).
- `start`: initial cursor in nanoseconds, defaults to "one hour ago".
- `delay_for`: optional delay (seconds) that lets slow writers catch up; defaults to `0` and cannot exceed `5`.
The adapter keeps a cursor and duplicate fingerprints so new rows are streamed in chronological order without repeats. Grafana Explore, logcli --tail, or any WebSocket client can connect directly.
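As a quick check (assuming `logcli` is installed and the adapter is reachable on its default port; the selector is illustrative), tailing through the adapter could look like this:

```bash
# Follow new log lines for one stream via the adapter's /loki/api/v1/tail endpoint.
logcli --addr=http://localhost:3100 query --tail '{host="web-01"}'
```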
The adapter currently supports a narrow LogQL metric surface area:
- Range functions: `count_over_time` and `rate`. The latter reports per-second values (`COUNT / window_seconds`).
- Optional outer aggregations: `sum`, `avg`, `min`, `max`, `count`, each with `by (...)`. `without` or other modifiers return `errorType: bad_data`.
- Pipelines: only `drop` stages are honored (labels are removed after aggregation to match Loki semantics). Any other stage still results in `errorType: bad_data`.
- `/loki/api/v1/query_range` metric calls must provide `step`. When the requested `(end - start) / step` would exceed the configured bucket cap (default 240, tweak via `--max-metric-buckets`), the adapter automatically increases the effective step to keep the SQL result size manageable; the adapter never fans out multiple queries or aggregates in memory.
- `/loki/api/v1/query` metric calls reuse the same expressions but evaluate them over `[time - range, time]`.
Both schema adapters (loki VARIANT labels and flat wide tables) translate the metric expression into one SQL statement that joins generated buckets with the raw rows via generate_series, so all aggregation happens inside Databend. Non-metric queries continue to stream raw logs.
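A metric range query therefore always carries an explicit `step`; a minimal sketch (illustrative selector, label names, and window) might look like this:

```bash
# Per-host log counts over 5-minute windows, bucketed every 60 seconds.
curl -G 'http://localhost:3100/loki/api/v1/query_range' \
  --data-urlencode 'query=sum by (host) (count_over_time({host="web-01"}[5m]))' \
  --data-urlencode 'since=6h' \
  --data-urlencode 'step=60'
```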
`line_format` and `label_format` now ship with a lightweight template engine that supports field interpolation (`{{ .message }}`) plus the full set of Grafana Loki template string functions. Supported functions are listed below:
| Function | Status | Notes |
|---|---|---|
| `__line__`, `__timestamp__`, `now` | ✅ | Expose the raw line, the row timestamp, and the adapter host's current time. |
| `date`, `toDate`, `toDateInZone` | ✅ | Go-style datetime formatting and parsing (supports IANA zones). |
| `duration`, `duration_seconds` | ✅ | Parse Go duration strings into seconds (positive/negative). |
| `unixEpoch`, `unixEpochMillis`, `unixEpochNanos`, `unixToTime` | ✅ | Unix timestamp helpers. |
| `alignLeft`, `alignRight` | ✅ | Align field contents to a fixed width. |
| `b64enc`, `b64dec` | ✅ | Base64 encode/decode a field or literal. |
| `bytes` | ✅ | Parses human-readable byte strings (e.g. `2 KB` → `2000`). |
| `default` | ✅ | Provides a fallback when a field is empty or missing. |
| `fromJson` | | Validates and normalizes JSON strings (advanced loops like `range` remain unsupported). |
| `indent`, `nindent` | ✅ | Indent multi-line strings. |
| `lower`, `upper`, `title` | ✅ | Case conversion helpers. |
| `repeat` | ✅ | String repetition helper. |
| `printf` | ✅ | Supports `%s`, `%d`, `%f`, width/precision flags. |
| `replace`, `substr`, `trunc` | ✅ | String replacement, slicing, and truncation. |
| `trim`, `trimAll`, `trimPrefix`, `trimSuffix` | ✅ | String trimming helpers. |
| `urlencode`, `urldecode` | ✅ | URL encoding/decoding. |
| `contains`, `eq`, `hasPrefix`, `hasSuffix` | ✅ | Logical helpers for comparisons. |
| `int`, `float64` | ✅ | Cast values to integers/floats. |
| `add`, `addf`, `sub`, `subf`, `mul`, `mulf`, `div`, `divf`, `mod` | ✅ | Integer and floating-point arithmetic. |
| `ceil`, `floor`, `round` | ✅ | Floating-point rounding helpers. |
| `max`, `min`, `maxf`, `minf` | ✅ | Extremum helpers for integers/floats. |
| `count` | ✅ | Count regex matches (`{{ __line__ \| count "foo" }}`). |
| `regexReplaceAll`, `regexReplaceAllLiteral` | ✅ | Regex replacement helpers (literal and capture-aware). |
`fromJson` currently only validates and re-serializes JSON strings because the template engine has no looping constructs yet. For advanced constructs (e.g., `range`), preprocess data upstream or continue to rely on Grafana/Loki-native features until control flow support arrives.
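As a small illustration of the template engine (hypothetical selector and label names; adjust to your schema), a log query can rewrite each line with `line_format` before it is returned:

```bash
# Prefix each returned line with its status label and upper-case the raw line.
curl -G 'http://localhost:3100/loki/api/v1/query_range' \
  --data-urlencode 'query={host="web-01"} | line_format "{{ .status }} {{ __line__ | upper }}"' \
  --data-urlencode 'since=15m'
```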
By default the adapter configures `env_logger` with `databend_loki_adapter` at `info` level and every other module at `warn`. This keeps the startup flow visible without flooding the console with dependency logs. To override the levels, set `RUST_LOG` just like any other `env_logger` application, e.g.:

```bash
export RUST_LOG=databend_loki_adapter=debug,databend_driver=info
```

Run the Rust test suite with `cargo nextest run`.