|
| 1 | +# Tracing & Metrics Configuration Guide |
| 2 | + |
| 3 | +Oxen ships with structured logging (via the Rust `tracing` crate) and Prometheus-compatible |
| 4 | +metrics (via the `metrics` crate). Both are wired into the CLI (`oxen`) and server |
| 5 | +(`oxen-server`) binaries out of the box. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Overview |
| 10 | + |
| 11 | +| Concern | Crate | How it works | |
| 12 | +|---------|-------|-------------| |
| 13 | +| **Structured logging** | `tracing` + `tracing-subscriber` + `tracing-log` | A `tracing-log` bridge captures all legacy `log::` calls and routes them through `tracing`. Two subscriber layers are always registered: a human-readable **stderr** layer, and an optional **JSON file** layer. | |
| 14 | +| **Metrics** | `metrics` + `metrics-exporter-prometheus` | All counters and histograms use the `metrics::counter!` / `metrics::histogram!` macros. In the server a Prometheus HTTP exporter is installed automatically. In the CLI the default no-op recorder silently discards all metric calls (zero overhead). | |
| 15 | + |
| 16 | +There are ~580 metric instrumentation points across ~120 source files covering the CLI, |
| 17 | +lib (repositories, API client, merkle tree), and server controllers. |
| 18 | + |
| 19 | +--- |
| 20 | + |
| 21 | +## Environment Variables Reference |
| 22 | + |
| 23 | +| Variable | Default | Applies to | Description | |
| 24 | +|----------|---------|-----------|-------------| |
| 25 | +| `RUST_LOG` | `info` | CLI & Server | Filter directive for `tracing` (standard `EnvFilter` syntax). Controls what appears on stderr **and** in log files. | |
| 26 | +| `OXEN_LOG_FILE` | *(unset)* | CLI & Server | Opt-in to JSON file logging. Set to `1` / `true` for the default directory (`~/.oxen/logs/`), or set to a directory path. | |
| 27 | +| `OXEN_METRICS_PORT` | `9090` | Server only | TCP port for the Prometheus metrics HTTP endpoint (`/metrics`). | |
| 28 | + |
| 29 | +--- |
| 30 | + |
| 31 | +## Use Case 1: Stderr Only (Default) |
| 32 | + |
| 33 | +This is the zero-configuration default. All log output goes to stderr in a |
| 34 | +human-readable format. |
| 35 | + |
| 36 | +### CLI |
| 37 | + |
| 38 | +```bash |
| 39 | +# Run any oxen command — logs appear on stderr at `info` level |
| 40 | +oxen status |
| 41 | + |
| 42 | +# Increase verbosity to see debug-level output from liboxen |
| 43 | +RUST_LOG=debug oxen add . |
| 44 | + |
| 45 | +# Show only warnings and errors |
| 46 | +RUST_LOG=warn oxen push origin main |
| 47 | + |
| 48 | +# Target a specific module |
| 49 | +RUST_LOG="warn,liboxen::repositories::add=debug" oxen add . |
| 50 | +``` |
| 51 | + |
| 52 | +### Server |
| 53 | + |
| 54 | +```bash |
| 55 | +# Start the server — info-level logs on stderr by default |
| 56 | +oxen-server start |
| 57 | + |
| 58 | +# Debug-level logging for everything |
| 59 | +RUST_LOG=debug oxen-server start |
| 60 | + |
| 61 | +# Debug only for server controllers, warn for everything else |
| 62 | +RUST_LOG="warn,oxen_server::controllers=debug" oxen-server start |
| 63 | +``` |
| 64 | + |
| 65 | +### Sample stderr output |
| 66 | + |
| 67 | +``` |
| 68 | +2026-03-09T12:00:01.234Z INFO liboxen::repositories::add: staging files path="images/" |
| 69 | +2026-03-09T12:00:01.567Z DEBUG liboxen::core::v_latest::index::commit_merkle_tree: loading merkle tree commit_id="abc123" |
| 70 | +2026-03-09T12:00:02.890Z INFO liboxen::repositories::commits: committed message="add training data" |
| 71 | +``` |
| 72 | + |
| 73 | +### Filter syntax |
| 74 | + |
| 75 | +`RUST_LOG` uses the standard `EnvFilter` syntax: |
| 76 | + |
| 77 | +| Example | Meaning | |
| 78 | +|---------|---------| |
| 79 | +| `debug` | Everything at debug and above | |
| 80 | +| `warn,liboxen=debug` | Default warn, but debug for `liboxen` | |
| 81 | +| `info,liboxen::repositories=trace` | Trace-level for repo ops, info elsewhere | |
| 82 | +| `error` | Only errors | |
| 83 | + |
| 84 | +--- |
| 85 | + |
| 86 | +## Use Case 2: JSON File Logging |
| 87 | + |
| 88 | +Opt in by setting `OXEN_LOG_FILE`. A rolling daily log file is written in |
| 89 | +JSON-per-line format using `tracing-appender`. Stderr output continues as normal. |
| 90 | + |
| 91 | +### Enable with default directory |
| 92 | + |
| 93 | +```bash |
| 94 | +# Logs written to ~/.oxen/logs/oxen.YYYY-MM-DD |
| 95 | +OXEN_LOG_FILE=1 oxen push origin main |
| 96 | + |
| 97 | +# Same for the server — files named oxen-server.YYYY-MM-DD |
| 98 | +OXEN_LOG_FILE=true oxen-server start |
| 99 | +``` |
| 100 | + |
| 101 | +### Enable with custom directory |
| 102 | + |
| 103 | +```bash |
| 104 | +OXEN_LOG_FILE=/var/log/oxen oxen-server start |
| 105 | +# Logs written to /var/log/oxen/oxen-server.YYYY-MM-DD |
| 106 | +``` |
| 107 | + |
| 108 | +The directory is created automatically if it does not exist. |
| 109 | + |
| 110 | +### Sample JSON output |
| 111 | + |
| 112 | +Each line is a self-contained JSON object: |
| 113 | + |
| 114 | +```json |
| 115 | +{ |
| 116 | + "timestamp": "2026-03-09T12:00:01.234567Z", |
| 117 | + "level": "INFO", |
| 118 | + "target": "liboxen::repositories::add", |
| 119 | + "filename": "src/lib/src/repositories/add.rs", |
| 120 | + "line_number": 38, |
| 121 | + "threadId": 1, |
| 122 | + "fields": { |
| 123 | + "message": "staging files", |
| 124 | + "path": "images/" |
| 125 | + }, |
| 126 | + "span": { |
| 127 | + "name": "add" |
| 128 | + } |
| 129 | +} |
| 130 | +``` |
| 131 | + |
| 132 | +### Inspecting logs with jq |
| 133 | + |
| 134 | +```bash |
| 135 | +# Stream all errors from today's log |
| 136 | +cat ~/.oxen/logs/oxen-server.2026-03-09 | jq 'select(.level == "ERROR")' |
| 137 | + |
| 138 | +# Count log entries per level |
| 139 | +cat ~/.oxen/logs/oxen-server.2026-03-09 | jq -r '.level' | sort | uniq -c |
| 140 | + |
| 141 | +# Show all entries from a specific module |
| 142 | +cat ~/.oxen/logs/oxen-server.2026-03-09 | jq 'select(.target | startswith("liboxen::repositories::push"))' |
| 143 | + |
| 144 | +# Extract timing histograms recorded in spans |
| 145 | +cat ~/.oxen/logs/oxen-server.2026-03-09 | jq 'select(.fields.message | contains("duration"))' |
| 146 | +``` |
| 147 | + |
| 148 | +### Log rotation |
| 149 | + |
| 150 | +Files roll daily by filename suffix (e.g. `oxen-server.2026-03-09`, |
| 151 | +`oxen-server.2026-03-10`). Old files are **not** automatically deleted — set up |
| 152 | +your own rotation policy (e.g. `logrotate`, a cron job, or manual cleanup). |
| 153 | + |
| 154 | +--- |
| 155 | + |
| 156 | +## Use Case 3: Prometheus Metrics (Server) |
| 157 | + |
| 158 | +The server automatically starts a Prometheus HTTP exporter on startup. No |
| 159 | +additional configuration is needed for the default port. |
| 160 | + |
| 161 | +### Default (port 9090) |
| 162 | + |
| 163 | +```bash |
| 164 | +oxen-server start |
| 165 | +# Metrics available at http://localhost:9090/metrics |
| 166 | +``` |
| 167 | + |
| 168 | +### Custom port |
| 169 | + |
| 170 | +```bash |
| 171 | +OXEN_METRICS_PORT=9191 oxen-server start |
| 172 | +# Metrics available at http://localhost:9191/metrics |
| 173 | +``` |
| 174 | + |
| 175 | +### Verify with curl |
| 176 | + |
| 177 | +```bash |
| 178 | +curl -s http://localhost:9090/metrics | head -20 |
| 179 | +``` |
| 180 | + |
| 181 | +### Sample metric output |
| 182 | + |
| 183 | +``` |
| 184 | +# HELP oxen_server_commits_upload_total oxen_server_commits_upload_total |
| 185 | +# TYPE oxen_server_commits_upload_total counter |
| 186 | +oxen_server_commits_upload_total 42 |
| 187 | +
|
| 188 | +# HELP oxen_server_commits_upload_duration_seconds oxen_server_commits_upload_duration_seconds |
| 189 | +# TYPE oxen_server_commits_upload_duration_seconds summary |
| 190 | +oxen_server_commits_upload_duration_seconds{quantile="0.5"} 1.234 |
| 191 | +oxen_server_commits_upload_duration_seconds{quantile="0.9"} 3.456 |
| 192 | +oxen_server_commits_upload_duration_seconds{quantile="0.99"} 7.890 |
| 193 | +oxen_server_commits_upload_duration_seconds_sum 123.456 |
| 194 | +oxen_server_commits_upload_duration_seconds_count 42 |
| 195 | +
|
| 196 | +# HELP oxen_errors_total oxen_errors_total |
| 197 | +# TYPE oxen_errors_total counter |
| 198 | +oxen_errors_total{module="repositories::push",error="RemoteBranchNotFound"} 3 |
| 199 | +``` |
| 200 | + |
| 201 | +### Prometheus scrape config |
| 202 | + |
| 203 | +Add a scrape target to your `prometheus.yml`: |
| 204 | + |
| 205 | +```yaml |
| 206 | +scrape_configs: |
| 207 | + - job_name: "oxen-server" |
| 208 | + scrape_interval: 15s |
| 209 | + static_configs: |
| 210 | + - targets: ["localhost:9090"] |
| 211 | +``` |
| 212 | +
|
| 213 | +--- |
| 214 | +
|
| 215 | +## Metrics Without a Collection Endpoint (CLI) |
| 216 | +
|
| 217 | +The CLI calls `init_metrics_noop()` at startup. The Rust `metrics` crate's |
| 218 | +default behavior when no recorder is installed is to silently discard all |
| 219 | +counter/histogram calls. This means: |
| 220 | + |
| 221 | +- **Zero runtime overhead** — macro calls compile to no-ops when no recorder exists. |
| 222 | +- **No port is opened** — the CLI never listens on a network socket for metrics. |
| 223 | +- All the same `metrics::counter!` / `metrics::histogram!` instrumentation in |
| 224 | + the shared `liboxen` library simply does nothing when invoked from the CLI. |
| 225 | + |
| 226 | +There is nothing to configure for this case. |
| 227 | + |
| 228 | +--- |
| 229 | + |
| 230 | +## Metric Naming Reference |
| 231 | + |
| 232 | +All metrics follow the pattern `oxen_{component}_{operation}_total` (counters) |
| 233 | +or `oxen_{component}_{operation}_duration_seconds` (histograms). |
| 234 | + |
| 235 | +### Counters (`_total`) |
| 236 | + |
| 237 | +| Prefix | Component | Examples | |
| 238 | +|--------|-----------|---------| |
| 239 | +| `oxen_cli_*` | CLI commands | `oxen_cli_branch_total`, `oxen_cli_delete_remote_total` | |
| 240 | +| `oxen_repo_init_*` | Repository init | `oxen_repo_init_init_total`, `oxen_repo_init_init_with_version_total` | |
| 241 | +| `oxen_repo_commit_*` | Commits | `oxen_repo_commit_commit_total`, `oxen_repo_commit_head_commit_total`, `oxen_repo_commit_list_total` | |
| 242 | +| `oxen_repo_clone_*` | Clone | `oxen_repo_clone_clone_total`, `oxen_repo_clone_deep_clone_url_total` | |
| 243 | +| `oxen_repo_checkout_*` | Checkout | `oxen_repo_checkout_checkout_total`, `oxen_repo_checkout_checkout_theirs_total` | |
| 244 | +| `oxen_repo_status_*` | Status | `oxen_repo_status_status_total`, `oxen_repo_status_status_from_opts_total` | |
| 245 | +| `oxen_repo_diff_*` | Diffs | `oxen_repo_diff_diff_total`, `oxen_repo_diff_diff_commits_total`, `oxen_repo_diff_tabular_total` | |
| 246 | +| `oxen_repo_add_*` | Staging | `oxen_repo_add_add_total` | |
| 247 | +| `oxen_repo_push_*` | Push | `oxen_repo_push_push_total` | |
| 248 | +| `oxen_repo_pull_*` | Pull | `oxen_repo_pull_pull_total` | |
| 249 | +| `oxen_repo_merge_*` | Merge | `oxen_repo_merge_merge_total` | |
| 250 | +| `oxen_repo_entries_*` | Entry listing | `oxen_repo_entries_get_directory_total`, `oxen_repo_entries_list_directory_total` | |
| 251 | +| `oxen_repo_tree_*` | Tree operations | `oxen_repo_tree_get_root_total` | |
| 252 | +| `oxen_merkle_*` | Merkle tree internals | `oxen_merkle_from_commit_total`, `oxen_merkle_load_children_total` | |
| 253 | +| `oxen_client_*` | API client (HTTP calls) | `oxen_client_commits_post_push_complete_total`, `oxen_client_entries_download_entry_total` | |
| 254 | +| `oxen_server_commits_*` | Server commit endpoints | `oxen_server_commits_upload_total`, `oxen_server_commits_create_total` | |
| 255 | +| `oxen_server_repositories_*` | Server repo endpoints | `oxen_server_repositories_create_total`, `oxen_server_repositories_delete_total` | |
| 256 | +| `oxen_server_branches_*` | Server branch endpoints | `oxen_server_branches_index_total`, `oxen_server_branches_create_total` | |
| 257 | +| `oxen_server_workspaces_*` | Server workspace endpoints | `oxen_server_workspaces_create_total`, `oxen_server_workspaces_commit_total` | |
| 258 | +| `oxen_server_tree_*` | Server tree endpoints | `oxen_server_tree_create_nodes_total`, `oxen_server_tree_download_tree_total` | |
| 259 | +| `oxen_server_diff_*` | Server diff endpoints | `oxen_server_diff_commits_total`, `oxen_server_diff_file_total` | |
| 260 | +| `oxen_server_file_*` | Server file endpoints | `oxen_server_file_get_total`, `oxen_server_file_put_total` | |
| 261 | +| `oxen_server_merger_*` | Server merge endpoint | `oxen_server_merger_merge_total` | |
| 262 | +| `oxen_errors_total` | Error tracking | Labels: `module`, `error` — e.g. `oxen_errors_total{module="repositories::push",error="RemoteBranchNotFound"}` | |
| 263 | + |
| 264 | +### Histograms (`_duration_seconds`) |
| 265 | + |
| 266 | +| Metric | What it measures | |
| 267 | +|--------|-----------------| |
| 268 | +| `oxen_repo_add_add_duration_seconds` | Time to stage files | |
| 269 | +| `oxen_repo_commit_commit_duration_seconds` | Time to create a commit | |
| 270 | +| `oxen_repo_clone_clone_duration_seconds` | Time to clone a repository | |
| 271 | +| `oxen_repo_checkout_checkout_duration_seconds` | Time to checkout a branch/commit | |
| 272 | +| `oxen_repo_push_push_duration_seconds` | Time to push to remote | |
| 273 | +| `oxen_repo_pull_pull_duration_seconds` | Time to pull from remote | |
| 274 | +| `oxen_repo_merge_merge_duration_seconds` | Time to merge branches | |
| 275 | +| `oxen_repo_tree_*_duration_seconds` | Various tree operations (get root, compress, unpack, write) | |
| 276 | +| `oxen_merkle_from_commit_duration_seconds` | Time to build merkle tree from commit | |
| 277 | +| `oxen_client_import_upload_zip_duration_seconds` | Time to upload a zip import | |
| 278 | +| `oxen_client_versions_*_duration_seconds` | Upload/download version file durations | |
| 279 | +| `oxen_client_workspaces_files_*_duration_seconds` | Workspace file staging/download durations | |
| 280 | +| `oxen_server_commits_upload_duration_seconds` | Time to process a commit upload | |
| 281 | +| `oxen_server_commits_upload_tree_duration_seconds` | Time to upload tree data | |
| 282 | +| `oxen_server_file_put_duration_seconds` | Time to write a file via API | |
| 283 | +| `oxen_server_merger_merge_duration_seconds` | Time to process a server-side merge | |
| 284 | +| `oxen_server_workspaces_commit_duration_seconds` | Time to commit a workspace | |
| 285 | +| `oxen_server_prune_prune_duration_seconds` | Time to prune a repository | |
| 286 | +| `oxen_server_versions_*_duration_seconds` | Version upload/chunk durations | |
| 287 | +| `oxen_server_export_download_zip_duration_seconds` | Time to export a zip download | |
| 288 | +| `oxen_server_branches_maybe_create_merge_duration_seconds` | Time to create a merge branch | |
| 289 | +| `oxen_server_import_*_duration_seconds` | Import processing durations | |
| 290 | + |
| 291 | +--- |
| 292 | + |
| 293 | +## Troubleshooting |
| 294 | + |
| 295 | +### No log output appears |
| 296 | + |
| 297 | +- Ensure `RUST_LOG` is set to a level that includes the messages you expect |
| 298 | + (e.g. `debug` or `info`). The default is `info`. |
| 299 | +- Logs go to **stderr**, not stdout. If piping output, make sure you're |
| 300 | + capturing stderr: `oxen status 2>log.txt` |
| 301 | + |
| 302 | +### JSON log files are empty |
| 303 | + |
| 304 | +- Verify `OXEN_LOG_FILE` is set **before** the process starts (not after). |
| 305 | +- Check the directory exists and is writable. The directory is auto-created, but |
| 306 | + parent directories must already exist. |
| 307 | +- The non-blocking writer flushes on process exit. If the process is killed with |
| 308 | + `SIGKILL`, the last few log lines may be lost. Use `SIGTERM` for graceful shutdown. |
| 309 | + |
| 310 | +### Metrics endpoint returns nothing / connection refused |
| 311 | + |
| 312 | +- The Prometheus exporter only runs in `oxen-server`, not the CLI. |
| 313 | +- Verify the port: `curl -s http://localhost:9090/metrics`. If you changed the |
| 314 | + port via `OXEN_METRICS_PORT`, use that port instead. |
| 315 | +- Check for port conflicts — another process may already be listening on that port. |
| 316 | + Use `lsof -i :9090` to check. |
| 317 | + |
| 318 | +### Metrics show all zeros |
| 319 | + |
| 320 | +- Counters start at zero and only increment when the corresponding code path |
| 321 | + executes. If no requests have been made, counters will be zero. |
| 322 | +- Histograms only appear after the first observation is recorded. |
| 323 | + |
| 324 | +### Too much log output / performance concerns |
| 325 | + |
| 326 | +- Use a targeted filter: `RUST_LOG="warn,liboxen::repositories=info"` to limit |
| 327 | + verbosity to specific modules. |
| 328 | +- File logging uses a non-blocking writer with an in-memory buffer, so it has |
| 329 | + minimal impact on request latency. |
| 330 | +- Metrics collection via the `metrics` crate uses atomic operations and has |
| 331 | + negligible overhead. |
0 commit comments