@bkonkle bkonkle commented Nov 17, 2025

Summary

Adds an experimental OpenTelemetry (OTLP) observability pipeline to Turborepo for exporting run and task metrics to any OTLP-compatible collector.

Key characteristics:

  • Opt-in via turbo.json (gated behind futureFlags.experimentalObservability)
  • Env vars and CLI flags bypass the future flag, enabling experimentation without config changes
  • Extensible design allows adding non-OTEL backends later without breaking the config shape

Motivation

Turborepo already computes rich run summaries and per-task metadata (durations, cache status, SCM info), but until now these have only been available locally via --dry=json or --summarize. This PR enables sending those metrics to external collectors for long-term analysis, alerting, and correlation with other telemetry.


Architecture

| Component | Purpose |
| --- | --- |
| observability module | Defines RunObserver trait and Handle abstraction for pluggable backends |
| turborepo-otel crate | OTLP implementation wrapping opentelemetry + opentelemetry-otlp |
| ExperimentalObservabilityOptions | Config wrapper with otel field (room for future backends) |

Configuration

turbo.json (requires future flag)

{
  "futureFlags": { "experimentalObservability": true },
  "experimentalObservability": {
    "otel": {
      "enabled": true,
      "endpoint": "https://collector.example.com",
      "protocol": "grpc",           // or "http/protobuf"
      "headers": { "X-API-Key": "..." },
      "timeoutMs": 10000,
      "resource": { "service.name": "my-monorepo" },
      "metrics": { "runSummary": true, "taskDetails": false },
      "useRemoteCacheToken": true   // reuse remote cache auth
    }
  }
}

Environment Variables (no future flag required)

| Variable | Description |
| --- | --- |
| TURBO_EXPERIMENTAL_OTEL_ENABLED | 1/0/true/false |
| TURBO_EXPERIMENTAL_OTEL_ENDPOINT | Collector URL |
| TURBO_EXPERIMENTAL_OTEL_PROTOCOL | grpc or http/protobuf |
| TURBO_EXPERIMENTAL_OTEL_TIMEOUT_MS | Timeout in ms |
| TURBO_EXPERIMENTAL_OTEL_HEADERS | Comma-separated key=value |
| TURBO_EXPERIMENTAL_OTEL_RESOURCE | Comma-separated key=value |
| TURBO_EXPERIMENTAL_OTEL_METRICS_RUN_SUMMARY | Enable run metrics |
| TURBO_EXPERIMENTAL_OTEL_METRICS_TASK_DETAILS | Enable task metrics |
| TURBO_EXPERIMENTAL_OTEL_USE_REMOTE_CACHE_TOKEN | Reuse cache auth |
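
As a rough illustration of the comma-separated key=value format used by TURBO_EXPERIMENTAL_OTEL_HEADERS and TURBO_EXPERIMENTAL_OTEL_RESOURCE, a parser could look like the sketch below. The function name `parse_key_value_list` and its handling of whitespace, trailing commas, and `=` inside values are assumptions for illustration, not the PR's actual implementation.

```rust
use std::collections::BTreeMap;

/// Parses a comma-separated `key=value` list such as `a=1,b=2`.
/// Hypothetical helper: the PR's real parser may treat whitespace,
/// duplicates, or escaping differently.
fn parse_key_value_list(raw: &str) -> Result<BTreeMap<String, String>, String> {
    let mut out = BTreeMap::new();
    for entry in raw.split(',') {
        let entry = entry.trim();
        if entry.is_empty() {
            continue; // tolerate trailing commas and empty input
        }
        // Split on the first `=` only, so values may themselves contain `=`.
        let (key, value) = entry
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got `{entry}`"))?;
        let key = key.trim();
        if key.is_empty() {
            return Err(format!("empty key in `{entry}`"));
        }
        out.insert(key.to_string(), value.trim().to_string());
    }
    Ok(out)
}
```

Returning an error for a malformed entry, rather than skipping it, lines up with the "Configuration error reported" behavior described for invalid config.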

CLI Flags (no future flag required)

--experimental-otel-{enabled,endpoint,protocol,timeout-ms,header,resource,metrics-run-summary,metrics-task-details,use-remote-cache-token}


Metrics Exported

Run-level (metrics.runSummary, default: true):

  • Duration, task counts (attempted/failed/cached), exit code
  • SCM branch and revision
  • Attributes: turbo.run.id, turbo.version, turbo.scm.*

Task-level (metrics.taskDetails, default: false):

  • Execution duration, cache hit/miss, time saved, exit code
  • Attributes: turbo.task.{id,name,package,hash,command}

useRemoteCacheToken

When enabled, automatically adds Authorization: Bearer <token> using your existing remote cache credentials (turbo login or TURBO_TOKEN). Existing Authorization headers are preserved.
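
The "existing headers are preserved" rule can be sketched as follows. The helper name `apply_remote_cache_token` and its parameters are hypothetical, and the case-insensitive header comparison is an assumption about how a conflict would be detected.

```rust
use std::collections::BTreeMap;

/// Adds `Authorization: Bearer <token>` unless the caller already set an
/// Authorization header. Header names are compared case-insensitively.
/// Hypothetical helper, not the PR's actual function.
fn apply_remote_cache_token(
    headers: &mut BTreeMap<String, String>,
    use_remote_cache_token: bool,
    token: Option<&str>,
) {
    if !use_remote_cache_token {
        return;
    }
    let Some(token) = token else {
        return; // no credentials available, nothing to add
    };
    let already_set = headers
        .keys()
        .any(|name| name.eq_ignore_ascii_case("authorization"));
    if !already_set {
        headers.insert("Authorization".to_string(), format!("Bearer {token}"));
    }
}
```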


Run Lifecycle Integration

  1. RunBuilder reads opts.experimental_observability and calls Handle::try_init
  2. On run completion, RunSummary records metrics via the handle
  3. shutdown() flushes buffered data before exit
  4. Dry runs skip OTEL entirely
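
The lifecycle steps above can be sketched roughly as below. Only the names RunObserver, Handle, try_init, and shutdown come from this description; every signature, the `RunSummary` fields, and the `dry_run` parameter are simplified assumptions and will not match the real code.

```rust
/// Summary data recorded at the end of a run (hypothetical, heavily reduced).
struct RunSummary {
    duration_ms: u64,
    exit_code: i32,
}

/// Backend interface; the PR's real `RunObserver` trait may differ.
trait RunObserver {
    fn record_run(&mut self, summary: &RunSummary);
    fn shutdown(&mut self) -> Result<(), String>;
}

/// Owns an optional backend so callers never need to check whether
/// observability is active.
struct Handle {
    observer: Option<Box<dyn RunObserver>>,
}

impl Handle {
    /// Returns an inert handle for dry runs or when nothing is configured.
    fn try_init(configured: Option<Box<dyn RunObserver>>, dry_run: bool) -> Handle {
        Handle {
            observer: if dry_run { None } else { configured },
        }
    }

    /// Forwards the summary to the backend, if there is one.
    fn record_run(&mut self, summary: &RunSummary) {
        if let Some(observer) = self.observer.as_mut() {
            observer.record_run(summary);
        }
    }

    /// Flushes buffered data; errors are logged and never change the exit code.
    fn shutdown(&mut self) {
        if let Some(mut observer) = self.observer.take() {
            if let Err(err) = observer.shutdown() {
                eprintln!("warning: observability shutdown failed: {err}");
            }
        }
    }
}
```

Keeping the backend behind an `Option` means the unconfigured and dry-run paths make no calls at all.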

Failure Behavior

| Scenario | Behavior |
| --- | --- |
| Invalid config (bad protocol, malformed headers) | Configuration error reported |
| Exporter init fails | Warning logged, run continues without observability |
| Shutdown errors | Logged, does not affect exit code |
| Feature not configured | No network calls, no behavior change |

Compile-time gating: The otel Cargo feature controls inclusion; builds without it treat observability config as a no-op.


Quick Start

  1. Point to an OTLP collector
  2. Enable via turbo.json + future flag, or use env vars/CLI flags directly
  3. Optionally enable useRemoteCacheToken to reuse cache auth
  4. Start with runSummary: true, add taskDetails when needed

@turbo-orchestrator turbo-orchestrator bot added the area: docs Improvements or additions to documentation label Nov 17, 2025
Contributor

vercel bot commented Nov 17, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Updated (UTC) |
| --- | --- | --- |
| examples-basic-web | Ready | Feb 3, 2026 8:49pm |
| examples-designsystem-docs | Ready | Feb 3, 2026 8:49pm |
| examples-gatsby-web | Ready | Feb 3, 2026 8:49pm |
| examples-kitchensink-blog | Ready | Feb 3, 2026 8:49pm |
| examples-nonmonorepo | Ready | Feb 3, 2026 8:49pm |
| examples-svelte-web | Ready | Feb 3, 2026 8:49pm |
| examples-tailwind-web | Ready | Feb 3, 2026 8:49pm |
| examples-vite-web | Ready | Feb 3, 2026 8:49pm |
| turbo-site | Ready | Feb 3, 2026 8:49pm |
| turborepo-test-coverage | Ready | Feb 3, 2026 8:49pm |

Contributor

@vercel vercel bot left a comment

🔧 Build Fix:

Two code blocks are missing required title attributes. The documentation framework requires all code blocks to have titles. The blocks at line 789 (under experimentalObservability.otel.headers) and line 812 (under experimentalObservability.otel.resource) both use the opening marker ```jsonc without a title attribute, causing the build to fail.

📝 Patch Details
diff --git a/docs/site/content/docs/reference/configuration.mdx b/docs/site/content/docs/reference/configuration.mdx
index ccce2e05e..1cb37964c 100644
--- a/docs/site/content/docs/reference/configuration.mdx
+++ b/docs/site/content/docs/reference/configuration.mdx
@@ -786,7 +786,7 @@ The OTLP collector endpoint URL. For example:
 
 Optional HTTP headers to include with export requests. Useful for authentication (e.g., API keys) or custom metadata.
 
-```jsonc
+```jsonc title="./turbo.json"
 {
   "experimentalObservability": {
     "otel": {
@@ -809,7 +809,7 @@ Timeout in milliseconds for export requests to the collector.
 
 Optional resource attributes to attach to all exported metrics. These help identify the source of metrics in your observability platform.
 
-```jsonc
+```jsonc title="./turbo.json"
 {
   "experimentalObservability": {
     "otel": {

Analysis

Missing code block titles in documentation

What fails: Next.js build fails during static page generation for the /docs/reference/configuration page. The MDX renderer requires all code blocks to have title attributes.

How to reproduce:

cd docs/site
pnpm run build

Result before fix:

Error occurred prerendering page "/docs/reference/configuration"
Error: Code blocks must have titles. If you are creating a terminal, use "Terminal" for the title. Else, add a file path name.

Result after fix:

✓ Compiled successfully in 32.7s
[Build completes successfully with all 236 pages generated]

Root cause: Two code blocks in docs/site/content/docs/reference/configuration.mdx were missing the required title attribute:

  • Line 789: Code block under experimentalObservability.otel.headers
  • Line 812: Code block under experimentalObservability.otel.resource

Both blocks were changed from ```jsonc to ```jsonc title="./turbo.json" to match the pattern used throughout the documentation.

@turbo-orchestrator turbo-orchestrator bot added the area: site Issues and improvements related to Turborepo's documentation website label Nov 17, 2025
@vercel vercel bot temporarily deployed to Preview – turborepo-test-coverage January 21, 2026 17:38 Inactive
Contributor

github-actions bot commented Jan 21, 2026

Coverage Report

Metric Coverage
Lines 74.86%
Functions 72.30%
Branches 0.00%

The @turbo/repository test job was failing on macOS due to OOM during
Rust compilation. The combination of CARGO_BUILD_JOBS (default: num CPUs)
and -Zthreads=8 (parallel frontend) caused excessive memory usage.

Limit to 2 parallel crate compilations on macOS to reduce memory pressure.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

macOS ARM runners have limited memory (~7GB). Reduce rustc frontend
threads from 8 to 4 for the native library build to prevent OOM.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

The previous OOM fix attempts set RUSTFLAGS and CARGO_BUILD_JOBS in the
workflow, but turbo's --env-mode=strict was filtering them out because
they weren't in globalPassThroughEnv.

This adds both variables to globalPassThroughEnv in turbo.json and
combines both memory reduction strategies:
- CARGO_BUILD_JOBS=2: Limits parallel crate compilation
- RUSTFLAGS with -Zthreads=4: Reduces rustc frontend parallelism

Co-Authored-By: Claude Opus 4.5 <[email protected]>

GitHub Actions expressions like `${{ condition && 'value' || '' }}`
set empty strings on non-matching platforms, which breaks cargo:
- Empty CARGO_BUILD_JOBS causes "could not parse ''" error
- Empty RUSTFLAGS overrides .cargo/config.toml entirely

Use shell conditionals to only export these vars on macOS, leaving
them unset on other platforms.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

Adding these to the task's `env` array (not just globalPassThroughEnv)
ensures they:
1. Are included in the task hash calculation
2. Bust the cached failure from the previous broken run
3. Properly invalidate cache when these values change

globalPassThroughEnv passes vars through but doesn't affect the hash.
The task-level env array is needed for proper cache invalidation.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

@vercel vercel bot temporarily deployed to Preview – turborepo-test-coverage January 21, 2026 22:00 Inactive
package: task.package.clone(),
hash: task.shared.hash.clone(),
external_inputs_hash: task.shared.hash_of_external_dependencies.clone(),
command: task.shared.command.clone(),
Contributor

@anthonyshew anthonyshew Jan 26, 2026

There is a minor risk here that sensitive values could be sent into a collector. Take, for instance, a command like:

turbo run build -- --some-value=$MY_SECRET_TOKEN

Have we checked what this would capture? The shell-injected string or the raw input? I think we need to constrain to the raw input.

Author

Good callout! I'll investigate.

Author

It looks like task.shared.command itself comes from the package.json, pulled from the scripts property. So, it should be the plain string and shouldn't be the interpolated value.

As for -- passthrough arguments, they go through a different property (task.shared.cli_arguments), which isn't captured anywhere in the observability/otel.rs module yet. If it were, this concern would apply, since the shell would likely substitute the value before turbo saw it.

Comment on lines +284 to +289
attrs.push(KeyValue::new("turbo.task.hash", task.hash.clone()));
attrs.push(KeyValue::new(
    "turbo.task.external_inputs_hash",
    task.external_inputs_hash.clone(),
));
attrs.push(KeyValue::new("turbo.task.command", task.command.clone()));
Contributor

The hashes have unbounded cardinality, and I'm not sure what analytical value they bring. Aggregations over them would generally be meaningless, since the hash value itself carries no meaning.

What do you think?

Contributor

Does the same problem also exist for turbo.task.id?

turbo.scm.revision seems high cardinality but has a genuine use case of aggregating over a hash, right? turbo.scm.branch would be lower cardinality...But seems like users will want to aggregate over both of these.

Contributor

Could we use resource attributes or exemplars meaningfully here? One consideration is that different OTEL backends handle these differently, so the helpfulness is a bit undefined.

Contributor

(Sorry, captured the turbo.task.command line in this comment thread but didn't mean to - and now I've got a whole thread with myself going. 😄 Please ignore.)

Author

Definitely a good thing to think through. I think yes, revision and branch are useful even with the high cardinality, but maybe a flag to turn them off if desired? The hash and external_inputs_hash and task.id attributes might be too random, though, and should be opt-in instead? 🤔
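
To make the opt-in idea concrete, here is a minimal sketch. The option names `include_task_id` and `include_task_hash` and both struct types are hypothetical, since no such settings exist in this PR, and plain tuples stand in for the opentelemetry KeyValue type to keep the example dependency-free.

```rust
/// Task fields relevant to attribute selection (hypothetical subset).
struct TaskInfo {
    id: String,
    name: String,
    package: String,
    hash: String,
}

/// Hypothetical opt-in switches; the PR does not define these options.
#[derive(Default)]
struct AttributeOptions {
    include_task_id: bool,
    include_task_hash: bool,
}

/// Builds metric attributes, emitting high-cardinality values only on opt-in.
fn task_attributes(task: &TaskInfo, opts: &AttributeOptions) -> Vec<(&'static str, String)> {
    // Low-cardinality attributes are always attached.
    let mut attrs = vec![
        ("turbo.task.name", task.name.clone()),
        ("turbo.task.package", task.package.clone()),
    ];
    if opts.include_task_id {
        attrs.push(("turbo.task.id", task.id.clone()));
    }
    if opts.include_task_hash {
        attrs.push(("turbo.task.hash", task.hash.clone()));
    }
    attrs
}
```

With `AttributeOptions::default()`, only the name and package attributes are produced, so the default time series stay bounded by the number of tasks in the repo.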

Comment on lines +488 to +497
// Warn if observability config is present but the feature flag is not enabled
if let Some(obs_opts) = &self.opts.experimental_observability {
    if obs_opts.otel.is_some() && !self.opts.future_flags.experimental_observability {
        tracing::warn!(
            "experimentalObservability.otel is configured but \
             futureFlags.experimentalObservability is not enabled in turbo.json. The \
             observability config will be ignored."
        );
    }
}
Contributor

The futureFlags are meant to be hard gates, not warnings. Easy change!

Author

@bkonkle bkonkle Feb 3, 2026

I did some thinking around this last year and asked Claude to look at prior futureFlag usages, but I wasn't confident and thought you might call this out. 😄 For turbo.json config, this warning should be unreachable when the future flag is disabled, because of the hard gate here. It therefore only applies when the config comes from CLI flags or env vars (which typically bypass the turbo.json gate since they're already prefixed with EXPERIMENTAL_), not from turbo.json.

Should I remove the warning anyway to eliminate confusion, or just add a note to the comments indicating when it would actually be used?
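
For reference, the gating rule as described in this thread could be modeled like this. The `ConfigSource` enum and `observability_active` function are illustrative assumptions, not code from the PR.

```rust
/// Where an observability setting came from.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ConfigSource {
    TurboJson,
    EnvOrCli,
}

/// Decides whether OTEL config should take effect (hypothetical sketch).
/// turbo.json config is hard-gated on the future flag; env vars and CLI
/// flags bypass it because their names already carry an experimental prefix.
fn observability_active(source: Option<ConfigSource>, future_flag_enabled: bool) -> bool {
    match source {
        None => false,
        Some(ConfigSource::EnvOrCli) => true,
        Some(ConfigSource::TurboJson) => future_flag_enabled,
    }
}
```

Under this model there is no case where turbo.json config is present, the flag is off, and the config still reaches the exporter.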

serde_json = { workspace = true, optional = true }
thiserror = { workspace = true }
tokio = { workspace = true, features = ["full"] }
tonic = "0.14"
Contributor

Ultra-nit: This is a different version of tonic than is used in turborepo-lib. Would love to use a consistent version if we can. @anthonyshew, can you bump the tonic version in turborepo-lib forward to accommodate?

Author

@bkonkle bkonkle Feb 3, 2026

@anthonyshew Were you able to look into turborepo-lib, or should I start a new PR for it (as a prerequisite for this one)?

};

let reader = periodic_reader_with_async_runtime::PeriodicReader::builder(exporter, Tokio)
    .with_interval(Duration::from_secs(15))
Contributor

This is hardcoded - which might be fine. Any reason we should make it configurable?

Author

I don't know how often people would actually configure it, but since timeout_ms is already configurable this probably should be also. 👍 I'll update it.
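
One possible shape for that change is sketched below. `interval_ms` is a hypothetical option name (nothing in this PR defines it yet), and falling back to the default on zero is an assumption.

```rust
use std::time::Duration;

/// Default matching the interval hardcoded in the snippet under review.
const DEFAULT_EXPORT_INTERVAL: Duration = Duration::from_secs(15);

/// Resolves the export interval from an optional millisecond setting.
/// `interval_ms` is a hypothetical option name; zero falls back to the default.
fn export_interval(interval_ms: Option<u64>) -> Duration {
    match interval_ms {
        Some(ms) if ms > 0 => Duration::from_millis(ms),
        _ => DEFAULT_EXPORT_INTERVAL,
    }
}
```

The result could then be passed to `.with_interval(...)` in place of the literal.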

Comment on lines +485 to +490
# macOS ARM runners have limited memory (~7GB). Limit parallel crate
# compilation and reduce rustc frontend threads to prevent OOM.
if [ "${{ matrix.os.name }}" == "macos" ]; then
  export CARGO_BUILD_JOBS=2
  export RUSTFLAGS='--cfg tokio_unstable -Zshare-generics=y -Zthreads=4 -Csymbol-mangling-version=v0'
fi
Contributor

I'd love to believe that this change is not required. 😄

Author

Unfortunately, it was consistently failing for me until I iterated to this solution. 😭 I'm glad to remove the workaround and see if you can spot something I missed? 😅

@turbo-orchestrator turbo-orchestrator bot removed area: docs Improvements or additions to documentation area: site Issues and improvements related to Turborepo's documentation website labels Feb 3, 2026
Labels

area: ci Internal CI for vercel/turborepo area: logging Improvements to logging pkg: turbo-repository

3 participants