Opentelemetry blog post #786
Open: shikokuchuo wants to merge 12 commits into main from opentelemetry
12 commits
b12b04b Opentelemetry blog post (shikokuchuo)
8b19eba More focused title (shikokuchuo)
624ec28 Replace transparent image background (shikokuchuo)
b18423e Use aurora thumbnails (shikokuchuo)
5772457 Add how to set env vars on Posit Connect link (shikokuchuo)
3d84545 More robust thumbnails script (shikokuchuo)
c5229e5 Apply suggestions from code review (shikokuchuo)
46b6f47 Apply suggestions from @gaborcsardi (shikokuchuo)
d3a16f6 Expand description (shikokuchuo)
cd3fe9d Content-aware image cropping (shikokuchuo)
0b97c90 Revised thumbnails (shikokuchuo)
f038b1e Add Posit Connect feature (shikokuchuo)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,192 @@ | ||
| --- | ||
| output: hugodown::hugo_document | ||
| slug: opentelemetry | ||
| title: "Bringing OpenTelemetry to R in production" | ||
| date: 2026-02-16 | ||
| author: Charlie Gao, Aaron Jacobs, Barret Schloerke, Gábor Csárdi | ||
| description: > | ||
| Posit has instrumented Shiny, plumber2, mirai, httr2, ellmer, knitr, | ||
| testthat, and DBI with OpenTelemetry, and created tools for you to | ||
| instrument your own packages, bringing production-grade observability to R. | ||
| photo: | ||
| url: https://unsplash.com/photos/LtnPejWDSAY | ||
| author: Lightscape | ||
| categories: [other] | ||
| tags: [otel, shiny, plumber2, mirai, httr2, ellmer, knitr, testthat, DBI] | ||
| --- | ||
|
|
||
| <!-- | ||
| TODO: | ||
| * [x] Look over / edit the post's title in the yaml | ||
| * [x] Edit (or delete) the description; note this appears in the Twitter card | ||
| * [x] Pick category and tags (see existing with [`hugodown::tidy_show_meta()`](https://rdrr.io/pkg/hugodown/man/use_tidy_post.html)) | ||
| * [x] Find photo & update yaml metadata | ||
| * [x] Create `thumbnail-sq.jpg`; height and width should be equal | ||
| * [x] Create `thumbnail-wd.jpg`; width should be >5x height | ||
| * [x] [`hugodown::use_tidy_thumbnails()`](https://rdrr.io/pkg/hugodown/man/use_tidy_post.html) | ||
| * [x] Add intro sentence, e.g. the standard tagline for the package | ||
| * [ ] [`usethis::use_tidy_thanks()`](https://usethis.r-lib.org/reference/use_tidy_thanks.html) | ||
| --> | ||
|
|
||
| We're bringing [OpenTelemetry](https://opentelemetry.io/) to R. As a Posit-wide initiative across our open source packages, we've instrumented some of the most widely used R packages for production workloads -- [Shiny](https://shiny.posit.co/), [plumber2](https://plumber2.posit.co/), [mirai](https://mirai.r-lib.org), [httr2](https://httr2.r-lib.org), [ellmer](https://ellmer.tidyverse.org), [knitr](https://yihui.org/knitr/), [testthat](https://testthat.r-lib.org), and [DBI](https://dbi.r-lib.org) -- so that you can add observability to your R applications with **no code changes**. Set a few environment variables and you get traces, logs, and metrics flowing to the backend of your choice. | ||
|
|
||
| This is part of our commitment to R in production. As R applications scale -- more users, more processes, more machines -- you need tools to understand what's happening across your entire system. OpenTelemetry is that tool, and it's now available to the R community. | ||
|
|
||
| ## What is OpenTelemetry? | ||
|
|
||
| [OpenTelemetry](https://opentelemetry.io/) (OTel) is a vendor-neutral, open source observability framework backed by the [Cloud Native Computing Foundation](https://www.cncf.io/). | ||
| It has broad industry support across languages and platforms, and is already the standard in the Python, Java, JavaScript, and Go ecosystems. Now it's available for R. | ||
|
|
||
| OpenTelemetry defines a standard for collecting telemetry data: | ||
|
|
||
| - **Traces** follow an operation as it moves through your system, showing exactly which functions ran, in what order, and how long each took. | ||
| - **Metrics** capture numerical measurements over time -- things like request counts, response latencies, or memory usage. | ||
| - **Logs** record detailed events as they happen, providing context when you need to investigate a specific moment. | ||
|
|
||
| The instrumented packages described in this post all use the [otel](https://otel.r-lib.org) package under the hood, and focus on traces, which provide the most immediate value for understanding production behavior. You can also use otel directly to add your own metrics and logs to your application code. | ||
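As a rough illustration of using otel directly, the sketch below wraps an ordinary function in a span and records a counter and a log message. The function `summarise_orders()`, the span name, and the metric name `orders.processed` are placeholders invented for this example. The calls to `otel::start_local_active_span()`, `otel::counter_add()`, and `otel::log_info()` are used as we understand the otel API, so check the otel reference before relying on the exact arguments.

```r
# Hypothetical application function; its name and the span and metric names
# are placeholders chosen for this sketch.
summarise_orders <- function(orders) {
  # Only touch otel if it is installed, so the function works either way.
  if (requireNamespace("otel", quietly = TRUE)) {
    # Start a span that ends automatically when this function exits.
    otel::start_local_active_span("summarise_orders")
    # Record a counter metric and a log message alongside the trace.
    otel::counter_add("orders.processed", nrow(orders))
    otel::log_info("Summarising orders")
  }
  # The actual work: total amount per region.
  aggregate(amount ~ region, data = orders, FUN = sum)
}

orders <- data.frame(
  region = c("north", "north", "south"),
  amount = c(10, 15, 7)
)
print(summarise_orders(orders))
```

When no exporter is configured, these calls are expected to do nothing, so the function behaves the same with or without telemetry switched on.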
|
|
||
| ## Why observability matters for R in production | ||
|
|
||
| When you're developing interactively in RStudio or Positron, debugging is straightforward -- you can step through code, inspect objects, and add print statements. But when your R code runs in production -- a Shiny app serving hundreds of users, a plumber2 API handling thousands of requests, a batch pipeline running across a cluster -- the picture changes. | ||
|
|
||
| The core concept in OpenTelemetry is a **trace**: the full path of a request through your system. Each trace is made up of **spans** -- individual units of work with a name and a duration. Spans nest inside each other, so you can see not just *what* happened, but *how* each operation led to the next: | ||
|
|
||
|  | ||
|
|
||
| This structure gives you four things that are hard to get any other way: | ||
|
|
||
| - **Performance**: Which part of a request is slow? Span durations pinpoint exactly where time is spent -- and nesting reveals unnecessary overhead. | ||
| - **Errors**: In development and testing, you know where errors are -- you wrote the test, you control the inputs. In production, errors surface far from their root cause, across process boundaries and async operations, triggered by conditions you never anticipated. Traces show you the full chain of real operations that led to each failure, in the context where it actually happened. | ||
| - **Centralized view**: When your application extends across multiple R processes or machines -- a Shiny app with mirai workers, or a plumber2 API behind a load balancer -- traces are aggregated into a single view across all of them. | ||
| - **Real-time monitoring**: OTel is designed to be left on in production, not just enabled during testing or staging. With low overhead and built-in safety guarantees, it runs continuously so you see what's happening right now, not after the fact. Dashboards and alerts built on telemetry data let you catch problems as they emerge, not when users report them. | ||
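To make the span structure concrete, here is a toy model of a single trace in base R. It is not how otel stores spans. The span names and durations are made up, and the "self time" calculation assumes child spans run one after another inside their parent. It only shows how nesting and durations together point at where time goes.

```r
# A toy trace: each row is a span with an id, a parent id, and a duration in
# seconds. All names and numbers are invented for illustration.
spans <- data.frame(
  id       = c(1, 2, 3, 4),
  parent   = c(NA, 1, 2, 3),
  name     = c("session", "reactive_update", "mirai_task", "http_request"),
  duration = c(2.40, 2.10, 1.90, 1.75),
  stringsAsFactors = FALSE
)

# Depth of a span: the number of ancestors between it and the root span.
span_depth <- function(id, spans) {
  depth <- 0
  parent <- spans$parent[spans$id == id]
  while (!is.na(parent)) {
    depth <- depth + 1
    parent <- spans$parent[spans$id == parent]
  }
  depth
}

# Self time: a span's duration minus the time spent in its direct children.
self_time <- function(id, spans) {
  children <- spans$duration[!is.na(spans$parent) & spans$parent == id]
  spans$duration[spans$id == id] - sum(children)
}

spans$depth <- vapply(spans$id, span_depth, numeric(1), spans = spans)
spans$self  <- vapply(spans$id, self_time, numeric(1), spans = spans)

# Print the trace as an indented tree, one line per span.
cat(sprintf("%s%s (%.2fs, self %.2fs)\n",
            strrep("  ", spans$depth), spans$name,
            spans$duration, spans$self), sep = "")

# The span with the largest self time is where the time is actually spent.
slowest <- spans$name[which.max(spans$self)]
```

In this made-up trace almost all of the time sits in the innermost span, which is the kind of conclusion a trace view in a backend lets you read off at a glance.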
|
|
||
| ## Instrumented packages | ||
|
|
||
| We've worked across teams to add OpenTelemetry instrumentation to the R packages where it matters most: | ||
|
|
||
| | Package | Version | What it traces | | ||
| |:--------|:--------|:---------------| | ||
| | [Shiny](https://shiny.posit.co/) | ≥ 1.12.0 | Session lifecycle, reactive updates, reactive expressions, background tasks | | ||
| | [plumber2](https://plumber2.posit.co/) | ≥ 0.2.0 | API request handling, routing, endpoint execution | | ||
| | [mirai](https://mirai.r-lib.org) | ≥ 2.5.0 | Task dispatch, daemon execution, results | | ||
| | [httr2](https://httr2.r-lib.org) | ≥ 1.2.2 | HTTP requests and responses | | ||
| | [ellmer](https://ellmer.tidyverse.org) | ≥ 0.5.0 | LLM API calls, tool execution, token usage | | ||
| | [knitr](https://yihui.org/knitr/) | ≥ 1.51 | Document rendering, chunk evaluation | | ||
| | [testthat](https://testthat.r-lib.org) | ≥ 3.3.2 | Test execution | | ||
| | [DBI](https://dbi.r-lib.org) | ≥ 1.3.0 | Database queries and connections | | ||
|
|
||
| Together, these packages cover the most common production R workloads: web applications, APIs, parallel computing, HTTP clients, AI/LLM tools, report rendering pipelines, CI test runs, and database access. Because the instrumentation is built into the packages themselves, you benefit from it automatically -- no wrapper functions, no manual modifications to existing code, no extra work on your part. If you already have a Shiny app or plumber2 API, your app will generate traces as soon as you enable OpenTelemetry. Your application code stays exactly as it is. | ||
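A quick way to see whether your installed packages are new enough is to compare them against the minimum versions in the table above. The helper below is a small sketch in base R. `otel_status()` is a name invented here, and `shiny` is assumed to be the package name corresponding to Shiny.

```r
# Minimum versions taken from the table above.
minimum_versions <- c(
  shiny = "1.12.0", plumber2 = "0.2.0", mirai = "2.5.0", httr2 = "1.2.2",
  ellmer = "0.5.0", knitr = "1.51", testthat = "3.3.2", DBI = "1.3.0"
)

# Returns "ok", "too old", or "not installed" for a single package.
# system.file() is used so that the package is not loaded just to check it.
otel_status <- function(pkg, minimum) {
  if (!nzchar(system.file(package = pkg))) return("not installed")
  if (utils::packageVersion(pkg) >= minimum) "ok" else "too old"
}

status <- mapply(otel_status, names(minimum_versions), minimum_versions)
print(status)
```

Packages reported as "too old" need updating before they will emit traces. Packages that are not installed can simply be ignored.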
|
|
||
| ## Seeing it in action | ||
|
|
||
| To make this concrete, let's look at a Shiny chat app built with [shinychat](https://github.com/posit-dev/shinychat) and [ellmer](https://ellmer.tidyverse.org) that fetches weather forecasts. It uses mirai for async execution and httr2 for weather API requests. | ||
|
|
||
|  | ||
|
|
||
| A user asks about the weather in Atlanta and Newcastle: | ||
|
|
||
|  | ||
|
|
||
| With OpenTelemetry enabled, every step is captured automatically. Here are the traces from those two queries: | ||
|
|
||
|  | ||
|
|
||
| The trace reveals the full chain of operations from user input through to the HTTP request, across process boundaries, with no manual logging. The nesting shows how each step triggered the next, and the durations show exactly where time was spent. | ||
|
|
||
| The second query failed. Without tracing, you'd see an error in your logs and start investigating. Here, the failure is immediately visible -- the red span pinpoints where it occurred and the surrounding context shows why. In a production system with many concurrent users, that's the difference between minutes and seconds of debugging. | ||
|
|
||
| ## Getting started | ||
|
|
||
| Getting started requires the otelsdk package and a few environment variables; no changes to your application code are needed. | ||
|
|
||
| Here's how the pieces fit together: the instrumented R packages generate telemetry data as they run. The [otelsdk](https://otelsdk.r-lib.org) package collects and exports this data over HTTP to a **backend** -- a service that stores your traces and provides a web dashboard where you can search, filter, and visualize them (like the trace screenshots above). | ||
|
|
||
| ### Step 1: Install otelsdk | ||
|
|
||
| ```r | ||
| install.packages("otelsdk") | ||
| ``` | ||
|
|
||
| ### Step 2: Choose a backend | ||
|
|
||
| OpenTelemetry is vendor-neutral, so you can send your data to any compatible backend: | ||
|
|
||
| - **Cloud services**: [Logfire](https://logfire.pydantic.dev/), [Grafana Cloud](https://grafana.com/products/cloud/), [Langfuse](https://langfuse.com/) | ||
| - **Self-hosted**: [Jaeger](https://www.jaegertracing.io/), [Zipkin](https://zipkin.io/), [Prometheus](https://prometheus.io/) | ||
|
|
||
| You can also use a local [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) instead, so no telemetry data ever leaves your network. You're then free to inspect it locally. | ||
|
|
||
| Each backend will give you an endpoint URL and an authentication token. The [otelsdk collecting telemetry data guide](https://otelsdk.r-lib.org/reference/collecting.html#setup) has examples for some common backends. | ||
|
|
||
| ### Step 3: Set environment variables | ||
|
|
||
| Add these to your `.Renviron` file (use `usethis::edit_r_environ()` to open it), replacing the endpoint and token with the values from your chosen backend. This example uses [Logfire](https://logfire.pydantic.dev/), which offers a free tier to get started: | ||
|
|
||
| ``` | ||
| OTEL_TRACES_EXPORTER="http" | ||
| OTEL_EXPORTER_OTLP_ENDPOINT="https://logfire-eu.pydantic.dev" | ||
| OTEL_EXPORTER_OTLP_HEADERS="Authorization=<YOUR-WRITE-TOKEN>" | ||
| ``` | ||
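For a quick experiment in a single R session, the same variables can be set from R rather than `.Renviron`. This is a sketch that assumes the variables are read when telemetry is first set up, so run it before loading any instrumented package. `MY_OTEL_WRITE_TOKEN` is a hypothetical variable name used here only to keep the token out of the script.

```r
# Hypothetical variable holding the write token, so the token itself is never
# written into a script. Falls back to the placeholder from the example above.
token <- Sys.getenv("MY_OTEL_WRITE_TOKEN", unset = "<YOUR-WRITE-TOKEN>")

# Same three settings as the .Renviron example, applied to this session only.
Sys.setenv(
  OTEL_TRACES_EXPORTER = "http",
  OTEL_EXPORTER_OTLP_ENDPOINT = "https://logfire-eu.pydantic.dev",
  OTEL_EXPORTER_OTLP_HEADERS = paste0("Authorization=", token)
)
```

Settings made this way disappear when the session ends, which is convenient for trying a backend out but means `.Renviron` or your deployment platform remains the place for permanent configuration.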
|
|
||
| If you're deploying content to Posit Connect, refer to [how to set environment variables on Posit Connect](https://docs.posit.co/connect/user/content-settings/#content-vars). | ||
|
|
||
| ### Step 4: Run your app | ||
|
|
||
| That's it. Restart R, then run your Shiny app, plumber2 API, or any code that uses the instrumented packages. Traces will flow to your backend automatically. Open your backend's web dashboard to see them -- you'll see a view like the trace screenshots shown above, with each span representing an operation in your application. | ||
|
|
||
| You can verify that tracing is active at any time: | ||
|
|
||
| ```r | ||
| otel::is_tracing_enabled() | ||
| #> [1] TRUE | ||
| ``` | ||
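If that check returns `FALSE`, a common cause is that the environment variables never reached the R session. The helper below is a small base R sketch for checking that. `otel_env_report()` is a name invented for this example, and it deliberately hides the headers value because it contains your token.

```r
# Report which of the variables from Step 3 are set, without printing the
# contents of the headers variable (it holds the write token).
otel_env_report <- function() {
  vars <- c("OTEL_TRACES_EXPORTER", "OTEL_EXPORTER_OTLP_ENDPOINT",
            "OTEL_EXPORTER_OTLP_HEADERS")
  values <- Sys.getenv(vars, unset = NA)
  shown <- ifelse(is.na(values), "<not set>", values)
  secret <- vars == "OTEL_EXPORTER_OTLP_HEADERS" & !is.na(values)
  shown[secret] <- "<set, hidden>"
  setNames(shown, vars)
}

print(otel_env_report())

# Only call into otel if it is installed.
if (requireNamespace("otel", quietly = TRUE)) {
  print(otel::is_tracing_enabled())
}
```

Any entry shown as `<not set>` points to a problem with `.Renviron` or with the deployment settings rather than with the instrumented packages.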
|
|
||
| Importantly, OpenTelemetry is designed to be safe in production. If anything goes wrong in the telemetry code itself, it will never crash your application -- errors are silently suppressed so your app keeps running. | ||
|
|
||
| ## Zero-code instrumentation | ||
|
|
||
| Beyond the packages that ship with built-in instrumentation, otel supports **zero-code instrumentation** for any R package. Set the `OTEL_R_INSTRUMENT_PKGS` environment variable to a comma-separated list of package names, and otel will automatically create spans for their exported functions: | ||
|
|
||
| ``` | ||
| OTEL_R_INSTRUMENT_PKGS=dplyr,tidyr | ||
| ``` | ||
|
|
||
| You can also fine-tune which functions are instrumented using include and exclude filters: | ||
|
|
||
| ``` | ||
| OTEL_R_INSTRUMENT_PKGS_DPLYR_INCLUDE=mutate,filter,select | ||
| ``` | ||
|
|
||
| This is useful for adding visibility to any package in your stack, even those without built-in OTel support. See the [otel documentation](https://otel.r-lib.org/reference/zci.html) for full details. | ||
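The variable names follow a pattern, so they can be generated rather than typed by hand. The sketch below builds them in R from the single `DPLYR_INCLUDE` example above. `zci_filter_var()` is a helper invented here, the `_EXCLUDE` suffix is assumed by analogy with `_INCLUDE`, and package names containing a dot are not handled. As with the other variables, this assumes the settings are in place before the packages are loaded.

```r
# Build the name of a filter variable for a package, following the pattern of
# the OTEL_R_INSTRUMENT_PKGS_DPLYR_INCLUDE example above.
# The "EXCLUDE" variant is an assumption made by analogy.
zci_filter_var <- function(pkg, type = c("INCLUDE", "EXCLUDE")) {
  type <- match.arg(type)
  paste0("OTEL_R_INSTRUMENT_PKGS_", toupper(pkg), "_", type)
}

pkgs <- c("dplyr", "tidyr")
include <- list(dplyr = c("mutate", "filter", "select"))

# Collect everything into one named character vector of settings.
settings <- c(OTEL_R_INSTRUMENT_PKGS = paste(pkgs, collapse = ","))
for (pkg in names(include)) {
  settings[zci_filter_var(pkg)] <- paste(include[[pkg]], collapse = ",")
}

# Apply the settings to the current session and show them.
do.call(Sys.setenv, as.list(settings))
print(settings)
```

Keeping the package list and filters in one R object makes it easier to review which functions will produce spans before switching instrumentation on.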
|
|
||
| ## Configuration options | ||
|
|
||
| The otelsdk package is configured entirely through environment variables, following OpenTelemetry conventions: | ||
|
|
||
| | Variable | Purpose | | ||
| |:---------|:--------| | ||
| | `OTEL_TRACES_EXPORTER` | Exporter type for traces (e.g. `"http"`) | | ||
| | `OTEL_LOGS_EXPORTER` | Exporter type for logs | | ||
| | `OTEL_METRICS_EXPORTER` | Exporter type for metrics | | ||
| | `OTEL_EXPORTER_OTLP_ENDPOINT` | URL of the OTLP-compatible backend | | ||
| | `OTEL_EXPORTER_OTLP_HEADERS` | Authentication headers for the backend | | ||
| | `OTEL_R_INSTRUMENT_PKGS` | Packages for zero-code instrumentation | | ||
| | `OTEL_R_EMIT_SCOPES` | Restrict telemetry to specific packages | | ||
| | `OTEL_R_SUPPRESS_SCOPES` | Exclude specific packages from telemetry | | ||
|
|
||
| See the [otelsdk environment variables reference](https://otelsdk.r-lib.org/reference/environmentvariables.html) for the complete list. | ||
|
|
||
| ## Looking ahead | ||
|
|
||
| With instrumentation built into the packages that R users already rely on, observability becomes something you can turn on, not something you have to build. | ||
shikokuchuo marked this conversation as resolved.
|
||
|
|
||
| We're also integrating an OpenTelemetry collector into [Posit Connect](https://posit.co/products/enterprise/connect/), giving you an end-to-end observability solution you can simply turn on or off. | ||
|
|
||
| OTel support across the ecosystem continues to expand. If you'd like to learn more: | ||
|
|
||
| - [otel package documentation](https://otel.r-lib.org) -- the instrumentation API | ||
| - [otelsdk package documentation](https://otelsdk.r-lib.org) -- the SDK for collecting and exporting telemetry | ||
| - [Shiny 1.12 OTel blog post](https://shiny.posit.co/blog/posts/shiny-r-1.12/) -- deep dive into Shiny's OpenTelemetry support | ||
| - [OpenTelemetry project](https://opentelemetry.io/) -- the upstream standard | ||
|
|
||
| We're excited about what this opens up for the R community. Whether you're running a Shiny dashboard for a small team, a plumber2 API serving thousands of requests, or a data pipeline distributed across a cluster -- you now have the tools to see exactly what's happening, in real time, with no code changes required. | ||