open-telemetry
diff --git a/.github/ISSUE_TEMPLATE/BUG-REPORT.yml b/.github/ISSUE_TEMPLATE/BUG-REPORT.yml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/benchmark.yml b/.github/workflows/benchmark.yml
Lines changed: 54 additions & 0 deletions
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
Lines changed: 2 additions & 4 deletions
diff --git a/.github/workflows/integration_tests.yml b/.github/workflows/integration_tests.yml
Lines changed: 2 additions & 0 deletions
diff --git a/.github/workflows/pr_criterion.yaml b/.github/workflows/pr_criterion.yaml
Lines changed: 0 additions & 28 deletions
diff --git a/.github/workflows/pr_naming.yml b/.github/workflows/pr_naming.yml
Lines changed: 1 addition & 1 deletion
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 2 additions & 2 deletions
diff --git a/Cargo.toml b/Cargo.toml
Lines changed: 7 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 6 additions & 3 deletions
diff --git a/docs/adr/001_error_handling.md b/docs/adr/001_error_handling.md
Lines changed: 173 additions & 0 deletions
@@ -28,7 +28,7 @@ body:
   - type: textarea
     id: sdk-version
     attributes:
-      label: label: OpenTelemetry SDK Version (i.e version of `opentelemetry_sdk` crate)
+      label: OpenTelemetry SDK Version (i.e version of `opentelemetry_sdk` crate)
       description: What version of the `opentelemetry_sdk` crate are you using?
       placeholder: 0.x, 1.x, etc.
     validations:
 
@@ -0,0 +1,54 @@
+# This workflow runs a Criterion benchmark on a PR and compares the results against the base branch.
+# It is triggered on a PR or a push to main.
+#
+# The workflow is gated on the presence of the "performance" label on the PR.
+#
+# On pushes to main, the workflow runs on a self-hosted runner pool. PRs use a shared runner
+# instead, because the self-hosted runners are only permitted to run on the default branch
+# to preserve resources.
+#
+# In the future, we might like to consider using bencher.dev or the framework used by otel-golang here. 
+on: 
+  pull_request:
+  push:
+    branches:
+      - main
+name: benchmark pull requests
+jobs:
+  runBenchmark:
+    name: run benchmark
+    permissions:
+      pull-requests: write
+
+    # If we're running on a PR, use ubuntu-latest - a shared runner. We can't use the self-hosted
+    # runners on arbitrary PRs, and we don't want to unleash that load on the pool anyway.     
+    # If we're running on main, use the OTEL self-hosted runner pool. 
+    runs-on: ${{ github.event_name == 'pull_request' && 'ubuntu-latest' || 'self-hosted' }}
+    if: ${{ (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'performance')) || github.event_name == 'push' }}
+    env:
+      # For PRs, compare against the base branch - e.g., 'main'. 
+      # For pushes to main, compare against the previous commit
+      BRANCH_NAME: ${{ github.event_name == 'pull_request' && github.base_ref || github.event.before }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 10  # Fetch the current commit and its recent ancestors (last 10 commits)
+      - uses: arduino/setup-protoc@v3
+        with:
+          repo-token: ${{ secrets.GITHUB_TOKEN }}
+      - uses: dtolnay/rust-toolchain@master
+        with:
+          toolchain: stable
+      - uses: boa-dev/criterion-compare-action@v3
+        with:
+          cwd: opentelemetry
+          branchName: ${{ env.BRANCH_NAME }}
+      - uses: boa-dev/criterion-compare-action@v3
+        with:
+          cwd: opentelemetry-appender-tracing
+          features: spec_unstable_logs_enabled
+          branchName: ${{ env.BRANCH_NAME }}
+      - uses: boa-dev/criterion-compare-action@v3
+        with:
+          cwd: opentelemetry-sdk
+          features: rt-tokio,testing,metrics,logs,spec_unstable_metrics_views
+          branchName: ${{ env.BRANCH_NAME }}
@@ -62,10 +62,8 @@ jobs:
     - uses: arduino/setup-protoc@v3
       with:
           repo-token: ${{ secrets.GITHUB_TOKEN }}
-    - uses: actions-rs/cargo@v1
-      with:
-        command: fmt
-        args: --all -- --check
+    - name: Format
+      run: cargo fmt --all -- --check
     - name: Lint
       run: bash ./scripts/lint.sh
   external-types:
 
@@ -23,5 +23,7 @@ jobs:
         with:
           components: rustfmt
       - uses: arduino/setup-protoc@v3
+        with:
+          repo-token: ${{ secrets.GITHUB_TOKEN }}
       - name: Run integration tests
         run: ./scripts/integration_tests.sh
@@ -9,7 +9,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: PR Conventional Commit Validation
-        uses:  ytanikin/[email protected].0
+        uses:  ytanikin/[email protected].1
         with:
           task_types: '["build","chore","ci","docs","feat","fix","perf","refactor","revert","test"]'
           add_label: 'false'
@@ -8,7 +8,7 @@ for specific dates and for Zoom meeting links. "OTel Rust SIG" is the name of
 meeting for this group.
 
 Meeting notes are available as a public [Google
-doc](https://docs.google.com/document/d/1tGKuCsSnyT2McDncVJrMgg74_z8V06riWZa0Sr79I_4/edit).
+doc](https://docs.google.com/document/d/12upOzNk8c3SFTjsL6IRohCWMgzLKoknSCOOdMakbWo4/edit).
 If you have trouble accessing the doc, please get in touch on
 [Slack](https://cloud-native.slack.com/archives/C03GDP0H023).
 
@@ -172,7 +172,7 @@ It's important to regularly review and remove the `otel_unstable` flag from the
 
 The potential features include:
 
-- Stable and non-experimental features that compliant to specification, and have a feature flag to minimize compilation size. Example: feature flags for signals (like `logs`, `traces`, `metrics`) and runtimes (`rt-tokio`, `rt-tokio-current-thread`, `rt-async-std`).
+- Stable and non-experimental features that are compliant with the specification and have a feature flag to minimize compilation size. Example: feature flags for signals (like `logs`, `traces`, `metrics`) and runtimes (`rt-tokio`, `rt-tokio-current-thread`).
 - Stable and non-experimental features, although not part of the specification, are crucial for enhancing the tracing/log crate's functionality or boosting performance. These features are also subject to discussion and approval by the OpenTelemetry Rust Maintainers.
 
 All such features should adhere to naming convention  `<signal>_<feature_name>`
 
@@ -85,5 +85,12 @@ opentelemetry = { path = "opentelemetry" }
 opentelemetry_sdk = { path = "opentelemetry-sdk" }
 opentelemetry-stdout = { path = "opentelemetry-stdout" }
 
+[workspace.lints.rust]
+rust_2024_compatibility = { level = "warn", priority = -1 }
+# No need to enable these, because they are either not needed or result in ugly syntax
+edition_2024_expr_fragment_specifier = "allow"
+if_let_rescope = "allow"
+tail_expr_drop_order = "allow"
+
 [workspace.lints.clippy]
 all = { level = "warn", priority = 1 }
@@ -30,10 +30,13 @@ documentation.
 
 | Signal/Component      | Overall Status     |
 | --------------------  | ------------------ |
+| Context               | Beta               |
+| Baggage               | RC                 |
+| Propagators           | Beta               |
 | Logs-API              | Stable*            |
-| Logs-SDK              | RC                 |
+| Logs-SDK              | Stable             |
 | Logs-OTLP Exporter    | RC                 |
-| Logs-Appender-Tracing | RC                 |
+| Logs-Appender-Tracing | Stable             |
 | Metrics-API           | Stable             |
 | Metrics-SDK           | RC                 |
 | Metrics-OTLP Exporter | RC                 |
@@ -178,7 +181,6 @@ you're more than welcome to participate!
 
 * [Cijo Thomas](https://github.com/cijothomas)
 * [Harold Dost](https://github.com/hdost)
-* [Julian Tescher](https://github.com/jtescher)
 * [Lalit Kumar Bhasin](https://github.com/lalitb)
 * [Utkarsh Umesan Pillai](https://github.com/utpilla)
 * [Zhongyang Wu](https://github.com/TommyCpp)
@@ -192,6 +194,7 @@ you're more than welcome to participate!
 
 * [Dirkjan Ochtman](https://github.com/djc)
 * [Jan Kühle](https://github.com/frigus02)
+* [Julian Tescher](https://github.com/jtescher)
 * [Isobel Redelmeier](https://github.com/iredelmeier)
 * [Mike Goldsmith](https://github.com/MikeGoldsmith)
 
 
@@ -0,0 +1,173 @@
+# Error handling patterns in public API interfaces
+## Date
+27 Feb 2025 
+
+## Summary
+
+This ADR describes the general pattern we will follow when modelling errors in public API interfaces - that is, APIs that are exposed to users of the project's published crates. It summarises the discussion and final option from [#2571](https://github.com/open-telemetry/opentelemetry-rust/issues/2571); for more context check out that issue. 
+
+We will focus on the exporter traits in this example, but the outcome should be applied to _all_ public traits and their fallible operations. 
+
+These include [SpanExporter](https://github.com/open-telemetry/opentelemetry-rust/blob/eca1ce87084c39667061281e662d5edb9a002882/opentelemetry-sdk/src/trace/export.rs#L18), [LogExporter](https://github.com/open-telemetry/opentelemetry-rust/blob/eca1ce87084c39667061281e662d5edb9a002882/opentelemetry-sdk/src/logs/export.rs#L115), and [PushMetricExporter](https://github.com/open-telemetry/opentelemetry-rust/blob/eca1ce87084c39667061281e662d5edb9a002882/opentelemetry-sdk/src/metrics/exporter.rs#L11) which form part of the API surface of `opentelemetry-sdk`.
+
+There are various ways to handle errors on trait methods, including swallowing and logging them, panicking, returning a shared global error, or returning a method-specific error. We strive for consistency, and we want to put enough thought into this now that we don't have to make breaking interface changes unnecessarily in the future.
+
+## Design Guidance
+
+### 1. No panics from SDK APIs
+Failures during regular operation should not panic; they should instead return errors to the caller where appropriate, _or_ log an error where returning one is not appropriate.
+Some of the OpenTelemetry SDK interfaces are dictated by the specification in such a way that they may not return errors.
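
As a minimal sketch of this guidance (not code from the SDK), the example below contrasts a fallible helper that returns its error with a caller whose signature cannot return one. The names `parse_batch_size` and `batch_size_or_default` are hypothetical, and the `Vec<String>` is only a stand-in for whatever internal logging the SDK uses.

```rust
use std::num::ParseIntError;

// Hypothetical helper: the failure is reported through the return type
// instead of calling unwrap() or panic!().
fn parse_batch_size(raw: &str) -> Result<usize, ParseIntError> {
    raw.trim().parse::<usize>()
}

// Hypothetical caller whose signature has no way to return an error.
// The failure is recorded in `log` (a stand-in for internal logging)
// and a default value is used so that regular operation continues.
fn batch_size_or_default(raw: &str, default: usize, log: &mut Vec<String>) -> usize {
    match parse_batch_size(raw) {
        Ok(size) => size,
        Err(err) => {
            log.push(format!("invalid batch size {raw:?}: {err}"));
            default
        }
    }
}
```

In both cases the process keeps running: either the caller decides what to do with the error, or the problem is logged and a fallback applies.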
+
+### 2. Consolidate error types within a trait where we can, let them diverge when we can't
+
+We aim to consolidate error types where possible _without indicating a function may return more errors than it can actually return_. 
+
+**Don't do this** - each function's signature indicates that it returns errors it will _never_ return, forcing the caller to write handlers for dead paths:
+```rust
+enum MegaError {
+  TooBig,
+  TooSmall,
+  TooLong,
+  TooShort
+}
+
+trait MyTrait {
+
+  // Will only ever return TooBig or TooSmall errors
+  fn action_one() -> Result<(), MegaError>;
+
+  // These will only ever return TooLong or TooShort errors
+  fn action_two() -> Result<(), MegaError>;
+  fn action_three() -> Result<(), MegaError>;
+}
+```
+
+**Instead, do this** - each function's signature indicates only the errors it can return, providing an accurate contract to the caller:
+
+```rust
+enum ErrorOne {
+  TooBig,
+  TooSmall,
+}
+
+enum ErrorTwo {
+  TooLong,
+  TooShort
+}
+
+trait MyTrait {
+  fn action_one() -> Result<(), ErrorOne>;
+
+  // Action two and three share the same error type. 
+  // We do not introduce a common error MyTraitError for all operations, as this would
+  // force all methods on the trait to indicate they return errors they do not return,
+  // complicating things for the caller.  
+  fn action_two() -> Result<(), ErrorTwo>;
+  fn action_three() -> Result<(), ErrorTwo>;
+}
+```
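
To illustrate what this buys the caller, here is a small sketch that uses free functions instead of trait methods to keep it short. The size and length limits are invented for the example and `describe_action_one` is a hypothetical caller; only the `ErrorOne`/`ErrorTwo` split comes from the ADR text.

```rust
#[derive(Debug, PartialEq)]
enum ErrorOne {
    TooBig,
    TooSmall,
}

#[derive(Debug, PartialEq)]
enum ErrorTwo {
    TooLong,
    TooShort,
}

// Illustrative limits, not taken from the project.
const MAX_SIZE: usize = 10;
const MAX_LEN: usize = 4;

fn action_one(size: usize) -> Result<(), ErrorOne> {
    if size > MAX_SIZE {
        Err(ErrorOne::TooBig)
    } else if size == 0 {
        Err(ErrorOne::TooSmall)
    } else {
        Ok(())
    }
}

fn action_two(len: usize) -> Result<(), ErrorTwo> {
    if len > MAX_LEN {
        Err(ErrorTwo::TooLong)
    } else if len == 0 {
        Err(ErrorTwo::TooShort)
    } else {
        Ok(())
    }
}

// The match is exhaustive over ErrorOne alone. No arm is needed for
// TooLong or TooShort, because the signature of action_one rules them out.
fn describe_action_one(size: usize) -> &'static str {
    match action_one(size) {
        Ok(()) => "ok",
        Err(ErrorOne::TooBig) => "shrink the input",
        Err(ErrorOne::TooSmall) => "grow the input",
    }
}
```

With a single `MegaError`, the same match would need two extra arms (or a wildcard) for variants that `action_one` can never produce.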
+
+### 3. Consolidate error types between signals where we can, let them diverge where we can't
+
+Consider the `Exporter`s mentioned earlier. Each of them has the same failure indicators - as dictated by the OpenTelemetry spec - and we will
+share the error types accordingly:
+
+**Don't do this** - each signal has its own error type, despite having exactly the same failure cases: 
+
+```rust
+use thiserror::Error;
+
+#[derive(Error, Debug)]
+pub enum OtelTraceError {
+    #[error("Shutdown already invoked")]
+    AlreadyShutdown,
+
+    #[error("Operation failed: {0}")]
+    InternalFailure(String),
+
+    // ... additional errors ...
+}
+
+#[derive(Error, Debug)]
+pub enum OtelLogError {
+    #[error("Shutdown already invoked")]
+    AlreadyShutdown,
+
+    #[error("Operation failed: {0}")]
+    InternalFailure(String),
+
+    // ... additional errors ...
+}
+```
+
+**Instead, do this** - error types are consolidated between signals where this can be done appropriately:
+
+```rust
+use thiserror::Error;
+
+/// opentelemetry-sdk::error
+
+#[derive(Error, Debug)]
+pub enum OTelSdkError {
+    #[error("Shutdown already invoked")]
+    AlreadyShutdown,
+
+    #[error("Operation failed: {0}")]
+    InternalFailure(String),
+
+    // ... additional errors ...
+}
+
+pub type OTelSdkResult = Result<(), OTelSdkError>;
+
+/// signal-specific exporter traits all share the same
+/// result types for the exporter operations.
+
+// pub trait LogExporter {
+// pub trait SpanExporter {
+pub trait PushMetricExporter {
+    fn export(&self, /* ... */) -> OTelSdkResult;
+    fn force_flush(&self, /* ... */ ) -> OTelSdkResult;
+    fn shutdown(&self, /* ... */ ) -> OTelSdkResult;
+}
+```
+
+If this were _not_ the case - if, for instance, we needed to add an extra error for `LogExporter` that the caller could reasonably handle -
+we would let the error types diverge at that point.
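
A rough, std-only sketch of the consolidated form follows. The traits here are simplified stand-ins with a single `shutdown` method, not the real exporter traits, and the no-op exporters, `shutdown_once`, and `describe` are hypothetical. The `thiserror` derive is left out so the snippet compiles without dependencies.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Debug, PartialEq)]
pub enum OTelSdkError {
    AlreadyShutdown,
    InternalFailure(String),
}

pub type OTelSdkResult = Result<(), OTelSdkError>;

// Simplified stand-ins for the signal-specific exporter traits.
pub trait LogExporter {
    fn shutdown(&self) -> OTelSdkResult;
}

pub trait SpanExporter {
    fn shutdown(&self) -> OTelSdkResult;
}

// Shared shutdown logic. swap() returns the previous value, so `true`
// means that shutdown has already been invoked.
fn shutdown_once(flag: &AtomicBool) -> OTelSdkResult {
    if flag.swap(true, Ordering::SeqCst) {
        Err(OTelSdkError::AlreadyShutdown)
    } else {
        Ok(())
    }
}

#[derive(Default)]
pub struct NoopLogExporter {
    is_shutdown: AtomicBool,
}

#[derive(Default)]
pub struct NoopSpanExporter {
    is_shutdown: AtomicBool,
}

impl LogExporter for NoopLogExporter {
    fn shutdown(&self) -> OTelSdkResult {
        shutdown_once(&self.is_shutdown)
    }
}

impl SpanExporter for NoopSpanExporter {
    fn shutdown(&self) -> OTelSdkResult {
        shutdown_once(&self.is_shutdown)
    }
}

// One handler covers both signals because the result type is shared.
pub fn describe(result: &OTelSdkResult) -> String {
    match result {
        Ok(()) => "ok".to_string(),
        Err(OTelSdkError::AlreadyShutdown) => "already shut down".to_string(),
        Err(OTelSdkError::InternalFailure(reason)) => format!("failed: {reason}"),
    }
}
```

Because both traits return `OTelSdkResult`, a caller that manages several signals needs only one error-handling path such as `describe`.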
+
+### 4. Box custom errors where a savvy caller may be able to handle them, stringify them if not
+
+Note above that we do not box any `Error` into `InternalFailure`. Our rule here is that if the caller cannot reasonably be expected to handle a particular error variant, we will use a simplified interface that returns only a descriptive string. In the concrete example we are using with the exporters, we have a [strong signal in the opentelemetry-specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/sdk.md#export) that indicates that the error types _are not actionable_ by the caller. 
+
+If the caller may potentially recover from an error, we will follow the generally accepted best practice (e.g., see [Canonical's guide](https://canonical.github.io/rust-best-practices/error-and-panic-discipline.html)) and instead preserve the nested error:
+
+**Don't do this if the OtherError is potentially recoverable by a savvy caller**:
+```rust
+use thiserror::Error;
+
+#[derive(Debug, Error)]
+pub enum MyError {
+    #[error("Error one occurred")]
+    ErrorOne,
+
+    #[error("Operation failed: {0}")]
+    OtherError(String),
+}
+```
+
+**Instead, do this**, allowing the caller to match on the nested error:
+
+```rust
+use std::error::Error;
+use thiserror::Error;
+
+#[derive(Debug, Error)]
+pub enum MyError {
+    #[error("Error one occurred")]
+    ErrorOne, 
+
+    #[error("Operation failed: {source}")]
+    OtherError {
+        #[from]
+        source: Box<dyn Error + Send + Sync>,
+    },
+}
+```
+
+Note that at the time of writing, there is no instance we have identified within the project that has required this. 
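
For illustration only, the sketch below shows how a caller might use a preserved nested error. It hand-rolls `Display` and `Error` rather than deriving them with `thiserror`, so that it builds with the standard library alone. The `From<io::Error>` conversion and the `is_retryable` policy are assumptions made for the example, not part of the project.

```rust
use std::error::Error;
use std::fmt;
use std::io;

#[derive(Debug)]
pub enum MyError {
    ErrorOne,
    OtherError { source: Box<dyn Error + Send + Sync> },
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::ErrorOne => write!(f, "Error one occurred"),
            MyError::OtherError { source } => write!(f, "Operation failed: {source}"),
        }
    }
}

impl Error for MyError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            MyError::ErrorOne => None,
            MyError::OtherError { source } => {
                // Drop the Send + Sync bounds to match the trait signature.
                let inner: &(dyn Error + 'static) = &**source;
                Some(inner)
            }
        }
    }
}

// Hypothetical conversion, standing in for what #[from] would generate
// for the boxed source type.
impl From<io::Error> for MyError {
    fn from(err: io::Error) -> Self {
        MyError::OtherError { source: Box::new(err) }
    }
}

// A savvy caller can downcast the nested error and decide how to react.
// Treating only timeouts as retryable is an invented policy.
pub fn is_retryable(err: &MyError) -> bool {
    match err {
        MyError::ErrorOne => false,
        MyError::OtherError { source } => source
            .downcast_ref::<io::Error>()
            .map_or(false, |io_err| io_err.kind() == io::ErrorKind::TimedOut),
    }
}
```

Had `OtherError` carried only a `String`, the caller would be left parsing the message text to make the same decision.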
+
+### 5. Use thiserror by default
+We will use [thiserror](https://docs.rs/thiserror/latest/thiserror/) by default to implement Rust's [error trait](https://doc.rust-lang.org/core/error/trait.Error.html).
+This keeps our code clean, and as it does not appear in our interface, we can choose to replace any particular usage with a hand-rolled implementation should we need to.
+