
Add benchmark overview doc #1528

Open

cijothomas wants to merge 27 commits into open-telemetry:main from cijothomas:cijothomas/benchoverviewdoc1

Conversation

@cijothomas (Member) commented:

This doc is a first attempt at the schema for our Phase 2 performance summary, to be published once Phase 2 is completed. It defines the key scenarios (Idle, 100k Load, Saturation) and the comparative analysis with OTLP/Collector. I've put TBD for the actual numbers, since the goal here is to finalize an easy-to-consume format; actual numbers will be filled in later. This can also be used to check whether there are gaps in the perf test suites that we want to add.

The existing pages like https://open-telemetry.github.io/otel-arrow/benchmarks/nightly/backpressure/ are still retained; this doc will contain information distilled from them.

@codecov
codecov bot commented Dec 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.58%. Comparing base (f182711) to head (cc02ae9).

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #1528      +/-   ##
==========================================
- Coverage   85.58%   85.58%   -0.01%
==========================================
  Files         510      510
  Lines      160290   160290
==========================================
- Hits       137178   137177       -1
- Misses      22578    22579       +1
  Partials      534      534
```
| Component | Coverage | Δ |
| --- | --- | --- |
| otap-dataflow | 87.34% <ø> | -0.01% ⬇️ |
| query_abstraction | 80.61% <ø> | ø |
| query_engine | 90.23% <ø> | ø |
| syslog_cef_receivers | ∅ <ø> | ∅ |
| otel-arrow-go | 53.50% <ø> | ø |
| quiver | 91.63% <ø> | ø |

@lquerel (Contributor) left a comment:

I really like this document.

I think we should also include the OTLP to OTLP scenario in the different sections since it will be one of the most common scenarios, at least in the beginning.

I also think we should add the wait_for_result mode in the otel-arrow section, because it provides a true end-to-end unified ack/nack mechanism, which I believe is not fully supported by the Go collector.

github-merge-queue bot pushed a commit that referenced this pull request Jan 14, 2026
Fixed one TODO!

#1528 - Still working on this separately, which will include actual numbers for key scenarios, so readers don't have to go through the graphs themselves!
@cijothomas cijothomas marked this pull request as ready for review January 17, 2026 00:54
@cijothomas cijothomas requested a review from a team as a code owner January 17, 2026 00:54
Comment on lines +52 to +53
| Single Core | 0.06% | 27 MB |
| All Cores (128) | 2.5% | 600 MB |
@reyang (Member) commented Jan 20, 2026:

Two things to consider:

  1. What's the memory usage on different CPU architectures (ARM64, AMD64, etc.)?
  2. What's the trend as the number of cores increases? I guess it is C + N × R, where C is a constant, N is the number of cores, and R is the per-core memory?

@cijothomas (Member, Author) replied:

For 2, I added a benchmark test to confirm it:

```
Found 6 idle state result directories
Found: 16 core(s) -> 97.08 MiB
Found: 1 core(s) -> 27.60 MiB
Found: 2 core(s) -> 29.52 MiB
Found: 32 core(s) -> 161.8 MiB
Found: 4 core(s) -> 38.14 MiB
Found: 8 core(s) -> 57.16 MiB
Benchmark JSON written to: /home/opc/actions-runner/_work/otel-arrow/otel-arrow/tools/pipeline_perf_test/results/idle-memory-scaling.json

================================================================================
IDLE STATE MEMORY SCALING ANALYSIS
================================================================================

Goal: Verify linear memory scaling (Memory = C + N × R)

  C (Constant Overhead):   22.15 MiB
  R (Per-Core Overhead):   4.420 MiB/core
  R² (Fit Quality):        0.9980

Formula: Memory (MiB) ≈ 22.1 + 4.42 × N

--------------------------------------------------------------------------------
Cores    Actual (MiB)       Predicted (MiB)    Error (%)    Status
--------------------------------------------------------------------------------
1        27.60              26.57              3.8%         ✅
2        29.52              30.99              5.0%         ✅
4        38.14              39.83              4.4%         ✅
8        57.16              57.51              0.6%         ✅
16       97.08              92.87              4.3%         ✅
32       161.8              163.6              1.1%         ✅
--------------------------------------------------------------------------------

SUMMARY:
  • Each additional core adds ~4.420 MiB of memory overhead
  • Base memory footprint (shared infrastructure): ~22.15 MiB
  • Memory range: 27.60 MiB (1 core) → 161.8 MiB (32 cores)

✅ EXCELLENT: Near-perfect linear fit (R² ≥ 0.99).
   Memory scaling follows the share-nothing model precisely.
```
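For anyone who wants to sanity-check the fit, here is a minimal sketch (assuming Python with NumPy available; the (cores, MiB) pairs are copied from the run above) that recovers C and R by ordinary least squares:

```python
# Minimal sketch: least-squares fit of Memory = C + N * R to the idle-state
# measurements reported above. Assumes NumPy; the (cores, MiB) pairs are
# copied verbatim from the benchmark output.
import numpy as np

cores = np.array([1, 2, 4, 8, 16, 32], dtype=float)
mem_mib = np.array([27.60, 29.52, 38.14, 57.16, 97.08, 161.8])

# Degree-1 polyfit returns coefficients highest degree first: [slope, intercept] = [R, C].
R, C = np.polyfit(cores, mem_mib, 1)

# R² computed from the residuals of the fitted line.
predicted = C + R * cores
ss_res = float(np.sum((mem_mib - predicted) ** 2))
ss_tot = float(np.sum((mem_mib - mem_mib.mean()) ** 2))
r_squared = 1 - ss_res / ss_tot

print(f"C (constant overhead): {C:.2f} MiB")
print(f"R (per-core overhead): {R:.3f} MiB/core")
print(f"R^2 (fit quality):     {r_squared:.4f}")
# Should land near C ≈ 22.15, R ≈ 4.420, R² ≈ 0.998, matching the report.
```

This is illustrative only; the actual analysis is produced by the benchmark tooling under tools/pipeline_perf_test.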

@github-actions
github-actions bot commented Feb 4, 2026

This pull request has been marked as stale due to lack of recent activity. It will be closed in 30 days if no further activity occurs. If this PR is still relevant, please comment or push new commits to keep it active.

@github-actions github-actions bot added the stale Not actively pursued label Feb 4, 2026
@jmacd jmacd removed the stale Not actively pursued label Feb 4, 2026