
Adds SMP tests for idle & ascii #1826

Merged
preinlein merged 1 commit into main from paul.reinlein/introduce-smp-tests
Mar 19, 2026
Conversation

@preinlein
Contributor

@preinlein preinlein commented Mar 13, 2026

What does this PR do?

Introduces SMP (Single Machine Performance) regression detector experiments for lading. Adds an idle baseline test and four HTTP ASCII throughput tests (1, 10, 100, 1000 MiB/s) with memory and CPU quality gates.

Motivation

Establishes continuous performance regression detection for lading using the SMP regression detector. The idle test provides a resource floor, while the ASCII tests cover the HTTP generator + blackhole path across a range of throughput levels. Each experiment enforces bounds on total_pss_bytes and avg(total_cpu_usage_millicores).

Related issues

N/A

Additional Notes

  • Each ASCII experiment is paired with the local benchmark in lading_payload/benches/ascii.rs; a cross-reference comment was added to the benchmark file.
  • All experiments run lading in --no-target --experiment-duration-infinite mode (shutdown performance is out of scope).
  • experiments/regression/config.yaml pins lading version to 0.31.2 with profiling replicas disabled.

Contributor Author

preinlein commented Mar 13, 2026

@preinlein preinlein changed the title Adds SMP tests for idle & ascii [DO NOT REVIEW] Adds SMP tests for idle & ascii Mar 13, 2026
@preinlein
Contributor Author

@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from 3088726 to 9696ba7 on March 16, 2026 16:59
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from 808f116 to 3d782ef on March 16, 2026 16:59
version: 0.31.2

target:
  ddprof_replicas: 0
Contributor Author

Note: I was running into some strange profiling errors, so I turned this off temporarily.

Easy enough to re-add and re-test.

Contributor

I'd be interested in seeing those errors if you have a link handy.

Contributor Author

Ugh, too old. It's one of the many, many runs I did last week, and I've totally lost track of it.

It's going to be easy enough to reproduce though.

@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from 9696ba7 to 530bc15 on March 16, 2026 17:10
@preinlein preinlein changed the title [DO NOT REVIEW] Adds SMP tests for idle & ascii Adds SMP tests for idle & ascii Mar 16, 2026
@preinlein
Contributor Author

Container build errors are fixed in #1827

@preinlein preinlein marked this pull request as ready for review March 16, 2026 17:23
@preinlein preinlein requested a review from a team as a code owner March 16, 2026 17:23
@preinlein
Contributor Author

Latest run with 100 replicas:

Regression Detector Results

Run ID: 5b1183a5-5725-4e58-b3fc-1826d36be97c
Baseline: sha-808f116dd45a4fb0f328a1561df0880ffcb9522f
Comparison: sha-808f116dd45a4fb0f328a1561df0880ffcb9522f

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment

experiment          goal                 Δ mean %   Δ mean % CI      trials
idle                memory utilization   +0.08      [+0.06, +0.11]   1
http_ascii_1000mib  memory utilization   -0.00      [-0.06, +0.05]   1
http_ascii_100mib   memory utilization   -0.01      [-0.05, +0.04]   1
http_ascii_1mib     memory utilization   -0.05      [-0.11, +0.01]   1
http_ascii_10mib    memory utilization   -0.06      [-0.10, -0.02]   1

Bounds Checks: ✅ Passed

experiment          bounds_check_name  replicates_passed  observed_value
http_ascii_1000mib  cpu_usage          100/100            317.35 ≤ 400
http_ascii_1000mib  memory_usage       100/100            1.02GiB ≤ 1.10GiB
http_ascii_100mib   cpu_usage          100/100            88.73 ≤ 155
http_ascii_100mib   memory_usage       100/100            522.85MiB ≤ 575MiB
http_ascii_10mib    cpu_usage          100/100            9.58 ≤ 14
http_ascii_10mib    memory_usage       100/100            120.94MiB ≤ 132MiB
http_ascii_1mib     cpu_usage          100/100            1.33 ≤ 2.40
http_ascii_1mib     memory_usage       100/100            30.48MiB ≤ 34MiB
idle                cpu_usage          100/100            0.35 ≤ 0.60
idle                memory_usage       100/100            12.46MiB ≤ 14MiB


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 530bc15bfa

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

parallel_connections: 1
method:
  post:
    maximum_prebuild_cache_size_bytes: "5 GiB"


P1: Keep prebuild cache size within u32 limits

Setting maximum_prebuild_cache_size_bytes to "5 GiB" here silently overflows when the HTTP generator builds its cache: Http::new converts this byte value with as u32 (lading/src/generator/http.rs), so values above u32::MAX wrap instead of erroring. In this case, the intended 5 GiB cache becomes 1 GiB at runtime, which materially changes memory behavior and makes the 1000 MiB/s regression gate measure a different workload than configured.
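The wrap is easy to reproduce in isolation. A minimal sketch of the `as u32` truncation semantics described above (this is not lading's actual conversion code):

```rust
fn main() {
    // 5 GiB in bytes: 5 * 2^30 = 5_368_709_120, which exceeds u32::MAX (4_294_967_295).
    let configured: u64 = 5 * 1024 * 1024 * 1024;

    // An `as` cast to u32 keeps only the low 32 bits instead of failing,
    // so the configured 5 GiB wraps to exactly 1 GiB.
    let effective = configured as u32;
    assert_eq!(u64::from(effective), 1024 * 1024 * 1024);

    // A checked conversion surfaces the overflow instead of wrapping.
    assert!(u32::try_from(configured).is_err());

    println!("configured = {configured}, effective = {effective}");
}
```

Using `u32::try_from` (or keeping the cache size as `u64`) would turn this silent misconfiguration into a startup error.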


@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from 530bc15 to 20ad18b on March 18, 2026 11:26
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from 3d782ef to cc851e7 on March 18, 2026 11:26

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 20ad18b2ea


headers: {}
target_uri: "http://127.0.0.1:8080/"
bytes_per_second: "1000 MiB"
parallel_connections: 1


P2: Guard this case against silently under-driving the 1 GiB/s load

With the default maximum_block_size of 1 MiB (lading_payload::block::default_maximum_block_size), this config has to complete about 1000 HTTP request/response cycles per second to really deliver 1000 MiB/s. Http::spin only allows one in-flight request per configured connection, so parallel_connections: 1 makes the highest-throughput case depend on a single loopback connection keeping up. When that does not hold on a given runner, lading simply falls behind the target rate and experiment.yaml still passes because it only bounds CPU/PSS, not achieved bytes, so the "1000 MiB/s" regression gate ends up measuring an unknown lower load.
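The rate arithmetic behind this observation can be sketched directly; the assumptions (1 MiB default block size, one in-flight request per connection) come from the comment above:

```rust
fn main() {
    const MIB: u64 = 1024 * 1024;

    // bytes_per_second: "1000 MiB" with the default 1 MiB maximum_block_size.
    let target_bytes_per_second = 1000 * MIB;
    let max_block_bytes = MIB;

    // Each request carries at most one block, so a single connection with one
    // in-flight request must complete this many request/response cycles per second.
    let cycles_per_second = target_bytes_per_second / max_block_bytes;
    assert_eq!(cycles_per_second, 1000);

    // That leaves a budget of at most 1 ms per loopback round trip; any slower
    // and the generator falls behind the advertised rate without failing.
    let budget_micros = 1_000_000 / cycles_per_second;
    assert_eq!(budget_micros, 1000);
}
```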


Contributor Author

IMO, this feels like a bug that isn't being caught at runtime.

I'd prefer for lading to fail in some way rather than silently "fail".

Comment on lines +7 to +9
target_metrics:
  - prometheus:
      uri: "http://127.0.0.1:9000/metrics"


P2: Remove Prometheus self-scraping from the idle baseline

The idle gate in cases/idle/experiment.yaml only checks total_pss_bytes and avg(total_cpu_usage_millicores), both of which come from the observer, but this extra target_metrics entry makes the controller issue a Prometheus scrape against the target every sample period (target_metrics::prometheus::Prometheus::run). In the idle case that periodic /metrics traffic becomes one of the main CPU sources, so the baseline is no longer measuring a quiescent lading process; it is measuring the exporter-plus-scraper path as well.


Contributor Author

Personally, I want this as is, but maybe it's worthwhile to update the README to reflect this? I don't feel too strongly; I'll wait and see what the reviewer says.

@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from cc851e7 to a479460 on March 18, 2026 14:01
@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from 20ad18b to b8a9099 on March 18, 2026 14:01

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8a9099af9


Comment on lines +6 to +10
bytes_per_second: "100 MiB"
parallel_connections: 1
method:
  post:
    maximum_prebuild_cache_size_bytes: "500 MiB"


P2: Match maximum_block_size to the paired ASCII benchmark

These HTTP ASCII cases are described as being paired with lading_payload/benches/ascii.rs, but leaving maximum_block_size unset makes the HTTP generator fall back to 1 MiB blocks (lading/src/generator/http.rs, lading_payload::block::default_maximum_block_size). The criterion bench's 10/100/1000 MiB entries serialize single buffers of those exact sizes, whereas this config only exercises repeated <=1 MiB serializations to reach 100 MiB/s. That means regressions in large-buffer ASCII generation will not be caught by the SMP gate even though the README says the two should stay in sync.
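A minimal sketch of the suggested alignment for this case. The `maximum_block_size` field name comes from the comment above; its exact placement in the generator config is an assumption:

```yaml
bytes_per_second: "100 MiB"
# Assumed placement: override the 1 MiB default so each block matches the
# 100 MiB buffer serialized by the paired criterion benchmark.
maximum_block_size: "100 MiB"
parallel_connections: 1
method:
  post:
    maximum_prebuild_cache_size_bytes: "500 MiB"
```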


Comment on lines +7 to +9
target_metrics:
  - prometheus:
      uri: "http://127.0.0.1:9000/metrics"


P2: Remove self-scraping from the 1 MiB/s regression case

At this load the target is only generating about one 1 MiB block per second because HTTP defaults maximum_block_size to 1 MiB (lading_payload/src/block.rs), but target_metrics.prometheus makes the observer issue another HTTP request against /metrics every sample period (1 Hz by default in lading/src/config.rs, via target_metrics::prometheus::Prometheus::run). In practice that means the exporter/scraper path becomes comparable to the actual workload, so this gate no longer isolates the 1 MiB/s generator+blackhole path and can miss or misattribute low-throughput regressions.



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8a9099af9


Comment on lines +8 to +10
method:
  post:
    maximum_prebuild_cache_size_bytes: "500 MiB"


P2: Set maximum_block_size to the benchmarked payload size

For the 10/100/1000 MiB/s cases, ascii_throughput in lading_payload/benches/ascii.rs benchmarks serializing one buffer of that exact size, but these configs never set maximum_block_size, so lading::generator::http::Config falls back to lading_payload::block::default_maximum_block_size() = 1 MiB. In practice these SMP cases send a stream of 1 MiB requests instead of exercising 10/100/1000 MiB ASCII serialization, so regressions that only show up on large contiguous buffers will be missed even though the README/comment says the experiments are paired with that benchmark.


Comment on lines +23 to +27
- name: cpu_usage
  description: "CPU usage quality gate. Bounds total average millicore usage."
  bounds:
    series: avg(total_cpu_usage_millicores)
    upper_bound: 2.4


P2: Add a throughput check to the smaller ASCII SMP cases

This is the last check in the file, so the 1/10/100 MiB/s experiments introduced here only bound PSS and CPU. Because the controller is already scraping the target's Prometheus telemetry via target_metrics, a generator regression that under-drives the advertised rate would make these cases cheaper and still pass green, meaning the regression detector can stop exercising the intended load without any signal. (The 1000 MiB/s case already has a separate comment; the same missing lower-bound exists for the smaller throughput cases too.)
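A lower-bound gate in the same format as the existing checks could close this gap. Sketch only: the `series` name here is hypothetical and would need to be replaced with whatever throughput metric the target actually exposes via target_metrics:

```yaml
- name: throughput_floor
  description: "Guards against silently under-driving the advertised rate."
  bounds:
    # Hypothetical series name; substitute the real generator throughput
    # metric scraped from the target's Prometheus endpoint.
    series: avg(total_bytes_written)
    lower_bound: "0.9 MiB"
```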


@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from a479460 to db73f9e on March 18, 2026 14:40
@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from b8a9099 to 89f0f93 on March 18, 2026 14:40

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89f0f939a5


Comment on lines +21 to +22
series: total_pss_bytes
upper_bound: "14 MiB"


P1: Stop binding these SMP cases to observer-only metrics

Checked the new cases/idle and cases/http_ascii_* configs: the only metric source wired into case/lading/lading.yaml is target_metrics.prometheus, but the target command launches lading with --no-target. In inner_main, --no-target makes config.target = None, so the observer is skipped entirely (lading/src/bin/lading.rs:603-615), and the total_pss_bytes / total_cpu_usage_millicores series used by these new quality gates are only emitted by the observer (lading/src/observer/linux/procfs.rs:161-164, lading/src/observer/linux/cgroup/v2/cpu.rs:115-123). That means these experiments never export the memory/CPU series they are checking here, so the new gates either fail on missing data or silently stop enforcing the intended limits.


# This is acceptable as we are less concerned about the shutdown performance
command: "/usr/bin/lading --no-target --experiment-duration-infinite --config-path /etc/lading-target/lading.yaml"
cpu_allotment: 2
memory_allotment: 1250 MiB
Contributor

I'm surprised this doesn't OOM given the 5GiB prebuild cache - am I missing something?

Contributor Author

It's the issue that Codex spotted above.

lading is limiting the cache size to 1 GiB.

IMO, this feels like a bug in lading that I'd like to address.

@preinlein preinlein changed the base branch from paul.reinlein/fix-get-available-memory to graphite-base/1826 March 19, 2026 12:23
@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from 89f0f93 to 2f5b785 on March 19, 2026 12:26
@preinlein preinlein force-pushed the graphite-base/1826 branch from db73f9e to 2619756 on March 19, 2026 12:26
@preinlein preinlein changed the base branch from graphite-base/1826 to paul.reinlein/fix-get-available-memory March 19, 2026 12:26
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from 2619756 to b573bc9 on March 19, 2026 12:26
@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from 2f5b785 to be1c3d4 on March 19, 2026 12:26
@chatgpt-codex-connector

💡 Codex Review

method:
  post:
    maximum_prebuild_cache_size_bytes: "100 MiB"
    variant: "ascii"

P2: Align the 10 MiB SMP case with the paired ASCII benchmark

This case is described as paired with lading_payload/benches/ascii.rs, but the HTTP generator falls back to maximum_block_size = 1 MiB when that field is omitted (lading/src/generator/http.rs:72-74, lading_payload/src/block.rs:148-149). The benchmark's 10 MiB entry serializes one 10 MiB buffer (lading_payload/benches/ascii.rs:24-35), whereas this config will reach 10 MiB/s by sending ten <=1 MiB blocks, so regressions that only appear on contiguous 10 MiB ASCII generation are never exercised by the new SMP gate.


@preinlein preinlein changed the base branch from paul.reinlein/fix-get-available-memory to graphite-base/1826 March 19, 2026 12:48
Contributor Author

preinlein commented Mar 19, 2026

Merge activity

  • Mar 19, 1:25 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Mar 19, 2:05 PM UTC: Graphite rebased this pull request as part of a merge.
  • Mar 19, 2:33 PM UTC: Graphite couldn't merge this PR because it failed for an unknown reason (GitHub is reporting that this PR is not mergeable, despite passing required status checks defined by your branch protection rules. Please check your rulesets for additional blocking criteria. Graphite Merge Queue does not currently support rulesets. Please contact Graphite support for further assistance.).
  • Mar 19, 2:52 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Mar 19, 2:52 PM UTC: @preinlein merged this pull request with Graphite.

@preinlein preinlein force-pushed the graphite-base/1826 branch from b573bc9 to f24286f on March 19, 2026 13:55
@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from be1c3d4 to 3656530 on March 19, 2026 13:55
@preinlein preinlein changed the base branch from graphite-base/1826 to paul.reinlein/fix-get-available-memory March 19, 2026 13:55
@preinlein preinlein changed the base branch from paul.reinlein/fix-get-available-memory to graphite-base/1826 March 19, 2026 14:03
@preinlein preinlein changed the base branch from graphite-base/1826 to main March 19, 2026 14:03
@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from 3656530 to 5cbf63d on March 19, 2026 14:04
@preinlein preinlein force-pushed the paul.reinlein/introduce-smp-tests branch from 5cbf63d to a7d2219 on March 19, 2026 14:37

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7d2219459


Comment on lines +3 to +4
//! Paired with SMP regression experiments in `experiments/regression/cases/http_ascii_*`.
//! If throughput sizes here change, update the corresponding SMP experiments and vice versa.


P1: Stop pairing cached HTTP cases with ascii_throughput

If these new http_ascii_* SMP cases are meant to catch regressions in lading_payload/benches/ascii.rs, they do not actually exercise the same hot path. I checked lading/src/generator.rs:5-10, lading/src/generator/http.rs:155-181,224-233, and lading/src/bin/lading.rs:546-549,624-633: the HTTP generator prebuilds its entire block cache during startup, before warmup begins, and spin() only rotates cached Bytes afterwards. So a slowdown in ascii::Ascii::to_bytes can still leave these experiments green as long as startup time is unchanged, which makes the new pairing comment misleading and leaves the advertised regression coverage gap in place.


@preinlein preinlein merged commit 3febd2b into main Mar 19, 2026
31 checks passed
@preinlein preinlein deleted the paul.reinlein/introduce-smp-tests branch March 19, 2026 14:52