Skip to content

Conversation

@DaniPopes
Copy link
Contributor

@DaniPopes DaniPopes commented May 26, 2025

The original code only uses black_box when dropping, not inside of the measurement.

This actually matters as black_box usually spills the value to the stack, so if the benchmark routine returns a large value, the benchmark will record also an unnecessary memcpy.

@codspeed-hq
Copy link

codspeed-hq bot commented Jun 16, 2025

CodSpeed Instrumentation Performance Report

Merging #98 will degrade performances by 24.19%

Comparing DaniPopes:less-criterion-black-box (b71c93b) with main (6178dd1)

Summary

⚡ 44 improvements
❌ 15 regressions
✅ 108 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
b 2.8 µs 2.2 µs +25.24%
b 3.4 µs 2.5 µs +35.29%
sleep_100ms 566.1 ns 536.9 ns +5.43%
sleep_1ms 595.3 ns 536.9 ns +10.86%
sleep_50ms 566.1 ns 536.9 ns +5.43%
bench_array1[10] 195.8 ns 225 ns -12.96%
bench_array1[1] 119.2 ns 90 ns +32.41%
bench_array1[4] 125.3 ns 154.4 ns -18.88%
bench_array2[10] 195.8 ns 225 ns -12.96%
bench_array2[1] 119.2 ns 90 ns +32.41%
bench_array2[4] 125.3 ns 154.4 ns -18.88%
init_array[1000] 11.5 µs 9.4 µs +22.17%
init_array[4] 125.3 ns 154.4 ns -18.88%
mut_borrow 866.7 ns 895.8 ns -3.26%
add 91.4 ns 120.6 ns -24.19%
recursive_memoized[HashMap<u64, u64>, 0] 805.6 ns 893.3 ns -9.83%
mul 91.4 ns 120.6 ns -24.19%
generate_combinations[5] 8.1 µs 7.4 µs +10.44%
generate_combinations[6] 10.7 µs 9.4 µs +13.83%
generate_combinations[8] 17.3 µs 15.5 µs +11.46%
... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

@codspeed-hq
Copy link

codspeed-hq bot commented Jun 16, 2025

CodSpeed WallTime Performance Report

Merging #98 will degrade performances by 3.01%

Comparing DaniPopes:less-criterion-black-box (b71c93b) with main (6178dd1)

Summary

⚡ 8 improvements
❌ 1 regressions
✅ 145 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
large_setup 14 ns 11 ns +27.27%
from_elem_decimal[2048] 240 ns 210 ns +14.29%
n_queens_solver[5] 9.3 µs 9.6 µs -3.01%
permutations[4] 2.4 µs 2.3 µs +3.59%
permutations[7] 801 µs 771.8 µs +3.79%
generate_gray_code[2] 240 ns 233 ns +3%
sleep_100ns 57.6 µs 26 µs ×2.2
sleep_100us 157.4 µs 134 µs +17.48%
sleep_1us 52.3 µs 42.7 µs +22.42%

@art049 art049 requested a review from not-matthias June 27, 2025 13:38
@art049 art049 force-pushed the less-criterion-black-box branch 2 times, most recently from af3989f to fc76872 Compare June 27, 2025 13:40
Copy link
Member

@not-matthias not-matthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! We definitely should follow what criterion is doing (see their code). Since we're already wrapping it with black_box in the drop we don't have to do it before as well!

LGTM!

DaniPopes and others added 2 commits June 27, 2025 16:28
The original code only uses black_box when dropping, not inside of the
measurement.
Apply the same optimization from PR 98 to both divan and bencher compat layers:
- Remove black_box around routine calls during measurement
- Only apply black_box to outputs after measurement ends
- Reduces stack usage and eliminates unnecessary memory operations
- Improves benchmark performance by avoiding artificial overhead

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@art049 art049 force-pushed the less-criterion-black-box branch from a7c4bcd to b71c93b Compare June 27, 2025 14:28
@art049 art049 merged commit b71c93b into CodSpeedHQ:main Jun 27, 2025
12 of 13 checks passed
@DaniPopes DaniPopes deleted the less-criterion-black-box branch July 5, 2025 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants