Final Performance Results (with actual evaluation)
1. ROOT (native): 0.0009ms/eval (~1.08M eval/s) - baseline
2. pyhs3 FAST_RUN: 0.0086ms/eval (~116k eval/s) - 9.3x slower than ROOT
3. pyhs3 NUMBA: 0.0088ms/eval (~114k eval/s) - 9.5x slower than ROOT
4. pyhs3 JAX: 0.0194ms/eval (~52k eval/s) - 21x slower than ROOT
The figures above use dictionary inputs. Converting to positional inputs gives the following pyhs3 improvements:
- FAST_RUN: 0.0068ms/eval (was 0.0086ms) - 21% faster
- NUMBA: 0.0078ms/eval (was 0.0088ms) - 11% faster
- JAX: 0.0177ms/eval (was 0.0194ms) - 9% faster
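The percentages above match the relative reduction in time per evaluation. A minimal sketch of that arithmetic follows; the helper names are hypothetical, and the numbers are the ms/eval figures quoted in the list.

```python
def percent_time_reduction(before_ms: float, after_ms: float) -> float:
    """Relative reduction in time per evaluation, in percent."""
    return 100.0 * (before_ms - after_ms) / before_ms


def evals_per_second(ms_per_eval: float) -> float:
    """Convert a per-evaluation time in milliseconds to a throughput."""
    return 1000.0 / ms_per_eval


# (before, after) timings in ms/eval, as quoted in the list above.
timings = {
    "FAST_RUN": (0.0086, 0.0068),
    "NUMBA": (0.0088, 0.0078),
    "JAX": (0.0194, 0.0177),
}

for mode, (before, after) in timings.items():
    gain = percent_time_reduction(before, after)
    rate = evals_per_second(after)
    print(f"{mode}: {gain:.0f}% less time per eval, ~{rate / 1000:.0f}k eval/s")
```

Because the quoted timings are rounded to four decimal places, ratios recomputed from them can differ slightly from ratios taken from the raw measurements.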
1. ROOT (native): 0.0002ms/eval - baseline (with caching benefits)
2. pyhs3 FAST_RUN: 0.0068ms/eval - 33x slower
3. pyhs3 NUMBA: 0.0078ms/eval - 38x slower
4. pyhs3 JAX: 0.0177ms/eval - 86x slower
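The dictionary-to-positional change can be pictured as moving the name lookup out of the hot loop. The sketch below does not use the pyhs3 API: `model`, `PARAM_ORDER`, and both wrappers are hypothetical stand-ins, and the timings it prints only illustrate the pattern.

```python
import timeit


def model(x, mu, sigma):
    """Hypothetical stand-in for a compiled function taking positional inputs."""
    z = (x - mu) / sigma
    return z * z


# Hypothetical: the order in which the compiled function expects its inputs.
PARAM_ORDER = ("x", "mu", "sigma")


def eval_from_dict(values):
    """Look up every parameter by name on each call."""
    return model(*(values[name] for name in PARAM_ORDER))


def eval_positional(args):
    """Call with inputs that were already put in positional order."""
    return model(*args)


values = {"mu": 0.0, "sigma": 2.0, "x": 1.0}
# Build the positional tuple once, outside the timed loop.
positional = tuple(values[name] for name in PARAM_ORDER)

n = 100_000
t_dict = timeit.timeit(lambda: eval_from_dict(values), number=n)
t_pos = timeit.timeit(lambda: eval_positional(positional), number=n)
print(f"dict inputs:       {1000 * t_dict / n:.5f} ms/eval")
print(f"positional inputs: {1000 * t_pos / n:.5f} ms/eval")
```

Both wrappers return the same value; only the per-call overhead differs.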
and then using np.array for the inputs to pyhs3/pytensor gives
Before (using plain Python floats):
- FAST_RUN: 0.0067ms/eval (~148k eval/s)
- NUMBA: 0.0084ms/eval (~120k eval/s)
- JAX: 0.0172ms/eval (~58k eval/s)
to
After (using numpy arrays):
- FAST_RUN: 0.0048ms/eval (~208k eval/s) - 40% faster! ⚡
- NUMBA: 0.0063ms/eval (~159k eval/s) - 33% faster! ⚡
- JAX: 0.0154ms/eval (~65k eval/s) - 12% faster! ⚡
with updated performance
1. ROOT (native): 0.0004ms/eval (~2.4M eval/s) - baseline
2. pyhs3 FAST_RUN: 0.0048ms/eval (~208k eval/s) - 11.3x slower (improved from 16.6x!)
3. pyhs3 NUMBA: 0.0063ms/eval (~159k eval/s) - 14.8x slower (improved from 20.6x!)
4. pyhs3 JAX: 0.0154ms/eval (~65k eval/s) - 36.3x slower (improved from 42.4x!)
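A minimal sketch of the float-to-array change, assuming scalar parameters and float64 graph inputs (neither is stated above). The names `PARAM_ORDER`, `values`, and `args` are hypothetical. The speedup presumably comes from no longer converting each Python float on every call, but that explanation is an assumption, not something measured here.

```python
import numpy as np

# Hypothetical parameter order and values, for illustration only.
PARAM_ORDER = ("x", "mu", "sigma")
values = {"x": 1.0, "mu": 0.0, "sigma": 2.0}

# Convert once, outside the hot loop: each input becomes a 0-d float64 array.
args = [np.asarray(values[name], dtype=np.float64) for name in PARAM_ORDER]

# Updating a value in place reuses the existing array instead of allocating one.
args[0][...] = 1.5

for name, arr in zip(PARAM_ORDER, args):
    print(name, arr.dtype, arr.ndim, float(arr))
```

The resulting list would then be passed positionally to the compiled function in place of the plain floats.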
and then adding trust_input=True
Before trust_input=True:
- FAST_RUN: 0.0048ms/eval (~208k eval/s) - 11.3x slower than ROOT
- NUMBA: 0.0063ms/eval (~159k eval/s) - 14.8x slower than ROOT
- JAX: 0.0154ms/eval (~65k eval/s) - 36.3x slower than ROOT
After trust_input=True:
- FAST_RUN: 0.0008ms/eval (~1.18M eval/s) - only 2.0x slower than ROOT!
- NUMBA: 0.0022ms/eval (~458k eval/s) - only 5.3x slower than ROOT!
- JAX: 0.0108ms/eval (~92k eval/s) - only 26.1x slower than ROOT (still improved!)