Release v1.2.1 · openvm-org/stark-backend

This release brings major performance improvements to the CUDA backend. It adds a new Virtual Pool Memory Manager (VPMM) in openvm-cuda-common, which provides multi-stream memory management and uses the CUDA driver APIs to avoid memory fragmentation. Several kernels in openvm-cuda-backend were also optimized for significant performance gains.

Added

(CUDA common) New Virtual Pool Memory Manager (see the VPMM Spec) with multi-stream support, built on top of the CUDA Virtual Memory Management driver API
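
To illustrate why backing allocations with virtual memory can avoid fragmentation, here is a minimal host-only sketch. It is a toy model, not the VPMM implementation: `ToyVirtualPool`, `PAGE_SIZE`, and every field and method name are hypothetical, and plain integers stand in for the physical pages and virtual addresses that the CUDA Virtual Memory Management driver API would provide. The idea it shows is that freed physical pages can be remapped into a fresh contiguous virtual range even when the pages themselves are not adjacent.

```rust
use std::collections::HashMap;

/// Assumed page granularity for this toy model (2 MiB); not taken from VPMM.
const PAGE_SIZE: usize = 2 << 20;

/// Hypothetical host-only model of a virtual pool. Integers stand in for
/// physical page handles and virtual addresses.
struct ToyVirtualPool {
    /// Physical pages that are currently unmapped and can be reused.
    free_pages: Vec<usize>,
    /// Next unused offset in the (simulated) reserved virtual range.
    next_va: usize,
    /// Live allocations: virtual base offset -> backing physical page ids.
    mappings: HashMap<usize, Vec<usize>>,
    /// Number of physical pages created so far.
    pages_created: usize,
}

impl ToyVirtualPool {
    fn new() -> Self {
        Self { free_pages: Vec::new(), next_va: 0, mappings: HashMap::new(), pages_created: 0 }
    }

    /// Allocate `bytes`, rounded up to whole pages, as one contiguous virtual
    /// range. Free pages are reused first; new pages are created only if needed.
    fn alloc(&mut self, bytes: usize) -> usize {
        let n = std::cmp::max(1, (bytes + PAGE_SIZE - 1) / PAGE_SIZE);
        let mut pages = Vec::with_capacity(n);
        for _ in 0..n {
            let page = match self.free_pages.pop() {
                Some(p) => p,
                None => {
                    let id = self.pages_created;
                    self.pages_created += 1;
                    id
                }
            };
            pages.push(page);
        }
        let va = self.next_va;
        self.next_va += n * PAGE_SIZE;
        self.mappings.insert(va, pages);
        va
    }

    /// Unmap an allocation and return its physical pages to the free list.
    fn free(&mut self, va: usize) {
        if let Some(pages) = self.mappings.remove(&va) {
            self.free_pages.extend(pages);
        }
    }
}

fn main() {
    let mut pool = ToyVirtualPool::new();
    let a = pool.alloc(PAGE_SIZE);
    let _b = pool.alloc(PAGE_SIZE);
    let c = pool.alloc(PAGE_SIZE);
    // Freeing a and c leaves two free pages separated by the page backing b.
    pool.free(a);
    pool.free(c);
    // A two-page request is still served from the existing pages.
    let d = pool.alloc(2 * PAGE_SIZE);
    assert_eq!(pool.pages_created, 3);
    println!("allocation at offset {} is backed by pages {:?}", d, pool.mappings[&d]);
}
```

In a conventional allocator working on one contiguous buffer, the two-page request in `main` could not reuse the two separated one-page holes. The sketch never reuses virtual offsets, which is a simplification; virtual address space is assumed to be plentiful compared with physical memory.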

Changed

(CUDA common) Multi-arch build support
(CUDA backend) Quotient values kernel optimization
(CUDA backend) FRI reduced opening kernel optimization: bit reversal removed for better memory access patterns
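
For readers unfamiliar with the term, the sketch below shows what a bit-reversal permutation does to an array. It is a generic illustration, not the backend's kernel code, and the function names are hypothetical. After the permutation, logically consecutive elements end up far apart in memory, so keeping data in natural order lets consecutive indices touch consecutive addresses.

```rust
/// Reverse the lowest `log_n` bits of `i` (hypothetical helper).
fn reverse_bits_len(i: usize, log_n: u32) -> usize {
    if log_n == 0 {
        return 0; // avoid shifting by the full bit width
    }
    i.reverse_bits() >> (usize::BITS - log_n)
}

/// Permute `v` in place so that position `i` holds the element that was at
/// the bit-reversed index of `i`. The length must be a power of two.
fn bit_reverse_permute<T>(v: &mut [T]) {
    let n = v.len();
    assert!(n.is_power_of_two());
    let log_n = n.trailing_zeros();
    for i in 0..n {
        let j = reverse_bits_len(i, log_n);
        if i < j {
            v.swap(i, j); // swap each pair once
        }
    }
}

fn main() {
    let mut v: Vec<usize> = (0..8).collect();
    bit_reverse_permute(&mut v);
    // Natural-order neighbours 0 and 1 now sit at positions 0 and 4.
    assert_eq!(v, vec![0, 4, 2, 6, 1, 5, 3, 7]);
    println!("{:?}", v);
}
```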

What's Changed

ci: add custom RunsOn runners by @jonathanpwang in #117
ci: pin gpu image to cuda 12.9 by @jonathanpwang in #116
feat(cuda): Virtual Pool Memory Manager by @gaxiom in #114
feat: utility to generate SymbolicConstraintsDag statistics by @stephenh-axiom-xyz in #118
fix(cuda): Stop using constant twiddle by @gaxiom in #119
chore: use newer ami by @luffykai in #124
chore(cuda): Ntt refactoring by @gaxiom in #123
fix(cuda): NTT edge case by @gaxiom in #126
feat: support multiple CUDA archs by @gaxiom in #127
docs(readme): Fix Crate Docs link in README by @jonathanpwang in #129
feat: Quotient evaluation optimization by @stephenh-axiom-xyz in #131
chore: bump workspace to v1.2.1-rc.2 by @jonathanpwang in #132
chore(cuda): universal babybear impl by @gaxiom in #130
chore(cuda): CUDA_DEBUG by @gaxiom in #133
chore: bump workspace to v1.2.1-rc.3 by @jonathanpwang in #134
perf(cuda): Opener in natural order by @gaxiom in #135
feat(cuda): Support multithreaded VPMM + bump up to 1.2.1-rc.4 by @gaxiom in #136
fix(cuda): VPMM reallocate after auto-cleanup by @gaxiom in #138
chore(cuda): auto-cleanup VPMM fix + multithreaded Ci + zero pages allow by @gaxiom in #139
chore: bump workspace to v1.2.1-rc.5 by @jonathanpwang in #143
chore: loosen tokio versioning by @jonathanpwang in #146
feat(cuda): VPMM v3 by @gaxiom in #148
fix: cost estimate unread variable by @jonathanpwang in #151
fix: don't defrag what we already defraged by @gaxiom in #150
fix(cuda): set device order by @gaxiom in #152
chore(cuda): VPMM order changed by @gaxiom in #153
release: v1.2.1 with updated changelog by @jonathanpwang in #156

Full Changelog: v1.2.0...v1.2.1
