Description
Problem
docs/gpu_solver_architecture.md contains a "Current State" section (around line 283) that lists major improvements since the document was written:
## Current State
Since this document was written, major improvements have been made:
1. Adaptive timestepping: BDF2 with LTE control is implemented
2. Unified device models: All devices use OpenVAF-compiled Verilog-A (PSP103, etc.)
3. GPU-resident time loops: Transient uses `lax.scan` for full GPU execution
4. Sparse solver: BCOO/BCSR with `spsolve` for large circuits
5. AC and noise analysis: Frequency-domain analyses are available
Two statements in this list are now inaccurate:
- Item 3: "Transient uses `lax.scan` for full GPU execution". The production transient loop (`full_mna.py`) uses `lax.while_loop`, not `lax.scan`. `lax.scan` requires a fixed number of steps upfront, which is incompatible with the adaptive timestepper.
- Item 4: "Sparse solver: BCOO/BCSR with `spsolve`". The production sparse solver uses Spineax/cuDSS on CUDA GPUs and UMFPACK via FFI on CPU. `spsolve` (JAX's experimental cuSOLVER wrapper) is not used in the hot path.
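To illustrate why an adaptive timestepper does not fit `lax.scan`, here is a minimal sketch contrasting the two loop primitives on the toy ODE dy/dt = -y. It is not taken from `full_mna.py`: the function names `integrate_fixed` and `integrate_adaptive` are hypothetical, and backward Euler with a step-doubling error estimate stands in for the BDF2/LTE controller purely for brevity.

```python
import jax.numpy as jnp
from jax import lax


def integrate_fixed(y0, dt, n_steps):
    """Fixed-step backward Euler: the trip count is known up front, so lax.scan fits."""

    def step(y, _):
        y_next = y / (1.0 + dt)  # backward Euler update for dy/dt = -y
        return y_next, y_next

    return lax.scan(step, jnp.float32(y0), None, length=n_steps)


def integrate_adaptive(y0, t_end, dt0, tol=1e-4):
    """Toy adaptive stepper: the trip count depends on the data, so lax.while_loop is used."""

    def cond(state):
        t, _, _, _ = state
        return t < t_end

    def body(state):
        t, y, dt, n = state
        dt = jnp.minimum(dt, t_end - t)  # do not step past t_end
        # Step doubling: compare one full step against two half steps.
        y_full = y / (1.0 + dt)
        y_half = y / (1.0 + 0.5 * dt) ** 2
        accept = jnp.abs(y_half - y_full) <= tol
        # Advance time and state only when the step is accepted.
        t_new = jnp.where(accept, t + dt, t)
        y_new = jnp.where(accept, y_half, y)
        # Grow the step after acceptance, shrink it after rejection.
        dt_new = jnp.where(accept, 1.5 * dt, 0.5 * dt)
        return t_new, y_new, dt_new, n + 1

    init = (jnp.float32(0.0), jnp.float32(y0), jnp.float32(dt0), jnp.int32(0))
    return lax.while_loop(cond, body, init)


y_fixed, trajectory = integrate_fixed(1.0, 0.01, 100)
t_final, y_adaptive, dt_final, n_iters = integrate_adaptive(1.0, 1.0, 0.1)
```

In the adaptive version the iteration count `n_iters` includes rejected steps and is only known once the loop finishes, which is why a `length` cannot be supplied to `lax.scan` ahead of time. One trade-off worth noting in the docs: `lax.while_loop` does not stack per-step outputs the way `lax.scan` does, so any trajectory storage has to be carried explicitly in the loop state.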
Note: issue #98 already covers the historical banner's reference to `analysis/dc.py`. This issue is about the separate "Current State" section, which is supposed to describe the present implementation.