Releases: Deep-Learning-Profiling-Tools/triton-viz
v3.0
Visualizers are fun, but sometimes debugging kernels requires more than looking at tensors... We rewrote the architecture of Triton-Viz to support new analysis clients and DSLs!
On top of the visualizer, we've added two new analysis clients:
- Profiler: Flags performance hazards like non-unrolled loops, inefficient mask usage, and missing buffer_load optimizations.
- Sanitizer: Symbolically checks tensor memory accesses for out-of-bounds errors.
Just add one decorator and pick your client:
@triton_viz.trace("tracer")
# OR
@triton_viz.trace("profiler")
# OR
@triton_viz.trace("sanitizer")
In addition, we added experimental AWS NKI support so you can debug Trainium kernels with just your CPU.
See the examples folder for how to run your own kernels.
What's Changed
Visualizer
- Published a brand new visualizer with various features.
- Migrated to TypeScript and consolidated shared logic (#247).
Sanitizer
- Refactored the sanitizer to minimize overhead (#236).
- Added virtual memory support for better performance (#99)
- Enhanced PDB-style diagnostic report for OOB errors (#102)
- Added support for new operations: join, bitwise XOR/OR, atomic CAS, bitcast, max/min reduce (#173, #176, #178, #208).
- Multi-Layer Cache optimizations (#98, #215).
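To make the kind of error the sanitizer targets concrete, here is a minimal sketch in plain Python. It is not Triton-Viz code: the real sanitizer reasons symbolically, while this stand-in enumerates concrete offsets for one program. The helper `find_oob_accesses` and the sizes used are hypothetical, chosen only for illustration.

```python
def find_oob_accesses(offsets, mask, num_elements):
    """Return the offsets that are unmasked and fall outside [0, num_elements).

    Hypothetical helper for illustration; not part of triton_viz.
    """
    return [
        off
        for off, active in zip(offsets, mask)
        if active and not (0 <= off < num_elements)
    ]


# Assumed launch: a 1000-element vector processed in blocks of 128,
# looking at the last program (pid = 7), which covers offsets 896..1023.
BLOCK_SIZE = 128
NUM_ELEMENTS = 1000
pid = 7
offsets = [pid * BLOCK_SIZE + i for i in range(BLOCK_SIZE)]

# A correct kernel masks off the tail of the last block.
guarded_mask = [off < NUM_ELEMENTS for off in offsets]
# A buggy kernel forgets the mask, so every lane accesses memory.
missing_mask = [True] * BLOCK_SIZE

if __name__ == "__main__":
    print(len(find_oob_accesses(offsets, guarded_mask, NUM_ELEMENTS)))  # 0
    print(len(find_oob_accesses(offsets, missing_mask, NUM_ELEMENTS)))  # 24
```

With the guard in place no lane touches memory past the end of the tensor; without it, the 24 lanes at offsets 1000 through 1023 do.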
Profiler
- Added mask usage percentage tracking (#192)
- Added for-loop unroll tracking (#197, #214)
- Added block sampling feature to randomly sample k blocks from grid (#207)
- Added buffer_load applicability checking (#184)
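Two of the profiler features above, block sampling and mask usage percentage, can be sketched in plain Python. This is an illustration of the ideas only, not the Triton-Viz implementation; `sample_blocks` and `mask_usage_percent` are hypothetical names, and sampling uniformly without replacement is an assumption.

```python
import itertools
import random


def sample_blocks(grid, k, seed=None):
    """Pick up to k distinct block indices uniformly at random from a launch grid.

    Hypothetical helper; `grid` is a tuple of per-axis block counts.
    """
    all_blocks = list(itertools.product(*(range(n) for n in grid)))
    rng = random.Random(seed)
    return rng.sample(all_blocks, min(k, len(all_blocks)))


def mask_usage_percent(mask):
    """Percentage of lanes whose mask bit is set (100.0 for an empty mask)."""
    if not mask:
        return 100.0
    return 100.0 * sum(bool(m) for m in mask) / len(mask)


if __name__ == "__main__":
    # Profile only 3 of the 8 blocks in a 4 x 2 grid.
    print(sample_blocks(grid=(4, 2), k=3, seed=0))

    # Last block of a 1000-element vector with a block size of 128:
    # offsets 896..1023, of which only 896..999 are in range.
    tail_mask = [offset < 1000 for offset in range(896, 1024)]
    print(f"{mask_usage_percent(tail_mask):.2f}% of lanes active")  # 81.25% of lanes active
```

Sampling a few blocks keeps profiling cost low on large grids, and a low mask usage percentage suggests that many lanes in a block do no useful work.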
Homepage
- Published a homepage for the project (#249).
Core
- Added multithreading support (#230)
- Added NKI interpreter support (#206)
- Added triton.autotune support (#119)
- Added triton.heuristics support (#216)
- Added triton_viz.__version__ with git hash support (#238)
- Fixed nested JIT function calls (#124)
Documentation & Examples
- Reorganized examples into sanitizer and visualizer directories (#248)
- Improved installation documentation (#253)
New Contributors
- @gujialiang123 made their first contribution in #179
- @latentCall145 made their first contribution in #199
Full Changelog: v2.0...v3.0