Summarizing device timing regardless of kernel shapes by default (#37)
* Initial version of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Initial commits of unitrace
* Unhide Symbols Required By XPTI
* Initial commits of unitrace
* Initial commits of unitrace
* Summarizing device timing regardless of kernel shapes by default
* Summarizing device timing without kernel shapes by default
* Summarizing device timing without kernel shapes by default
* Summarizing device timing without kernel shapes by default
---------
Co-authored-by: Schilling, Matthew <[email protected]>

It also outputs kernel information that helps identify performance issues related to occupancy, such as those caused by shared local memory usage and register spilling.

Here, the **"SLM Per Work Group"** shows the amount of shared local memory needed for each work group in bytes. This size can potentially affect occupancy.
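As a rough illustration of why the SLM size can affect occupancy, the sketch below computes an upper bound on how many work groups can be resident at once when shared local memory is the only limiting resource. The function name and the `slm_capacity` value are hypothetical; the real capacity depends on the device and is not reported by the output described here.

```python
from typing import Optional


def max_work_groups_by_slm(slm_per_work_group: int, slm_capacity: int) -> Optional[int]:
    """Upper bound on concurrently resident work groups imposed by SLM alone.

    slm_per_work_group: the "SLM Per Work Group" value, in bytes.
    slm_capacity: SLM available to one compute unit, in bytes
                  (hypothetical parameter, device dependent).
    Returns None when the kernel uses no SLM, since SLM then imposes no limit.
    """
    if slm_per_work_group < 0 or slm_capacity <= 0:
        raise ValueError("sizes must be non-negative and capacity must be positive")
    if slm_per_work_group == 0:
        return None
    # Each resident work group needs its own SLM allocation.
    return slm_capacity // slm_per_work_group


# Assumed capacity of 64 KiB, purely for illustration.
ASSUMED_CAPACITY = 64 * 1024
small = max_work_groups_by_slm(16 * 1024, ASSUMED_CAPACITY)
large = max_work_groups_by_slm(40_000, ASSUMED_CAPACITY)
```

Under these assumed numbers, the kernel that needs 40,000 bytes per work group can keep only one work group resident, while the 16 KiB kernel can keep four. Other resources, such as registers and thread slots, may lower the bound further.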
The **"Private Memory Per Thread"** is the private memory allocated for each thread in bytes. A non-zero value indicates that one or more thread-private variables are not held in registers.
The **"Spill Memory Per Thread"** is the memory used for register spills for each thread in bytes. A non-zero value indicates that one or more thread-private variables were allocated in registers but later spilled to memory.
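The interpretation of the two per-thread values can be written down as a small check. This is a sketch with hypothetical names, not part of unitrace; it only restates the rules above for values that have already been read from the tool's output.

```python
from typing import List


def memory_notes(private_bytes: int, spill_bytes: int) -> List[str]:
    """Turn the two per-thread memory values (in bytes) into readable notes."""
    notes = []
    if private_bytes > 0:
        # Non-zero private memory: some thread-private variables live in memory.
        notes.append("thread-private variables not held in registers")
    if spill_bytes > 0:
        # Non-zero spill memory: variables started in registers and were spilled.
        notes.append("registers spilled to memory")
    return notes


clean = memory_notes(private_bytes=0, spill_bytes=0)
spilling = memory_notes(private_bytes=0, spill_bytes=256)
```

An empty list means neither value points to a problem; any entry suggests the kernel is worth a closer look.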
+
By default, the kernel timing is summarized regardless of shapes. In case the kernel has different shapes, using **-v** along with **-d** is strongly recommended:
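To make the difference between the two summaries concrete, the sketch below aggregates kernel timings in both ways. It is an illustration only: the record layout, the kernel names, and the representation of a shape as a pair of global and local sizes are assumptions, not unitrace's internal data structures or output format.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

# (kernel name, shape, duration in ns); shape is a hypothetical
# (global size, local size) pair.
Record = Tuple[str, Tuple[int, int], int]


def summarize(records: Iterable[Record], by_shape: bool = False) -> Dict[Hashable, int]:
    """Sum device time per kernel, optionally keeping shapes separate."""
    totals: Dict[Hashable, int] = defaultdict(int)
    for name, shape, duration_ns in records:
        # Without shapes, all launches of a kernel fall into one bucket.
        key = (name, shape) if by_shape else name
        totals[key] += duration_ns
    return dict(totals)


records = [
    ("vec_add", (4096, 256), 1000),
    ("vec_add", (4096, 256), 1200),
    ("vec_add", (1024, 64), 300),
    ("reduce", (4096, 256), 500),
]

merged = summarize(records)                  # shapes ignored
split = summarize(records, by_shape=True)    # one entry per (kernel, shape)
```

Here `merged` holds a single total for `vec_add`, while `split` shows that the launches with the larger shape account for most of that time, which is the detail that the per-shape view recovers.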