Commit 7adf3b7
authored
NvTensorRTRTx: Enable CUDA graph via config and fix attention_mask shape handling (#1594)
- Add option to enable CUDA graph for NvTensorRTRTx EP through provider
config.
- Fix handling of attention_mask shapes when `enable_cuda_graph` is
false for NvTensorRTRTx:
- When `past_present_share_buffer` (in place kv cache) is enabled,
NvTensorRTRTx expects attention_mask shape as `[b, max_seq_len]` with
masking applied. Previously, these shapes were only sent when both
`past_present_share_buffer` and graph capture were enabled. This PR
ensures the correct shape is passed to TRT for in-place KV cache,
aligning with expected behavior.
@baijumeswani for review1 parent 070a034 commit 7adf3b7
File tree
5 files changed
+31
-11
lines changed- src
- models
5 files changed
+31
-11
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
828 | 828 | | |
829 | 829 | | |
830 | 830 | | |
831 | | - | |
| 831 | + | |
832 | 832 | | |
833 | 833 | | |
834 | 834 | | |
| |||
846 | 846 | | |
847 | 847 | | |
848 | 848 | | |
849 | | - | |
| 849 | + | |
| 850 | + | |
| 851 | + | |
| 852 | + | |
| 853 | + | |
| 854 | + | |
850 | 855 | | |
851 | 856 | | |
852 | 857 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
276 | 276 | | |
277 | 277 | | |
278 | 278 | | |
279 | | - | |
| 279 | + | |
280 | 280 | | |
281 | 281 | | |
282 | 282 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
541 | 541 | | |
542 | 542 | | |
543 | 543 | | |
| 544 | + | |
| 545 | + | |
| 546 | + | |
544 | 547 | | |
545 | 548 | | |
546 | 549 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
141 | 141 | | |
142 | 142 | | |
143 | 143 | | |
144 | | - | |
| 144 | + | |
145 | 145 | | |
146 | 146 | | |
147 | 147 | | |
| |||
154 | 154 | | |
155 | 155 | | |
156 | 156 | | |
157 | | - | |
| 157 | + | |
158 | 158 | | |
159 | 159 | | |
160 | 160 | | |
161 | 161 | | |
162 | 162 | | |
163 | | - | |
| 163 | + | |
164 | 164 | | |
165 | 165 | | |
166 | 166 | | |
167 | | - | |
| 167 | + | |
168 | 168 | | |
169 | 169 | | |
170 | | - | |
171 | | - | |
| 170 | + | |
| 171 | + | |
172 | 172 | | |
173 | 173 | | |
174 | 174 | | |
175 | 175 | | |
176 | | - | |
| 176 | + | |
177 | 177 | | |
178 | 178 | | |
179 | 179 | | |
| |||
256 | 256 | | |
257 | 257 | | |
258 | 258 | | |
259 | | - | |
| 259 | + | |
260 | 260 | | |
261 | 261 | | |
262 | 262 | | |
| |||
291 | 291 | | |
292 | 292 | | |
293 | 293 | | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
294 | 300 | | |
295 | 301 | | |
296 | 302 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
41 | 47 | | |
42 | 48 | | |
43 | 49 | | |
| |||
0 commit comments