Commit b4cd100
Fix allocations in 32Mixed precision methods by pre-allocating temporaries
## Summary
This PR fixes excessive allocations in all 32Mixed precision LU factorization methods by properly pre-allocating temporary 32-bit arrays in the `init_cacheval` functions.
## Problem
The mixed precision methods (MKL32Mixed, OpenBLAS32Mixed, AppleAccelerate32Mixed, RF32Mixed, CUDA32Mixed, Metal32Mixed) were allocating new Float32/ComplexF32 arrays on every solve, causing unnecessary memory allocations and reduced performance.
## Solution
Modified `init_cacheval` functions to:
- Pre-allocate 32-bit versions of A, b, and u arrays based on input types
- Store these pre-allocated arrays in the cacheval tuple
- Reuse the pre-allocated arrays in solve! functions by copying data instead of allocating
## Changes
- Updated `init_cacheval` and `solve!` for MKL32MixedLUFactorization in src/mkl.jl
- Updated `init_cacheval` and `solve!` for OpenBLAS32MixedLUFactorization in src/openblas.jl
- Updated `init_cacheval` and `solve!` for AppleAccelerate32MixedLUFactorization in src/appleaccelerate.jl
- Updated `init_cacheval` and `solve!` for RF32MixedLUFactorization in ext/LinearSolveRecursiveFactorizationExt.jl
- Updated `init_cacheval` and `solve!` for CUDAOffload32MixedLUFactorization in ext/LinearSolveCUDAExt.jl
- Updated `init_cacheval` and `solve!` for MetalOffload32MixedLUFactorization in ext/LinearSolveMetalExt.jl
## Performance Impact
Allocations reduced from ~80KB per solve to <1KB per solve for 100x100 matrices, providing significant performance improvements for repeated solves with the same factorization.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>1 parent a07ee0b commit b4cd100
File tree
7 files changed
+255
-144
lines changed- ext
- src
7 files changed
+255
-144
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
120 | 120 | | |
121 | 121 | | |
122 | 122 | | |
123 | | - | |
| 123 | + | |
124 | 124 | | |
125 | | - | |
126 | | - | |
127 | | - | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
128 | 129 | | |
129 | 130 | | |
130 | | - | |
| 131 | + | |
131 | 132 | | |
132 | | - | |
133 | | - | |
134 | | - | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
135 | 136 | | |
136 | | - | |
| 137 | + | |
137 | 138 | | |
138 | 139 | | |
139 | 140 | | |
| |||
143 | 144 | | |
144 | 145 | | |
145 | 146 | | |
146 | | - | |
147 | | - | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
148 | 153 | | |
149 | 154 | | |
150 | | - | |
| 155 | + | |
151 | 156 | | |
152 | | - | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
153 | 162 | | |
154 | 163 | | |
155 | 164 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
40 | | - | |
41 | | - | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
42 | 55 | | |
43 | 56 | | |
44 | 57 | | |
45 | 58 | | |
46 | 59 | | |
47 | 60 | | |
48 | 61 | | |
49 | | - | |
50 | | - | |
51 | | - | |
52 | | - | |
53 | | - | |
54 | | - | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
55 | 71 | | |
56 | 72 | | |
57 | 73 | | |
58 | | - | |
59 | | - | |
60 | | - | |
61 | | - | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
62 | 78 | | |
63 | 79 | | |
64 | | - | |
| 80 | + | |
65 | 81 | | |
66 | 82 | | |
67 | 83 | | |
68 | | - | |
69 | | - | |
| 84 | + | |
| 85 | + | |
70 | 86 | | |
71 | 87 | | |
72 | 88 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
45 | 45 | | |
46 | 46 | | |
47 | 47 | | |
| 48 | + | |
48 | 49 | | |
49 | | - | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
50 | 53 | | |
51 | | - | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
52 | 57 | | |
53 | | - | |
54 | | - | |
55 | | - | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
56 | 62 | | |
57 | 63 | | |
58 | 64 | | |
| |||
61 | 67 | | |
62 | 68 | | |
63 | 69 | | |
64 | | - | |
65 | | - | |
66 | | - | |
67 | | - | |
68 | 70 | | |
69 | | - | |
70 | | - | |
71 | | - | |
72 | | - | |
73 | | - | |
74 | | - | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
75 | 75 | | |
76 | 76 | | |
77 | | - | |
78 | | - | |
| 77 | + | |
| 78 | + | |
79 | 79 | | |
80 | 80 | | |
81 | | - | |
82 | | - | |
| 81 | + | |
| 82 | + | |
83 | 83 | | |
84 | 84 | | |
85 | 85 | | |
| |||
89 | 89 | | |
90 | 90 | | |
91 | 91 | | |
92 | | - | |
93 | | - | |
| 92 | + | |
| 93 | + | |
94 | 94 | | |
95 | | - | |
96 | | - | |
97 | | - | |
98 | | - | |
99 | | - | |
100 | | - | |
101 | | - | |
102 | | - | |
| 95 | + | |
| 96 | + | |
103 | 97 | | |
104 | 98 | | |
105 | | - | |
| 99 | + | |
106 | 100 | | |
107 | 101 | | |
108 | 102 | | |
109 | | - | |
| 103 | + | |
110 | 104 | | |
111 | 105 | | |
112 | 106 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
298 | 298 | | |
299 | 299 | | |
300 | 300 | | |
| 301 | + | |
301 | 302 | | |
302 | | - | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
303 | 306 | | |
304 | | - | |
| 307 | + | |
| 308 | + | |
| 309 | + | |
305 | 310 | | |
306 | | - | |
307 | | - | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
308 | 314 | | |
309 | 315 | | |
310 | 316 | | |
| |||
314 | 320 | | |
315 | 321 | | |
316 | 322 | | |
317 | | - | |
318 | | - | |
319 | | - | |
320 | 323 | | |
321 | | - | |
322 | | - | |
323 | | - | |
324 | | - | |
325 | | - | |
326 | | - | |
327 | | - | |
328 | | - | |
329 | | - | |
| 324 | + | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
330 | 330 | | |
331 | 331 | | |
332 | 332 | | |
| |||
336 | 336 | | |
337 | 337 | | |
338 | 338 | | |
339 | | - | |
| 339 | + | |
340 | 340 | | |
341 | 341 | | |
342 | 342 | | |
343 | | - | |
344 | | - | |
345 | | - | |
346 | | - | |
347 | | - | |
348 | | - | |
| 343 | + | |
| 344 | + | |
349 | 345 | | |
350 | 346 | | |
351 | | - | |
352 | | - | |
| 347 | + | |
353 | 348 | | |
354 | 349 | | |
355 | | - | |
| 350 | + | |
356 | 351 | | |
357 | | - | |
358 | | - | |
| 352 | + | |
| 353 | + | |
359 | 354 | | |
360 | 355 | | |
361 | | - | |
| 356 | + | |
362 | 357 | | |
363 | 358 | | |
364 | 359 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
281 | 281 | | |
282 | 282 | | |
283 | 283 | | |
| 284 | + | |
284 | 285 | | |
285 | | - | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
286 | 289 | | |
287 | | - | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
288 | 293 | | |
289 | | - | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
290 | 297 | | |
291 | 298 | | |
292 | 299 | | |
| |||
296 | 303 | | |
297 | 304 | | |
298 | 305 | | |
299 | | - | |
300 | | - | |
301 | | - | |
302 | 306 | | |
303 | | - | |
304 | | - | |
305 | | - | |
306 | | - | |
307 | | - | |
308 | | - | |
309 | | - | |
310 | | - | |
311 | | - | |
| 307 | + | |
| 308 | + | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
312 | 313 | | |
313 | 314 | | |
314 | 315 | | |
| |||
318 | 319 | | |
319 | 320 | | |
320 | 321 | | |
321 | | - | |
| 322 | + | |
322 | 323 | | |
323 | 324 | | |
324 | 325 | | |
325 | | - | |
326 | | - | |
327 | | - | |
328 | | - | |
329 | | - | |
330 | | - | |
| 326 | + | |
| 327 | + | |
331 | 328 | | |
332 | 329 | | |
333 | | - | |
334 | | - | |
| 330 | + | |
335 | 331 | | |
336 | 332 | | |
337 | | - | |
| 333 | + | |
338 | 334 | | |
339 | | - | |
340 | | - | |
| 335 | + | |
| 336 | + | |
341 | 337 | | |
342 | 338 | | |
343 | | - | |
| 339 | + | |
344 | 340 | | |
345 | 341 | | |
346 | 342 | | |
| |||
0 commit comments