Commit d7351b3
[compiler] Refactor BlockMatrix sparsity representations in type and lowering (#15163)
## Change Description
The bulk of the changes are in the files `BlockMatrixType.scala`,
`MatrixSparsity.scala`, and `LowerBlockMatrixIR.scala`. I suggest
starting with those, using the summaries below as a guide.
I acknowledge this is a big change containing a lot of non-trivial
logic. Please do ask for explanations and clarifications.
### BlockMatrixType and MatrixSparsity
`BlockMatrixType` has a `sparsity`, of type `BlockMatrixSparsity`, which
is either dense or a set of present blocks. In this PR I've moved the
sparsity representation to `MatrixSparsity`, which is a generic encoding
of the sparsity pattern of a sparse matrix. For a `BlockMatrixType`, the
sparsity will be a `nBlockRows` by `nBlockCols` `MatrixSparsity`.
Besides moving `BlockMatrixSparsity` to `MatrixSparsity`, and removing
references to blocks, the encoding of the sparsity is also streamlined.
Before, `BlockMatrixSparsity` held an array of coordinates of present
blocks, in an arbitrary order. Methods that needed the present blocks in a
particular order (usually column-major) had to sort them, and we
also built a `Set` of present blocks to handle the `isPresent` query.
Now, `MatrixSparsity` (in the sparse case) enforces that the array of
present blocks is always in column-major order. This simplifies some of
the logic, and lets us handle `isPresent` with binary search. Unions and
intersections of sparsity patterns also take advantage of the ordering.
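To make the ordering argument concrete, here is a minimal sketch of a column-major-sorted coordinate list with a binary-search `isPresent` and merge-based union/intersection. `SparsePattern` and all of its members are hypothetical names for illustration, not the actual `MatrixSparsity` API.

```scala
// Hypothetical, simplified stand-in for the sparse case of MatrixSparsity.
// coords holds (row, col) pairs, strictly sorted in column-major order.
final case class SparsePattern(nRows: Int, nCols: Int, coords: IndexedSeq[(Int, Int)]) {
  // Column-major linear index of a coordinate; used as the sort key.
  private def key(c: (Int, Int)): Long = c._2.toLong * nRows + c._1

  require(
    coords.forall { case (i, j) => i >= 0 && i < nRows && j >= 0 && j < nCols },
    "coordinate out of bounds",
  )
  require(
    coords.indices.drop(1).forall(k => key(coords(k - 1)) < key(coords(k))),
    "coords must be strictly sorted in column-major order",
  )

  // Binary search over the sorted coordinate list.
  def isPresent(i: Int, j: Int): Boolean = {
    val target = key((i, j))
    var lo = 0
    var hi = coords.length - 1
    var found = false
    while (!found && lo <= hi) {
      val mid = (lo + hi) >>> 1
      val k = key(coords(mid))
      if (k == target) found = true
      else if (k < target) lo = mid + 1
      else hi = mid - 1
    }
    found
  }

  // Linear merge of two sorted lists. keepUnmatched = true gives the union,
  // keepUnmatched = false gives the intersection.
  private def merge(that: SparsePattern, keepUnmatched: Boolean): SparsePattern = {
    require(nRows == that.nRows && nCols == that.nCols, "dimension mismatch")
    val out = IndexedSeq.newBuilder[(Int, Int)]
    var a = 0
    var b = 0
    while (a < coords.length && b < that.coords.length) {
      val ka = key(coords(a))
      val kb = key(that.coords(b))
      if (ka < kb) { if (keepUnmatched) out += coords(a); a += 1 }
      else if (kb < ka) { if (keepUnmatched) out += that.coords(b); b += 1 }
      else { out += coords(a); a += 1; b += 1 }
    }
    if (keepUnmatched) {
      out ++= coords.drop(a)
      out ++= that.coords.drop(b)
    }
    SparsePattern(nRows, nCols, out.result())
  }

  def union(that: SparsePattern): SparsePattern = merge(that, keepUnmatched = true)
  def intersect(that: SparsePattern): SparsePattern = merge(that, keepUnmatched = false)
}
```

Because both inputs are already sorted, the merge runs in time linear in the number of present blocks and its output is again sorted, so no separate `Set` or re-sort is needed.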
In addition, `MatrixSparsity` now knows its dimensions, which
`BlockMatrixSparsity` did not.
I've also gotten rid of all the complicated methods on the CSC
(compressed sparse column) encoding (e.g. `transposeCSCSparsity`).
Instead, I do all the transformations on the simple coordinate list
encoding, and convert to CSC as late as possible.
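As a rough illustration of why the late conversion is cheap, the sketch below turns a column-major-sorted coordinate list into a CSC encoding in a single pass. The `CSC` case class, its field names, and `toCSC` are assumptions made for this example and are not taken from the PR.

```scala
// Hypothetical CSC encoding: colPtr has nCols + 1 entries, and the row
// indices of column j are rowIdx(colPtr(j)) until rowIdx(colPtr(j + 1)).
final case class CSC(colPtr: Array[Int], rowIdx: Array[Int])

// Assumes coords are (row, col) pairs already sorted in column-major order.
def toCSC(nCols: Int, coords: IndexedSeq[(Int, Int)]): CSC = {
  val colPtr = new Array[Int](nCols + 1)
  // Count the entries in each column, stored one slot to the right.
  coords.foreach { case (_, j) => colPtr(j + 1) += 1 }
  // A prefix sum turns the counts into column start offsets.
  for (j <- 0 until nCols) colPtr(j + 1) += colPtr(j)
  // Since coords are column-major sorted, the row indices are already in CSC order.
  CSC(colPtr, coords.map(_._1).toArray)
}
```

With the ordering invariant in place, the conversion needs no sorting, which is what makes it reasonable to keep all transformations on the coordinate list and convert only at the end.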
### LowerBlockMatrixIR
The changes in this file are mostly about a redesign of the
`BMSContexts` class. `BMSContexts` bundles together a runtime array of
context values (similar to the contexts for a `TableStage`), with a
representation of the sparsity pattern, which conceptually maps each
context value to the coordinates of its block.
Before, the sparsity pattern was also encoded by runtime values. In the
sparse case, this used runtime arrays `rowPos` and `rowIdx` using a CSC
encoding. Working with this encoding created a lot of non-trivial
runtime logic. But sparsity is completely known at compile time, so all
this runtime logic was unnecessary.
There are still cases where we need to work with a CSC encoding of the
sparsity at runtime. These cases are now handled by the class
`DynamicBMSContexts`, which is essentially the same as the old
`BMSContexts`, but now has a minimal interface of three methods.
Importantly, these methods are only about consuming a block matrix, not
transforming to a new one (like transpose).
The new `BMSContexts` now pairs a statically known `MatrixSparsity` with
a `DynamicBMSContexts`. The big change from before is that all methods
that produce a new `BMSContexts` now handle all the sparsity logic at
compile time, and simply embed the new sparsity in the IR using
literals. This typically requires us to also embed an array of ints
mapping from old to new positions in the contexts array, which we use at
runtime to reorder the contexts as needed.
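The idea of computing the new sparsity and a position mapping at compile time can be sketched for transpose as follows. This is an illustration only: the function names are hypothetical, and the mapping is stored here in "gather" form (for each new position, the old position to read from), which is an assumption about direction rather than something the PR description specifies.

```scala
// Compile-time step: given column-major-sorted (row, col) coordinates,
// return the transposed coordinates (again column-major sorted) together
// with newToOld, where newToOld(p) is the old position of the block that
// ends up at new position p.
def transposeWithPermutation(
  coords: IndexedSeq[(Int, Int)]
): (IndexedSeq[(Int, Int)], IndexedSeq[Int]) = {
  val sorted = coords.zipWithIndex
    .map { case ((i, j), oldPos) => ((j, i), oldPos) } // swap row and column
    .sortBy { case ((r, c), _) => (c, r) }             // column-major: by column, then row
  (sorted.map(_._1), sorted.map(_._2))
}

// Runtime step: reorder the contexts array using the embedded int array.
def reorder[A](contexts: IndexedSeq[A], newToOld: IndexedSeq[Int]): IndexedSeq[A] =
  newToOld.map(contexts)

// Usage: a 2 x 3 pattern with present blocks (0,0), (1,0), (0,2).
val (newCoords, newToOld) = transposeWithPermutation(Vector((0, 0), (1, 0), (0, 2)))
val reordered = reorder(Vector("a", "b", "c"), newToOld)
```

In the real lowering, the permutation would be embedded in the IR as a literal and applied to the runtime contexts array. The only runtime work left is the reordering itself.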
## Security Assessment
- This change cannot impact the Hail Batch instance as deployed by Broad
Institute in GCP

Parent commit: 0e29386
File tree: 17 files changed (+1150, -958 lines)

- hail
  - hail
    - src/is/hail
      - expr/ir
        - functions
        - lowering
      - linalg
      - types
        - physical
        - virtual
      - utils/richUtils
    - test/src/is/hail
      - expr/ir
      - linalg
  - python/test/hail/methods