[WIP] Allow TriangularRFP for diagonal blocks of L [ci skip] #821
base: main
Some timings on my laptop:
(jl_6i3bqi) pkg> st
Status `/private/var/folders/d9/t94kd5m50rl1r3bchncn62sm0000gn/T/jl_6i3bqi/Project.toml`
[ff71e718] MixedModels v4.34.0
julia> dat = MixedModels.dataset(:insteval)
Arrow.Table with 73421 rows, 7 columns, and schema:
:s String
:d String
:dept String
:studage String
:lectage String
:service String
:y Int8
julia> f = @formula(y ~ 1 + service + (1|s) + (1|d) + (1|dept));
julia> m1 = fit(MixedModel, f, dat; progress=false);
julia> @be objective(updateL!($m1))
Benchmark: 20 samples with 1 evaluation
min 5.143 ms (38 allocs: 1.094 KiB)
median 5.172 ms (38 allocs: 1.094 KiB)
mean 5.176 ms (38 allocs: 1.094 KiB)
max 5.270 ms (38 allocs: 1.094 KiB)
# using the db/RFP branch
julia> m1 = fit(MixedModel, f, dat; progress=false);
julia> BlockDescription(m1)
rows:     s          d          dept       fixed
2972:  Diagonal
1128:   Sparse   Diag/TrRFP
  14:   Dense   Sparse/Dense  Diag/Dense
   3:   Dense      Dense        Dense      Dense
julia> @be objective(updateL!($m1))
Benchmark: 4 samples with 1 evaluation
26.432 ms (31 allocs: 768 bytes)
26.525 ms (31 allocs: 768 bytes)
26.669 ms (31 allocs: 768 bytes)
26.937 ms (31 allocs: 768 bytes)
julia> m2 = LinearMixedModel(f, dat; RFPthreshold=2000);
julia> BlockDescription(m2)
rows:     s          d          dept       fixed
2972:  Diagonal
1128:   Sparse   Diag/Dense
  14:   Dense   Sparse/Dense  Diag/Dense
   3:   Dense      Dense        Dense      Dense
julia> @be objective(updateL!($m2))
Benchmark: 20 samples with 1 evaluation
min 5.144 ms (28 allocs: 656 bytes)
median 5.173 ms (28 allocs: 656 bytes)
mean 5.173 ms (28 allocs: 656 bytes)
max 5.256 ms (28 allocs: 656 bytes)
There may be a way of cutting down on the cost of the […] The position of the elements in the lower right triangle of […]
Does this seem worth trying, @palday @ajinkya-k? I am willing to work on it and feel that it could be done relatively quickly, but experience suggests that my "relatively quickly" may not be as quick as I imagine.
Seems worth trying for sure.
I agree it's worth trying!
Sorry, I totally misunderstood the code (https://github.com/JuliaLinearAlgebra/RectangularFullPacked.jl/blob/b159bfabd981c46b8842ff1ad857aa95f6b52cad/src/hermitian.jl#L43). The `setindex!` slowness is only for the case where the argument […]
@ajinkya-k I should have been more clear that I was speaking of the case where L[2,1] is sparse. Using RFP for L[2,2] only makes sense if L[2,2] is very large and, by design, L[2,1] is even larger (because L[1,1] is as large as possible). If L[2,1] needs to be stored as a dense matrix you have already got a much bigger problem to contend with.
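To put rough numbers on that trade-off, here is a small back-of-the-envelope sketch using the 1128-row L[2,2] block and the minimum `updateL!` timings reported above. It assumes `Float64` elements (not stated in this thread) and counts only the block's own storage, ignoring any wrapper overhead.

```julia
n = 1128                          # size of the L[2,2] block in the insteval example
dense_elems = n^2                 # full square storage: 1_272_384 elements
rfp_elems = n * (n + 1) ÷ 2       # packed triangle: 636_756 elements

# Assumption: Float64 elements, 8 bytes each
bytes_saved = (dense_elems - rfp_elems) * sizeof(Float64)

# Minimum updateL! times from the benchmarks above: RFP branch vs. dense block
time_ratio = 26.432 / 5.143       # roughly a fivefold slowdown
```

So for this model the packed format roughly halves the storage of one block at the cost of about a fivefold slowdown in `updateL!`, which is why it only pays off when the block is very large.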
No, I think that was clear, but I twisted myself into knots trying to read the file changes from the PR.
Codecov Report
Attention: Patch coverage is […]
Additional details and impacted files
@@ Coverage Diff @@
## main #821 +/- ##
==========================================
- Coverage 97.33% 96.73% -0.60%
==========================================
Files 36 36
Lines 3495 3588 +93
==========================================
+ Hits 3402 3471 +69
- Misses 93 117 +24
So I have a question about strategy. The bottleneck when using `TriangularRFP` for (usually) the [2,2] block of L, which implies that the [2,1] block will be a `SparseMatrixCSC`, is the […] When it is the lower triangle of […] I did a calculation of the number of updates in this […]
However, if we reorder the rows of […]
we get about 95% non-flipped, and that certainly seems worth it. So, at what point should the levels of the second grouping factor be reordered? I think it will be rare to need to use RFP but, if L[2,1] is a […] Is this explanation sufficiently intelligible to be able to offer an opinion?
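For readers unfamiliar with what "flipped" could mean here, the following is a minimal sketch of the standard LAPACK rectangular-full-packed layout for a lower triangle of even order `n` (non-transposed variant). It is an assumption that this matches the layout relevant to this PR; the function names `rfp_position` and `nonflipped_fraction` are hypothetical and not part of RectangularFullPacked.jl or MixedModels.jl. In this layout the first `n ÷ 2` columns of the triangle are stored as they are, while the lower-right triangle is stored transposed.

```julia
# Map a 1-based index (i, j), i >= j, of an n×n lower triangle (n even) to its
# position in the (n + 1) × k RFP array, where k = n ÷ 2. The Bool reports
# whether the element is stored transposed ("flipped").
function rfp_position(i::Integer, j::Integer, n::Integer)
    iseven(n) || throw(ArgumentError("this sketch only handles even n"))
    1 <= j <= i <= n || throw(ArgumentError("need 1 <= j <= i <= n"))
    k = n ÷ 2
    return j <= k ? (i + 1, j, false) : (j - k, i - k, true)
end

# Fraction of the given (i, j) updates that land in the non-flipped part.
function nonflipped_fraction(updates, n::Integer)
    return count(((i, j),) -> !rfp_position(i, j, n)[3], updates) / length(updates)
end

n = 6
all_pairs = [(i, j) for j in 1:n for i in j:n]          # whole lower triangle
positions = [rfp_position(i, j, n)[1:2] for (i, j) in all_pairs]
unique_slots = allunique(positions)                      # each element has its own slot
frac = nonflipped_fraction(all_pairs, n)                 # 15 of the 21 entries
```

If updates hit every element equally often, only about three quarters of them are non-flipped for large `n`. Ordering the levels so that most updates touch the leading columns is what would push that share higher.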
@dmbates I think that change makes sense. I also wonder if this would slightly speed things up for the non-RFP case; I've certainly seen this type of optimization have big impacts in other code. If I'm thinking about this correctly, ordering by row sums works out to ordering by something like "strength of crossing", right? We don't make any guarantees about the internal ordering of levels, so that's not an issue.
For the time being I will do the simplified indexing part and not do the reordering of levels part. However, we can recommend reordering the levels before creating the model.
L[diagind] = LowerTriangular(Ldi)
end
end
return identity.(A), identity.(L) # does anyone remember what the `identity` is for?
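One plausible answer to the question in that code comment, offered as an assumption rather than something confirmed in this thread: broadcasting `identity` over a container with an abstract element type returns a new container whose element type is as narrow as the contents allow. A minimal illustration with plain vectors:

```julia
v = Any[1, 2, 3]           # element type is Any, although every entry is an Int
w = identity.(v)           # broadcasting re-infers the element type from the values

widened = eltype(v)        # Any
narrowed = eltype(w)       # Int

mixed = identity.(Any[1, 2.0])   # mixed contents narrow only to a common supertype
```

If `A` and `L` are built up as loosely typed vectors of blocks, the same trick would tighten their element types before they are returned.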
I wrote the enclosed to do the reordering of levels in a […]
using CategoricalArrays, DataFrames, MixedModels

"""
    relevel_by_incidence!(v::CategoricalArray; rev=false)

Return `v` with the levels ordered by increasing (decreasing when `rev=true`) incidence.
"""
function relevel_by_incidence!(v::CategoricalArray; rev=false)
    ra = refarray(droplevels!(v))
    counts = zeros(Int, maximum(ra))
    for i in ra
        counts[i] += 1
    end
    levels!(v, levels(v)[sortperm(counts; rev)])
    return v
end

dat = DataFrame(MixedModels.dataset(:insteval))
dat.d = relevel_by_incidence!(categorical(dat.d); rev=true)
m = LinearMixedModel(@formula(y ~ 1 + service + (1|s) + (1|d) + zerocorr(1+service|dept)), dat)
m.A[3].diag
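The same counting-and-sorting idea can be tried on a plain vector without any packages. This is only a sketch of the ordering step; `levels_by_incidence` is a hypothetical helper, not part of CategoricalArrays.jl or MixedModels.jl.

```julia
# Return the distinct values of `v`, ordered by how often each occurs
# (least frequent first, or most frequent first when `rev=true`).
function levels_by_incidence(v::AbstractVector; rev::Bool=false)
    lev = sort(unique(v))                        # start from a deterministic order
    counts = [count(==(l), v) for l in lev]      # occurrences of each level
    return lev[sortperm(counts; rev)]
end

ordered = levels_by_incidence(["b", "a", "b", "c", "b", "a"]; rev=true)
# ordered == ["b", "a", "c"]: "b" occurs 3 times, "a" twice, "c" once
```

With `rev=true` the most frequently observed levels come first, which is the ordering used for `dat.d` in the snippet above.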
[…] `TriangularRFP` from https://github.com/JuliaLinearAlgebra/RectangularFullPacked.jl
[…] `Hermitian` and stored in the lower triangle (`copyscaleinflate!` and `rankUpdate!`) and when it is being lower triangular (after `cholUnblocked!` and in the `rdiv!` call).
[…] `Matrix` objects for the dense case but that can trip you up.
[…] `TriangularRFP`. This is more of a slowdown in the `rankUpdate!` method, which involves many calls to `setindex!`, rather than the `cholUnblocked!` method.
[…] `setindex!` because of the way the loops in the `rankUpdate!` work.
[…] `RFPthreshold`, in the `LinearMixedModel` constructor (but not yet in `MixedModel`).
[…] `[ci skip]` designation because I haven't updated all the tests to the more stringent requirements on `Hermitian` and `Triangular` types.
[…] `rankUpdate!` docstring #813
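The distinction drawn above, between a block treated as Hermitian with only its lower triangle stored and the same block treated as lower triangular after factorization, can be illustrated with the `LinearAlgebra` standard library on a plain `Matrix`. This is a sketch of the general idea only; it uses `cholesky!` as a stand-in and does not reproduce the `cholUnblocked!` or `rankUpdate!` methods in MixedModels.jl.

```julia
using LinearAlgebra

A = [4.0 0.0; 2.0 5.0]        # only the lower triangle is meaningful
H = Hermitian(A, :L)          # view A as Hermitian, reading the lower triangle
Hcopy = Matrix(H)             # [4 2; 2 5], saved for the check below

cholesky!(H)                  # overwrites the lower triangle of A with the factor
L = LowerTriangular(A)        # same storage, now interpreted as lower triangular

ok = L * L' ≈ Hcopy           # the factor reproduces the original matrix
```

The storage never changes hands; only the wrapper type says how to read it. Requiring explicit `Hermitian` and `LowerTriangular` wrappers makes that interpretation visible in method signatures instead of leaving it implicit in a bare `Matrix`.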