diff --git a/blockedcholesky/.gitignore b/blockedcholesky/.gitignore new file mode 100644 index 0000000..ec33b16 --- /dev/null +++ b/blockedcholesky/.gitignore @@ -0,0 +1,4 @@ +/.luarc.json +settings.json +Manifest.toml +Manifest-v*.toml diff --git a/blockedcholesky/BlockedCholeskyMM.qmd b/blockedcholesky/BlockedCholeskyMM.qmd index 632f3c3..04bfb37 100644 --- a/blockedcholesky/BlockedCholeskyMM.qmd +++ b/blockedcholesky/BlockedCholeskyMM.qmd @@ -146,7 +146,7 @@ According to @eq-relcovfac, $\mbfSigma$ depends on both $\sigma$ and $\theta$, a However, we will blur that distinction and continue to write $\text{Var}(\mcB)=\mbfSigma_\theta$. Another technicality is that the *common scale parameter*, $\sigma$, could, in theory, be zero. -However, the only way for its estimate, $\widehat{\sigma}$, to be zero is for the fitted values from the fixed-effects only, $\mbfX\widehat{\mbfbeta}$, to be exactly equal to the observed data. +However, the only way for its estimate, $\widehat{\sigma}$, to be zero is for the fitted values from the fixed-effects only, $\mbfX\widehat{\mbfbeta}$, to be exactly equal to the observed response vector, $\mbfy$. This occurs only with data that have been (incorrectly) simulated without error. In practice, we can safely assume that $\sigma>0$. However, the estimated $\mbfLambda_\theta$, like $\mbfSigma_\theta$, can be singular. @@ -346,7 +346,7 @@ and taking the logarithm provides the estimate of $\sigma^2$, given $\mbftheta$, as $$ -\widehat{\sigma^2}=\frac{r_yy^2}{n}\ , +\widehat{\sigma^2}=\frac{r_{yy}^2}{n}\ , $$ {#eq-sigma-hat} which gives the *profiled log-likelihood*, @@ -359,7 +359,7 @@ n\left(1+\log\left(\frac{2\pi r_{yy}^2(\mbftheta)}{n}\right)\right)\ . $$ {#eq-profiled-log-likelihood} One of the interesting aspects of this formulation is that it is not necessary to solve for the conditional estimate of $\mbfbeta$ or the conditional modes of the random effects when evaluating the log-likelihood. -The two values needed for the log-likelihood evaluation, $2\log(|\mbfR_{ZZ}|)$ and $r_\mathbf{yy}^2$, are obtained directly from the diagonal elements of the Cholesky factor. +The two values needed for the log-likelihood evaluation, $2\log(|\mbfR_{ZZ}|)$ and $r_{yy}^2$, are obtained directly from the diagonal elements of the Cholesky factor. Furthermore, $\mbfOmega_{\theta}$ and, from that, the Cholesky factor, $\mbfR_{\theta}$, and the objective to be optimized can be evaluated for a given value of $\mbftheta$ from the Gram matrix of the columns of $\mbfZ$, $\mbfX$ and $\mbfy$, @@ -373,8 +373,8 @@ $$ {#eq-A} and $\mbfLambda_{\theta}$. -In [MixedModels.jl]{.pkg} the `LinearMixedModel` struct contains a blocked representation of $\mathbf{A}$ in the `A` field and a similarly structured lower-triangular blocked array in the `L` field. -Evaluation of the objective simply involves updating a representation of the relative covariance factor $\mbfLambda_\theta$ from $\mbftheta$, forming $\mbfOmega_\theta$ (@eq-bigCholfac) from $\mbfA$ and $\mbfLambda_\theta$ then evaluating its lower Cholesky factor $\mbfL_\theta$. +In [MixedModels.jl]{.pkg} the `LinearMixedModel` struct contains a blocked representation of the lower triangle of the symmetric matrix, $\mathbf{A}$, in the `A` field and a similarly structured lower-triangular blocked array in the `L` field. 
+Evaluation of the objective simply involves updating a representation of the relative covariance factor $\mbfLambda_\theta$ from $\mbftheta$, forming the lower triangle of the symmetric matrix $\mbfOmega_\theta$ (@eq-bigCholfac) from $\mbfA$ and $\mbfLambda_\theta$, then evaluating its lower Cholesky factor $\mbfL_\theta$.
 In @eq-A the blocking structure is according to all the random effects, followed by the fixed-effects followed by the response.
 However, as described in @bates.maechler.etal:2015[Section 3], the structure of $\mbfZ$, $\mbfZ^{\top}\mbfZ$ and $\mbfLambda_\theta$ is further subdivided into blocks according to the random-effects terms in the model formula, which we cover in the next section.
@@ -534,7 +534,7 @@ In practice the response vector, $\mbfy$, is concatenated (`hcat`) to the fixed
 In the block structure summary if a block of $\mbfL$ and its corresponding block of $\mbfA$ have the same structure, the name of that structure is given once.
 This is the case for all of the blocks except the [2,2] block, which is diagonal in $\mbfA$ but dense in $\mbfL$, the [2,3] block, which is sparse in $\mbfA$ but dense in $\mbfL$, and the [3,3] block, which is block-diagonal in $\mbfA$ and dense in $\mbfL$.
-As described in @bates.maechler.etal:2015[Section 3] the relative covariance factor, $\mbfLambda_\theta$ is block diagonal with the same blocking structure as the random effects parts of $\mbfA$ or $\mbfL$.
+As described in @bates.maechler.etal:2015[Section 3] the relative covariance factor, $\mbfLambda_\theta$, is block diagonal with the same blocking structure as the random effects parts of $\mbfA$ or $\mbfL$.
 The $i$th diagonal block, $i=1,\dots,k$, consists of $q_i$ repetitions on the diagonal of a $\ell_i\times\ell_i$ lower-triangular template matrix $\mbfT_i$.
 For a scalar random effects term (i.e. $\ell_i=1$) its block in $\mbfLambda_\theta$ is a multiple of the $q_i\times q_i$ identity matrix.
@@ -553,10 +553,72 @@ which maps to the template matrices
 m1.λ
 ```
+The full $\mbfLambda_\theta$ matrix, of size $4128\times 4128$, is block diagonal with three diagonal blocks,
+corresponding to the three grouping factors for the random effects: `s`, `d`, and `dept`.
+Because the first two matrices in `m1.λ` are of size $1\times 1$, the first two diagonal blocks will be $\theta_1\,\mbfI_{2972}$ and $\theta_2\,\mbfI_{1128}$.
+The third diagonal block of $\mbfLambda_\theta$ is also a diagonal matrix, because
+
+```{julia}
+last(m1.λ)
+```
+
+is diagonal.
+This third diagonal block, of size $28\times 28$, consists of 14 repetitions of this $2\times 2$ template along its diagonal.
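+
+To make this structure concrete, $\mbfLambda_\theta$ for `m1` can be assembled explicitly as a sparse matrix.
+This is only an illustrative sketch with made-up values for $\mbftheta$ (a fitted model would supply `m1.θ`); [MixedModels.jl]{.pkg} does not materialize $\mbfLambda_\theta$ in this form.
+
+```julia
+using LinearAlgebra, SparseArrays
+
+θ = [0.28, 0.44, 0.04, 0.13]  # hypothetical values; use m1.θ from a fitted model
+
+Λ = blockdiag(
+    sparse(θ[1] * I, 2972, 2972),          # scalar term (1 | s)
+    sparse(θ[2] * I, 1128, 1128),          # scalar term (1 | d)
+    kron(sparse(I, 14, 14),                # 14 repetitions of the
+         sparse(Diagonal(θ[3:4]))),        # 2×2 diagonal `dept` template
+)
+size(Λ)  # (4128, 4128)
+```
+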
+### Constrained versus unconstrained optimization of the objective
+
+Because $\mbfLambda_\theta$ is a scaled version of the lower Cholesky factor of the covariance matrix, $\mbfSigma_\theta$, of the 4128-dimensional random-effects vector $\mcB$, it is customary to constrain its diagonal elements to be non-negative, which, in the case of model `m1`, is equivalent to requiring that all four elements of $\mbftheta$ be non-negative.
+Thus the optimization of the objective with respect to $\mbftheta$ in [lme4]{.pkg} and in early versions of [MixedModels.jl]{.pkg} was performed using a version of Powell's BOBYQA (Bound Optimization BY Quadratic Approximation) algorithm (@powell2009bobyqa).
+
+However, it is not necessary to impose the "non-negativity on the diagonal" constraint on $\mbftheta$.
+The objective is well-defined for negative values of components of $\mbftheta$ that occur on the diagonal of $\mbfLambda_\theta$.
+Flipping the sign of one of these "diagonal" elements of $\mbftheta$, e.g.
+
+```{julia}
+(objective(m1), objective!(m1, m1.θ .* [1, 1, -1, 1]))
+```
+
+produces the same value of the objective.
+
+If some of the elements of $\mbftheta$ occur off the diagonal of $\mbfLambda_\theta$, as in the model
+
+```{julia}
+form2 = @formula(y ~ 1 + service + (1|s) + (1|d) + (1 + service|dept));
+m2 = fit(MixedModel, form2, dat; progress=false);
+print(m2)
+```
+
+which has three parameters in the last $\lambda$ matrix
+
+```{julia}
+last(m2.λ)
+```
+
+then flipping the sign of a diagonal element requires flipping the signs of all the parameters in that column to maintain the same value of the objective function.
+
+```{julia}
+(objective(m2), objective!(m2, m2.θ .* [1, 1, -1, -1, 1]))
+```
+
+Because an unconstrained optimization is generally simpler than a constrained or bounded optimization, we have removed the bounds on the elements of $\mbftheta$ in recent versions of [MixedModels.jl]{.pkg}.
+In the event that the optimizer converges to a parameter vector with a negative diagonal element in one of the $\lambda$ template matrices, we flip the sign of that element, and of any other elements of $\mbftheta$ in the same column, after the optimization.
+Thus we create a "canonical" form of the converged parameter vector, $\hat{\mbftheta}$, corresponding to a $\mbfLambda_\theta$ with non-negative values on the diagonal, without the burden of performing a bounded or constrained optimization.
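+
+A simplified, stand-alone sketch of this canonicalization, operating on plain lower-triangular template matrices rather than on the internal representation in [MixedModels.jl]{.pkg}, might look as follows (the function name and the toy template are ours, not part of the package API):
+
+```julia
+# For each template, flip the sign of any column whose diagonal element
+# is negative; the product T * T' is unchanged by such a flip.
+function canonicalize!(templates::AbstractVector{<:AbstractMatrix})
+    for T in templates
+        for j in axes(T, 2)
+            if T[j, j] < 0
+                T[:, j] .*= -1
+            end
+        end
+    end
+    return templates
+end
+
+T = [-0.5 0.0; -0.1 0.3]  # toy template with a negative diagonal element
+canonicalize!([T])
+T                         # [0.5 0.0; 0.1 0.3]: same T * T', non-negative diagonal
+```
+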
+### Analysis of deviance for comparing nested models
+
+The model `m1` is a sub-model of `m2`, corresponding to the constraint $\theta_4 = 0$ in model `m2`.
+Such models can be compared via a likelihood ratio test
+
+```{julia}
+lrtest(m1, m2)
+```
+
+We mention this here because it provides a guideline on how precisely we wish to determine the value of the objective in the optimization, as discussed in @sec-reproducibility.
+
 ### Special role of the [1,1] block {#sec-one-one-block}
 
 The reason for ordering the random-effects terms so that the [1,1] block of $\mbfA$ is as large as possible is because the diagonal or block diagonal structure of this block is preserved in the [1,1] block of $\mbfL$.
-To illustrate the importance of choosing the first block of the augmented matrix $\mbfA$, we fit the same model with (a) the [1,1] block corresponding to the random effect term for the instructor (`1 | d`) with 1128 levels, and (b) the [1,1] block corresponding to the random effect term for students (`(1 | s)` with 2972 levels).
+To illustrate the importance of choosing the first block of the augmented matrix $\mbfA$, we fit the same model with (a) the [1,1] block corresponding to the random effect term for the instructor (`(1 | d)` with 1128 levels), and (b) the [1,1] block corresponding to the random effect term for students (`(1 | s)` with 2972 levels).
 
 ```{julia}
@@ -689,10 +751,10 @@ CairoMakie.activate!(; type="pdf");
 ![Sparsity pattern of the lower triangular Cholesky factor when the order of the random effects is not carefully chosen (left) and when the order is carefully chosen (right).](lmm_side_by_s.png){#fig-blk width=100% fig-pos='!h'}
 -->
-@fig-a shows the non-zero elements of the augmented matrix, $\mbfA$, when (a) the first block corresponsds to the random effect terms for the instructor (`(1 | d)` with 1128 levels), and (b) the first block corresponds to the student (`(1 | s)` with 2972 levels), and @fig-blk shows the non-zero elements for the corresponding lower-triangular Cholesky factor.
-When performing a block-wise Cholesky factorization, the diagonal structure of the $[1,1]$ block from the input matrix is preserved in the Cholesky factor, but the diagonal structure for the subsequent diagonal blocks (e.g [2,2]) is not preserved.
+@fig-a shows the non-zero elements of the augmented matrix, $\mbfA$, when (a) the first block corresponds to the random effect term for the instructor (`(1 | d)` with 1128 levels), and (b) the first block corresponds to the student (`(1 | s)` with 2972 levels), and @fig-blk shows the non-zero elements for the corresponding lower-triangular Cholesky factor.
+When performing a block-wise Cholesky factorization, the diagonal structure of the $[1,1]$ block from the input matrix is preserved in the Cholesky factor, but the diagonal structure for the subsequent diagonal blocks (e.g. [2,2]) is generally not preserved.
 While the models are theoretically identical, the latter is computationally more efficient as the largest diagonal block preserves its structure.
-In fact, for the `insteval` dataset, the former order leads to approximately 4.5 million non-zero entries in the Cholesky factor, whereas choosing the random effect with the largest number of levels leads to a Cholesky factor with approximately 775,000 non-zero elements, which is an order of magnitude lower.
+In fact, for the `insteval` dataset, the former order leads to approximately 4.5 million non-zero entries in the Cholesky factor, whereas choosing the first random effect to have the largest number of levels leads to a Cholesky factor with approximately 775,000 non-zero elements, which is nearly an order of magnitude lower.
 This translates to an order of magnitude difference between the runtimes for fitting the two versions of the model.
 We demonstrate this with the `insteval` dataset in Appendix -@sec-app-re where fitting the model with the two blocks swapped is almost twenty times slower.
@@ -700,6 +762,36 @@ Because the [1,1] block of $\mbfL$ is diagonal and the [1,1] block of $\mbfLambd
 For example, the [2,1] block of $\mbfA$ is stored as a sparse matrix because only about 2% (73421) of the $2972\times1128=3352416$ elements of that matrix are non-zero.
 The [2,1] block of $\mbfL$ has exactly the same sparsity structure because each element of that block of $\mbfL$ is a scalar multiple of the corresponding element of that block of $\mbfA$.
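+
+This preservation of sparsity is easy to see in a toy calculation: when the [1,1] block of the factor is diagonal, computing the [2,1] block of the factor amounts to rescaling the columns of the corresponding block of the input, which cannot create or destroy non-zeros.
+The blocks and dimensions below are made up for illustration and are unrelated to the `insteval` model.
+
+```julia
+using LinearAlgebra, SparseArrays
+
+Ω11 = Diagonal(rand(6) .+ 1)      # positive diagonal [1,1] block
+Ω21 = sprand(4, 6, 0.3)           # sparse [2,1] block
+
+L11 = Diagonal(sqrt.(diag(Ω11)))  # Cholesky factor of a diagonal block is diagonal
+L21 = Ω21 * inv(L11')             # solves L21 * L11' = Ω21 by column scaling
+
+@assert (L21 .!= 0) == (Ω21 .!= 0)  # the non-zero pattern is unchanged
+```
+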
+### Reproducibility of values of the objective function {#sec-reproducibility}
+
+The theoretical development in @sec-profiled describes a method for evaluating the objective function, which is negative twice the profiled log-likelihood, for a given model and value of the relative covariance parameters, $\mbftheta$.
+Determining the maximum likelihood estimates requires numerical minimization of this objective, for which we use one of two implementations of Powell's BOBYQA (Bound Optimization BY Quadratic Approximation) algorithm (@powell2009bobyqa).
+By default the `fit` method for a `MixedModel` uses the implementation from the [NLopt.jl]{.pkg} package.
+An alternative fitting function, `prfit`, uses the BOBYQA implementation from the [PRIMA.jl]{.pkg} package.
+
+Even though the algorithm is the same in the two implementations, we can expect slight differences in the converged parameter estimates because of differences in the implementations, convergence criteria, etc.
+Fortunately the nature of the objective allows for an assessment of whether differences in converged values are important.
+Differences in this objective, negative twice the log-likelihood, would be equal to differences in the deviance, if there were a clean definition of the deviance for these models, because the deviance is negative twice the log-likelihood minus a constant, and the constant cancels when taking the difference of the deviances of two models.
+
+In an analysis of deviance this difference in the deviances of a null and an alternative model is assumed to have a $\chi^2$ distribution with degrees of freedom equal to the number of parameters constrained from the full model to generate the null model.
+The point to carry away is that, relative to the scale of a $\chi^2$ statistic, small differences in the objective, say less than $10^{-4}$, will not affect the conclusion from the test.
+Thus we can consider different fits of the same model to be equivalent if the difference in the objective at convergence is less than, say, $10^{-4}$, which is why the value of the objective and related quantities are rounded to four digits after the decimal place in the printed summary of a model.
+
+Moreover, the nature of floating point computation can result in slight differences in the evaluated objective on different processor architectures or with different numerical libraries, such as BLAS/LAPACK implementations.
+The reason for small differences in objective values with different BLAS/LAPACK implementations is that "accelerated" BLAS often rearrange the order of operations in an expression so as to minimize cache misses or to use multiple threads of execution.
+Because floating point addition, in particular, is not associative (i.e. the floating point sum $(a + b) + c$ is not guaranteed to be the same as $a + (b + c)$), reordering arithmetic operations in linear algebra calculations can give slightly different numerical results, which is accepted as part of "the cost of doing business" in large scale numerical computation.
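+
+The non-associativity is easy to demonstrate at the REPL; the toy calculations below (ours, not part of the replication materials) show how changing the order of additions changes the result.
+
+```julia
+# floating point addition is not associative
+(0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)  # false
+
+# accumulating the same numbers in a different order usually changes
+# the last few bits of the sum
+x = randn(10_000)
+foldl(+, x) == foldl(+, reverse(x))  # almost always false
+```
+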
+The starting values for the parameters in a model like that fit to the `insteval` data are chosen so that the initial $\mbfLambda$ is the identity matrix.
+Even with this parameter value, for which matrix multiplications by $\mbfLambda$ or its transpose will be exact, the calculated objective value, stored as the `optsum.finitial` property, can vary slightly according to the processor and BLAS implementation used.
+As shown in @sec-app-env, the results presented here were evaluated on an Apple M-series processor using macOS with the AppleAccelerate version of the BLAS, producing
+
+```{julia}
+m1.optsum.finitial
+```
+
+On other machines with Intel x86-64 processors and OpenBLAS the same code produced an initial objective value of 242059.14387186486, while the MKL implementation of the BLAS produced 242059.14387186494.
+Because these values agree to nine decimal places the differences are negligible, but they compound during the optimization, which in this case involved about 175 to 200 evaluations of the objective, so the final parameter estimates will be slightly different on machines with different processors or BLAS implementations.
+
 ## Application to data from an observational study {#sec-application}
 
 We consider an observational study -- ratings of movies by users of [movielens.org](https://movielens.org) made available at the [grouplens.org download site](https://grouplens.org/datasets/movielens) [@Harper2016] as *MovieLens 32M Dataset* (`ml-32m`), consisting of roughly 32 million ratings of over 80,000 movies by over 200,000 users.
@@ -908,7 +1000,6 @@ The time required to fit a model to large data sets is dominated by the time req
 The time for one evaluation is given in the `time per eval` column of @tbl-sizespeed.
 Also given is the number of evaluations to convergence, `n eval`, and the total time to fit the model, `time`.
 The reason for considering `evtime` in addition to `fittime` and `n eval` is because the `time per eval` for one model, relative to other models, is reasonably stable across computers whereas `n eval`, and hence, `time`, can be affected by seemingly trivial variations in function values resulting from different implementations of low-level calculations, such as the BLAS (Basic Linear Algebra Subroutines).
-(Floating point arithmetic is not guaranteed to be associative and custom BLAS implementations may rearrange the order of operations when evaluating products and sums.)
 That is, we can't expect to reproduce `n eval` exactly when fitting the same model on different computers or with slightly different versions of software; but we can expect the pattern in `time per eval` with respect to the user and movie cutoffs to be reproducible.
diff --git a/blockedcholesky/Project.toml b/blockedcholesky/Project.toml index b9b39ba..8d0a4f2 100644 --- a/blockedcholesky/Project.toml +++ b/blockedcholesky/Project.toml @@ -9,18 +9,19 @@ DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" MKL = "33e6dc65-8f57-5167-99aa-e5a354878fb2" MKL_jll = "856f044c-d86e-5d09-b602-aeab76dc8ba7" MixedModels = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316" +PRIMA = "0a7d04aa-8ac2-47b3-b7a7-9dbd6ad661ed" PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d" Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7" SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf" [compat] -AlgebraOfGraphics = "0.10.4" +AlgebraOfGraphics = "0.10,0.11" Arrow = "2.8" CSV = "0.10" -CairoMakie = "0.13.4" +CairoMakie = "0.13,0.14,0.15" Chairmarks = "1.3" DataFrames = "1.7" -MixedModels = "4.34" -PrettyTables = "2.4" +MixedModels = "4,5" +PrettyTables = "2" Printf = "1" julia = "1.10" diff --git a/blockedcholesky/data/evaluation.csv b/blockedcholesky/data/evaluation.csv new file mode 100644 index 0000000..d680d08 --- /dev/null +++ b/blockedcholesky/data/evaluation.csv @@ -0,0 +1,5 @@ +os,arch,BLAS,initial +macos,aarch64,OpenBLAS,242059.14387186497 +macos,aarch64,AppleAccelerate,242059.14387186497 +linux,x86_64,OpenBLAS,242059.14387186486 +linux,x86_64,MKL,242059.14387186494 \ No newline at end of file diff --git a/blockedcholesky/eds/code.txt b/blockedcholesky/eds/code.txt new file mode 100644 index 0000000..b9f5f89 --- /dev/null +++ b/blockedcholesky/eds/code.txt @@ -0,0 +1,221 @@ + Resolving package versions... + Installed Conda ───────────── v1.10.2 + Installed VersionParsing ──── v1.3.0 + Installed WinReg ──────────── v1.0.0 + Installed RCall ───────────── v0.14.8 + Installed CategoricalArrays ─ v0.10.8 + Updating `~/5816/jss-replication/Project.toml` + [6f49c342] + RCall v0.14.8 + Updating `~/5816/jss-replication/Manifest.toml` + [324d7699] + CategoricalArrays v0.10.8 + [8f4d0f93] + Conda v1.10.2 + [6f49c342] + RCall v0.14.8 + [81def892] + VersionParsing v1.3.0 + [1b915085] + WinReg v1.0.0 + Building Conda → `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/b19db3927f0db4151cb86d073689f2428e524576/build.log` + Building RCall → `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/815f2e4b52e377eb7ae21c8235bd3e2e3e0bfd9e/build.log` +Precompiling project... + 1009.3 ms ✓ VersionParsing + 875.4 ms ✓ WinReg + 925.1 ms ✓ Conda + 2643.4 ms ✓ CategoricalArrays + 547.0 ms ✓ CategoricalArrays → CategoricalArraysStructTypesExt + 557.5 ms ✓ CategoricalArrays → CategoricalArraysSentinelArraysExt + 622.8 ms ✓ CategoricalArrays → CategoricalArraysJSONExt + 11557.8 ms ✓ RCall → RCallAxisArraysExt + 8 dependencies successfully precompiled in 12 seconds. 351 already precompiled. +\begin{tabular}{rrrrrr} + \hline + \textbf{variable} & \textbf{min} & \textbf{max} & \textbf{mean} & \textbf{nunique} & \textbf{eltype} \\\hline + s & S0001 & S2972 & - & 2972 & String \\ + d & I0001 & I2160 & - & 1128 & String \\ + dept & D01 & D15 & - & 14 & String \\ + studage & 2 & 8 & - & 4 & String \\ + lectage & 1 & 6 & - & 6 & String \\ + service & 0.0 & 1.0 & 0.43 & - & Float64 \\ + y & 1 & 5 & 3.21 & - & Int8 \\\hline +\end{tabular} +Linear mixed model fit by maximum likelihood + y ~ 1 + service + (1 | d) + (1 | s) + (1 | dept) + (0 + service | dept) + logLik -2 logLik AIC AICc BIC + -118824.3009 237648.6017 237662.6017 237662.6033 237727.0295 + +Variance components: + Column Variance Std.Dev. Corr. 
+s (Intercept) 0.1052912 0.3244861 +d (Intercept) 0.2624290 0.5122782 +dept (Intercept) 0.0025808 0.0508012 + service 0.0233072 0.1526671 . +Residual 1.3850110 1.1768649 + Number of obs: 73421; levels of grouping factors: 2972, 1128, 14 + + Fixed-effects parameters: +───────────────────────────────────────────────────── + Coef. Std. Error z Pr(>|z|) +───────────────────────────────────────────────────── +(Intercept) 3.27765 0.0235043 139.45 <1e-99 +service -0.0507681 0.043913 -1.16 0.2476 +─────────────────────────────────────────────────────DataType[ReMat{Float64, 1}, ReMat{Float64, 1}, ReMat{Float64, 2}][2972, 1128, 14][2972, 1128, 28][0.27572075999408585, 0.43529059514658697, 0.043166525487089165, 0.12972357349206423]Linear mixed model fit by REML + y ~ 1 + service + (1 | d) + (1 | s) + (1 | dept) + (0 + service | dept) + REML criterion at convergence: 237658.60945204942 + +Variance components: + Column Variance Std.Dev. Corr. +s (Intercept) 0.1053184 0.3245279 +d (Intercept) 0.2624392 0.5122882 +dept (Intercept) 0.0030504 0.0552307 + service 0.0256191 0.1600596 . +Residual 1.3850028 1.1768614 + Number of obs: 73421; levels of grouping factors: 2972, 1128, 14 + + Fixed-effects parameters: +───────────────────────────────────────────────────── + Coef. Std. Error z Pr(>|z|) +───────────────────────────────────────────────────── +(Intercept) 3.27771 0.0242461 135.19 <1e-99 +service -0.0502827 0.045775 -1.10 0.2720 +─────────────────────────────────────────────────────Linear mixed model fit by maximum likelihood + y ~ 1 + service + (1 | d) + (1 | s) + (1 | dept) + (0 + service | dept) + logLik -2 logLik AIC AICc BIC + -118824.3008 237648.6016 237662.6016 237662.6032 237727.0294 + +Variance components: + Column Variance Std.Dev. Corr. +d (Intercept) 0.2624226 0.5122720 +s (Intercept) 0.1052959 0.3244934 +dept (Intercept) 0.0025799 0.0507932 + service 0.0233976 0.1529627 . +Residual 1.3850090 1.1768641 + Number of obs: 73421; levels of grouping factors: 1128, 2972, 14 + + Fixed-effects parameters: +───────────────────────────────────────────────────── + Coef. Std. 
Error z Pr(>|z|) +───────────────────────────────────────────────────── +(Intercept) 3.27765 0.0235031 139.46 <1e-99 +service -0.0507437 0.0439869 -1.15 0.2487 +─────────────────────────────────────────────────────| **movie cutoff** | **user cutoff** | **ratings** | **users** | **movies** | **model (GiB)** | **L[2,2] (GiB)** | **n eval** | **time (s)** | **time per eval (s)** | +|-----------------:|----------------:|------------:|----------:|-----------:|----------------:|-----------------:|-----------:|-------------:|----------------------:| +| 1 | 20 | 32000204 | 200948 | 84432 | 56.25 | 53.11 | 26 | 18740.1 | 667.46 | +| 2 | 20 | 31981597 | 200948 | 65825 | 35.41 | 32.28 | 19 | 7091.55 | 355.21 | +| 5 | 20 | 31921467 | 200948 | 43884 | 17.47 | 14.35 | 26 | 3816.29 | 142.37 | +| 10 | 20 | 31842705 | 200948 | 31961 | 10.72 | 7.61 | 25 | 2165.85 | 82.71 | +| 15 | 20 | 31777786 | 200948 | 26428 | 8.31 | 5.2 | 24 | 1619.62 | 64.59 | +| 20 | 20 | 31725920 | 200948 | 23350 | 7.16 | 4.06 | 21 | 1253.51 | 57.18 | +| 50 | 20 | 31498689 | 200947 | 16034 | 4.99 | 1.92 | 28 | 1212.59 | 42.09 | +| 1 | 40 | 30433400 | 144848 | 84205 | 55.8 | 52.83 | 23 | 15952.8 | 663.37 | +| 2 | 40 | 30415014 | 144848 | 65819 | 35.25 | 32.28 | 32 | 11565.4 | 356.29 | +| 5 | 40 | 30355623 | 144848 | 43884 | 17.31 | 14.35 | 38 | 5496.85 | 139.71 | +| 10 | 40 | 30277758 | 144848 | 31961 | 10.56 | 7.61 | 21 | 1840.27 | 83.44 | +| 15 | 40 | 30213583 | 144848 | 26428 | 8.15 | 5.2 | 21 | 1428.19 | 64.85 | +| 20 | 40 | 30162330 | 144848 | 23350 | 7.0 | 4.06 | 17 | 1026.09 | 57.08 | +| 50 | 40 | 29938038 | 144848 | 16034 | 4.83 | 1.92 | 24 | 1046.22 | 41.57 | +| 1 | 80 | 27569316 | 94380 | 83897 | 55.13 | 52.44 | 23 | 16155.2 | 677.07 | +| 2 | 80 | 27551218 | 94380 | 65799 | 34.94 | 32.26 | 22 | 8541.43 | 392.76 | +| 5 | 80 | 27492886 | 94380 | 43884 | 17.03 | 14.35 | 23 | 3564.96 | 142.58 | +| 10 | 80 | 27416310 | 94380 | 31961 | 10.28 | 7.61 | 23 | 2054.17 | 85.29 | +| 15 | 80 | 27353366 | 94380 | 26428 | 7.87 | 5.2 | 26 | 1785.08 | 65.72 | +| 20 | 80 | 27303019 | 94380 | 23350 | 6.72 | 4.06 | 43 | 2527.21 | 57.72 | +| 50 | 80 | 27083536 | 94380 | 16034 | 4.55 | 1.92 | 20 | 873.98 | 41.38 | +Status `~/5816/jss-replication/Project.toml` +⌅ [cbdf2221] AlgebraOfGraphics v0.10.8 + [13e28ba4] AppleAccelerate v0.4.1 + [69666777] Arrow v2.8.0 + [336ed68f] CSV v0.10.15 +⌅ [13f3f980] CairoMakie v0.13.10 + [0ca39b1e] Chairmarks v1.3.1 + [a93c6f00] DataFrames v1.7.0 + [33e6dc65] MKL v0.9.0 + [ff71e718] MixedModels v4.35.2 + [08abe8d2] PrettyTables v2.4.0 + [6f49c342] RCall v0.14.8 + [856f044c] MKL_jll v2025.0.1+1 + [de0858da] Printf v1.11.0 + [2f01184e] SparseArrays v1.11.0 +Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated` +Linear mixed model fit by maximum likelihood ['lmerMod'] +Formula: y ~ 1 + service + (1 | s) + (1 | d) + (1 | dept) + (0 + service | + dept) + Data: dat +Control: ctrl + + AIC BIC logLik -2*log(L) df.resid + 237662.6 237727.0 -118824.3 237648.6 73414 + +Scaled residuals: + Min 1Q Median 3Q Max +-2.99960 -0.74769 0.04009 0.77283 3.11346 + +Random effects: + Groups Name Variance Std.Dev. + s (Intercept) 0.10530 0.32449 + d (Intercept) 0.26243 0.51228 + dept (Intercept) 0.00258 0.05079 + dept.1 service 0.02340 0.15297 + Residual 1.38501 1.17686 +Number of obs: 73421, groups: s, 2972; d, 1128; dept, 14 + +Fixed effects: + Estimate Std. 
Error t value +(Intercept) 3.27765 0.02350 139.457 +service -0.05074 0.04399 -1.154 + Family: gaussian ( identity ) +Formula: +y ~ 1 + service + (1 | s) + (1 | d) + (1 | dept) + (0 + service | dept) +Data: dat + + AIC BIC logLik -2*log(L) df.resid + 237662.6 237727.0 -118824.3 237648.6 73414 + +Random effects: + +Conditional model: + Groups Name Variance Std.Dev. + s (Intercept) 0.10530 0.32449 + d (Intercept) 0.26243 0.51228 + dept (Intercept) 0.00258 0.05079 + dept.1 service 0.02340 0.15296 + Residual 1.38501 1.17686 +Number of obs: 73421, groups: s, 2972; d, 1128; dept, 14 + +Dispersion estimate for gaussian family (sigma^2): 1.39 + +Conditional model: + Estimate Std. Error z value Pr(>|z|) +(Intercept) 3.27765 0.02351 139.39 <2e-16 *** +service -0.05074 0.04408 -1.15 0.25 +--- +Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 +Unit: seconds + expr min lq mean median uq max neval cld + lmer 13.66375 13.79771 14.35411 13.96587 14.11864 16.61282 6 a + glmmTMB 15.96895 16.00377 16.62984 16.59592 17.08531 17.52916 6 b +[1] "R version 4.5.0 (2025-04-11)" +[1] ‘1.1.37’ +[1] ‘1.1.11’ +Julia Version 1.11.5 +Commit 760b2e5b739 (2025-04-14 06:53 UTC) +Build Info: + Official https://julialang.org/ release +Platform Info: + OS: Linux (x86_64-linux-gnu) + CPU: 4 × Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz + WORD_SIZE: 64 + LLVM: libLLVM-16.0.6 (ORCJIT, skylake-avx512) +Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores) +Status `~/5816/jss-replication/Project.toml` +⌅ [cbdf2221] AlgebraOfGraphics v0.10.8 + [13e28ba4] AppleAccelerate v0.4.1 + [69666777] Arrow v2.8.0 + [336ed68f] CSV v0.10.15 +⌅ [13f3f980] CairoMakie v0.13.10 + [0ca39b1e] Chairmarks v1.3.1 + [a93c6f00] DataFrames v1.7.0 + [33e6dc65] MKL v0.9.0 + [ff71e718] MixedModels v4.35.2 + [08abe8d2] PrettyTables v2.4.0 + [6f49c342] RCall v0.14.8 + [856f044c] MKL_jll v2025.0.1+1 + [de0858da] Printf v1.11.0 + [2f01184e] SparseArrays v1.11.0 +Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated` diff --git a/blockedcholesky/eds/info.txt b/blockedcholesky/eds/info.txt new file mode 100644 index 0000000..9449cad --- /dev/null +++ b/blockedcholesky/eds/info.txt @@ -0,0 +1,26 @@ +Julia Version 1.11.5 +Commit 760b2e5b739 (2025-04-14 06:53 UTC) +Build Info: + Official https://julialang.org/ release +Platform Info: + OS: Linux (x86_64-linux-gnu) + CPU: 4 × Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz + WORD_SIZE: 64 + LLVM: libLLVM-16.0.6 (ORCJIT, skylake-avx512) +Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores) +Status `~/5816/jss-replication/Project.toml` +⌅ [cbdf2221] AlgebraOfGraphics v0.10.8 + [13e28ba4] AppleAccelerate v0.4.1 + [69666777] Arrow v2.8.0 + [336ed68f] CSV v0.10.15 +⌅ [13f3f980] CairoMakie v0.13.10 + [0ca39b1e] Chairmarks v1.3.1 + [a93c6f00] DataFrames v1.7.0 + [33e6dc65] MKL v0.9.0 + [ff71e718] MixedModels v4.35.2 + [08abe8d2] PrettyTables v2.4.0 + [6f49c342] RCall v0.14.8 + [856f044c] MKL_jll v2025.0.1+1 + [de0858da] Printf v1.11.0 + [2f01184e] SparseArrays v1.11.0 +Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated` diff --git a/blockedcholesky/eds/response.txt b/blockedcholesky/eds/response.txt new file mode 100644 index 0000000..8e4512c --- /dev/null +++ b/blockedcholesky/eds/response.txt @@ -0,0 +1,98 @@ + +Journal of Statistical Software +⟵ Back to Submissions +5816 / Bates et al. 
/ Mixed-model Log-likelihood Evaluation Via a Blocked Cholesky Factorization + + Submission + Review + Copyediting + Production + +Submission Files + + Search + +Name Date Component +Settings +35560 jss-replication.tar.gz + May 16, 2025 Replication Materials +Settings +35561 BlockedCholeskyMM+jss.pdf + May 16, 2025 PDF Manuscript +Settings +35562 MixedModels.jl.tar.gz + May 16, 2025 Software + + Download All Files + +Pre-Review Discussions + + Add discussion + +Name From Last Reply Replies Closed +Comments for the Author jstatsoft +2025-05-21 05:46 PM - 0 +Settings +Configuration information on JSS replication run + dmbates +2025-06-17 04:44 PM jstatsoft +2025-07-02 12:46 PM 1 +Configuration information on JSS replication run +×Close Panel +Participants Edit + + JSS Administrator (jstatsoft) + Douglas Bates (dmbates) + +Messages +Note From + +Our apologies for the delay in preparing a revision but we are having difficulty reproducing the results you obtained when attempting to reproduce ours. + +Your comments did motivate us to explain that we are describing evaluation of the objective. Actual model fitting requires both the evaluation of the objective and a method of optimizing this objective and there are several choices and settings that will affect the optimization process. + +Generally we set the tolerance on the objective itself (labelled -2 logLik in the printed information) so that the first 4 digits after the decimal point are reliable. This objective is on the scale of the deviance. As differences in deviance in an analysis of deviance are compared to quantiles of a chi-squared distribution, changes beyond the fourth decimal place would be generally considered negligible. The converged objective in your results ends in .6017 whereas ours ends in .6016. The value at convergence will change with processor type, BLAS implementation, operating system, optimizer implementation, and optimizer algorithm but a typical value is 237648.6016477919 which is close to values that would round to 237648.6017. + +However, we have tried several combinations of architectures (x86_64, Apple M-series), operating systems (Linux, MacOS, Windows), BLAS implementations (OpenBLAS, AppleAccelerate, MKL) using our own machines and on github continuous integration facilities but we haven't been able to duplicate your results. Could you please provide us with the output of + +using InteractiveUtils; versioninfo() + +using Pkg; Pkg.status() + +using LinearAlgebra; BLAS.get_config() + +on the machine and Julia version where you ran our script? If you prefer I can produce a new script including this information for you to run. 
+ dmbates +2025-06-17 04:44 PM + +Julia Version 1.11.5 +Commit 760b2e5b739 (2025-04-14 06:53 UTC) +Build Info: +Official https://julialang.org/ release +Platform Info: +OS: Linux (x86_64-linux-gnu) +CPU: 4 × Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz +WORD_SIZE: 64 +LLVM: libLLVM-16.0.6 (ORCJIT, skylake-avx512) +Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores) +Status `~/5816/jss-replication/Project.toml` +⌅ [cbdf2221] AlgebraOfGraphics v0.10.8 +[13e28ba4] AppleAccelerate v0.4.1 +[69666777] Arrow v2.8.0 +[336ed68f] CSV v0.10.15 +⌅ [13f3f980] CairoMakie v0.13.10 +[0ca39b1e] Chairmarks v1.3.1 +[a93c6f00] DataFrames v1.7.0 +[33e6dc65] MKL v0.9.0 +[ff71e718] MixedModels v4.35.2 +[08abe8d2] PrettyTables v2.4.0 +[6f49c342] RCall v0.14.8 +[856f044c] MKL_jll v2025.0.1+1 +[de0858da] Printf v1.11.0 +[2f01184e] SparseArrays v1.11.0 +Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated` + +Attached the output I obtained as well as the information as written out by Julia. +code.txt info.txt jstatsoft +2025-07-02 12:46 PM +Add Message diff --git a/blockedcholesky/fits.txt b/blockedcholesky/fits.txt new file mode 100644 index 0000000..dc87062 --- /dev/null +++ b/blockedcholesky/fits.txt @@ -0,0 +1,91 @@ +julia> m1 = fit(MixedModel, form, dat; progress) # x86_64, Linux, OpenBLAS +Linear mixed model fit by maximum likelihood + y ~ 1 + service + (1 | s) + (1 | d) + (1 | dept) + (0 + service | dept) + logLik -2 logLik AIC AICc BIC + -118824.3008 237648.6016 237662.6016 237662.6032 237727.0294 + +Variance components: + Column Variance Std.Dev. Corr. +s (Intercept) 0.1052972 0.3244953 +d (Intercept) 0.2624240 0.5122733 +dept (Intercept) 0.0025799 0.0507923 + service: Y 0.0233961 0.1529578 . +Residual 1.3850085 1.1768638 + Number of obs: 73421; levels of grouping factors: 2972, 1128, 14 + + Fixed-effects parameters: +───────────────────────────────────────────────────── + Coef. Std. Error z Pr(>|z|) +───────────────────────────────────────────────────── +(Intercept) 3.27765 0.023503 139.46 <1e-99 +service: Y -0.0507441 0.0439857 -1.15 0.2486 +───────────────────────────────────────────────────── + +julia> m1 = fit(MixedModel, form, dat; progress) # x86_64, Linux, MKL +Linear mixed model fit by maximum likelihood + y ~ 1 + service + (1 | s) + (1 | d) + (1 | dept) + (0 + service | dept) + logLik -2 logLik AIC AICc BIC + -118824.3008 237648.6016 237662.6016 237662.6032 237727.0294 + +Variance components: + Column Variance Std.Dev. Corr. +s (Intercept) 0.1052957 0.3244930 +d (Intercept) 0.2624288 0.5122781 +dept (Intercept) 0.0025802 0.0507954 + service: Y 0.0233988 0.1529668 . +Residual 1.3850086 1.1768639 + Number of obs: 73421; levels of grouping factors: 2972, 1128, 14 + + Fixed-effects parameters: +───────────────────────────────────────────────────── + Coef. Std. Error z Pr(>|z|) +───────────────────────────────────────────────────── +(Intercept) 3.27765 0.0235036 139.45 <1e-99 +service: Y -0.0507433 0.043988 -1.15 0.2487 +───────────────────────────────────────────────────── + +julia> m1 = fit(MixedModel, form, dat; progress) # Apple M-series, MacOS, OpenBLAS +Linear mixed model fit by maximum likelihood + y ~ 1 + service + (1 | s) + (1 | d) + (1 | dept) + (0 + service | dept) + logLik -2 logLik AIC AICc BIC + -118824.3008 237648.6016 237662.6016 237662.6032 237727.0294 + +Variance components: + Column Variance Std.Dev. Corr. 
+s (Intercept) 0.1052958 0.3244931 +d (Intercept) 0.2624299 0.5122791 +dept (Intercept) 0.0025802 0.0507962 + service: Y 0.0233976 0.1529628 . +Residual 1.3850085 1.1768639 + Number of obs: 73421; levels of grouping factors: 2972, 1128, 14 + + Fixed-effects parameters: +───────────────────────────────────────────────────── + Coef. Std. Error z Pr(>|z|) +───────────────────────────────────────────────────── +(Intercept) 3.27765 0.0235037 139.45 <1e-99 +service: Y -0.0507436 0.043987 -1.15 0.2487 +───────────────────────────────────────────────────── + +julia> m1 = fit(MixedModel, form, dat; progress) # Apple M-series, MacOS, AppleAccelerate +Linear mixed model fit by maximum likelihood + y ~ 1 + service + (1 | s) + (1 | d) + (1 | dept) + (0 + service | dept) + logLik -2 logLik AIC AICc BIC + -118824.3008 237648.6016 237662.6016 237662.6032 237727.0294 + +Variance components: + Column Variance Std.Dev. Corr. +s (Intercept) 0.1052957 0.3244930 +d (Intercept) 0.2624294 0.5122786 +dept (Intercept) 0.0025805 0.0507985 + service: Y 0.0234003 0.1529715 . +Residual 1.3850085 1.1768639 + Number of obs: 73421; levels of grouping factors: 2972, 1128, 14 + + Fixed-effects parameters: +──────────────────────────────────────────────────── + Coef. Std. Error z Pr(>|z|) +──────────────────────────────────────────────────── +(Intercept) 3.27765 0.0235041 139.45 <1e-99 +service: Y -0.050743 0.0439892 -1.15 0.2487 +──────────────────────────────────────────────────── \ No newline at end of file diff --git a/blockedcholesky/replication/ev_appleM.txt b/blockedcholesky/replication/ev_appleM.txt new file mode 100644 index 0000000..a5c1e59 --- /dev/null +++ b/blockedcholesky/replication/ev_appleM.txt @@ -0,0 +1,115 @@ +Julia Version 1.11.5 +Commit 760b2e5b739 (2025-04-14 06:53 UTC) +Build Info: + Official https://julialang.org/ release +Platform Info: + OS: macOS (arm64-apple-darwin24.0.0) + CPU: 8 × Apple M1 Pro + WORD_SIZE: 64 + LLVM: libLLVM-16.0.6 (ORCJIT, apple-m1) +Threads: 6 default, 0 interactive, 3 GC (on 6 virtual cores) +Status `~/git/Manuscripts/blockedcholesky/Project.toml` + [cbdf2221] AlgebraOfGraphics v0.10.6 + [13e28ba4] AppleAccelerate v0.4.0 + [69666777] Arrow v2.8.0 + [336ed68f] CSV v0.10.15 + [13f3f980] CairoMakie v0.13.7 + [0ca39b1e] Chairmarks v1.3.1 + [a93c6f00] DataFrames v1.7.0 + [33e6dc65] MKL v0.8.0 + [ff71e718] MixedModels v4.35.0 + [0a7d04aa] PRIMA v0.2.2 + [08abe8d2] PrettyTables v2.4.0 + [856f044c] MKL_jll v2025.0.1+1 + [de0858da] Printf v1.11.0 + [2f01184e] SparseArrays v1.11.0 +nothing + +LBTConfig([ILP64] libopenblas64_.dylib) + +Initial parameter vector: [1.0, 1.0, 1.0, 1.0] +Initial objective value: 242059.14387186497 + +Backend: nlopt +Optimizer: LN_BOBYQA +Lower bounds: [0.0, 0.0, 0.0, 0.0] +ftol_rel: 1.0e-12 +ftol_abs: 1.0e-8 +xtol_rel: 0.0 +xtol_abs: [1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10] +initial_step: [0.75, 0.75, 0.75, 0.75] +maxfeval: -1 +maxtime: -1.0 + +Function evaluations: 181 +xtol_zero_abs: 0.001 +ftol_zero_abs: 1.0e-5 +Final parameter vector: [0.27572694783915613, 0.4352917263395991, 0.04316230740526005, 0.1299749675679518] +Final objective value: 237648.60164780272 +Return code: FTOL_REACHED + + +objective(updateL!(setθ!(m, thetaOB1))) = 237648.60164780272 + +Sample(time=0.0071772500000000005, allocs=58, bytes=1504) +Initial parameter vector: [1.0, 1.0, 1.0, 1.0] +Initial objective value: 242059.14387186497 + +Backend: prima +Optimizer: bobyqa +Lower bounds: [0.0, 0.0, 0.0, 0.0] +rhobeg: 1.0 +rhoend: 1.0e-6 +maxfeval: -1 + +Function evaluations: 76 +xtol_zero_abs: 
0.001 +ftol_zero_abs: 1.0e-5 +Final parameter vector: [0.2757270024730217, 0.43529107377028553, 0.04316118531994337, 0.12997646652807276] +Final objective value: 237648.60164779186 +Return code: SMALL_TR_RADIUS + +LBTConfig([LP64] Accelerate, [ILP64] Accelerate) + +Initial parameter vector: [1.0, 1.0, 1.0, 1.0] +Initial objective value: 242059.14387186497 + +Backend: nlopt +Optimizer: LN_BOBYQA +Lower bounds: [0.0, 0.0, 0.0, 0.0] +ftol_rel: 1.0e-12 +ftol_abs: 1.0e-8 +xtol_rel: 0.0 +xtol_abs: [1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10] +initial_step: [0.75, 0.75, 0.75, 0.75] +maxfeval: -1 +maxtime: -1.0 + +Function evaluations: 173 +xtol_zero_abs: 0.001 +ftol_zero_abs: 1.0e-5 +Final parameter vector: [0.27572689108074344, 0.4352913075607651, 0.04316427749556987, 0.12998231312662356] +Final objective value: 237648.60164785833 +Return code: FTOL_REACHED + + +objective(updateL!(setθ!(m, thetaOB1))) = 237648.60164780298 + +Sample(time=0.005480125000000001, allocs=58, bytes=1504) +Initial parameter vector: [1.0, 1.0, 1.0, 1.0] +Initial objective value: 242059.14387186497 + +Backend: prima +Optimizer: bobyqa +Lower bounds: [0.0, 0.0, 0.0, 0.0] +rhobeg: 1.0 +rhoend: 1.0e-6 +maxfeval: -1 + +Function evaluations: 77 +xtol_zero_abs: 0.001 +ftol_zero_abs: 1.0e-5 +Final parameter vector: [0.27572697189002504, 0.4352910926159146, 0.043161123650275815, 0.12997717957171298] +Final objective value: 237648.60164779192 +Return code: SMALL_TR_RADIUS + diff --git a/blockedcholesky/replication/ev_linux_amd64.txt b/blockedcholesky/replication/ev_linux_amd64.txt new file mode 100755 index 0000000..f946ee9 --- /dev/null +++ b/blockedcholesky/replication/ev_linux_amd64.txt @@ -0,0 +1,80 @@ +Julia Version 1.11.5 +Commit 760b2e5b739 (2025-04-14 06:53 UTC) +Build Info: + Official https://julialang.org/ release +Platform Info: + OS: Linux (x86_64-linux-gnu) + CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz + WORD_SIZE: 64 + LLVM: libLLVM-16.0.6 (ORCJIT, tigerlake) +Threads: 8 default, 0 interactive, 4 GC (on 8 virtual cores) +Status `~/git/Manuscripts/blockedcholesky/replication/Project.toml` + [cbdf2221] AlgebraOfGraphics v0.10.6 + [13e28ba4] AppleAccelerate v0.4.0 + [69666777] Arrow v2.8.0 + [336ed68f] CSV v0.10.15 + [13f3f980] CairoMakie v0.13.7 + [0ca39b1e] Chairmarks v1.3.1 + [a93c6f00] DataFrames v1.7.0 + [33e6dc65] MKL v0.8.0 + [ff71e718] MixedModels v4.35.0 + [08abe8d2] PrettyTables v2.4.0 + [856f044c] MKL_jll v2025.0.1+1 + [de0858da] Printf v1.11.0 + [2f01184e] SparseArrays v1.11.0 +nothing + +LBTConfig([ILP64] libopenblas64_.so) + +Initial parameter vector: [1.0, 1.0, 1.0, 1.0] +Initial objective value: 242059.1438718649 + +Backend: nlopt +Optimizer: LN_BOBYQA +Lower bounds: [0.0, 0.0, 0.0, 0.0] +ftol_rel: 1.0e-12 +ftol_abs: 1.0e-8 +xtol_rel: 0.0 +xtol_abs: [1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10] +initial_step: [0.75, 0.75, 0.75, 0.75] +maxfeval: -1 +maxtime: -1.0 + +Function evaluations: 162 +xtol_zero_abs: 0.001 +ftol_zero_abs: 1.0e-5 +Final parameter vector: [0.2757287809406885, 0.43528593517628916, 0.04315922882372803, 0.12997575901521258] +Final objective value: 237648.60164812428 +Return code: FTOL_REACHED + + +objective(updateL!(setθ!(m, thetaOB1))) = 237648.6016478029 + +Sample(time=0.012008102000000001, allocs=58, bytes=1504) +LBTConfig([ILP64] libmkl_rt.so, [LP64] libmkl_rt.so) + +Initial parameter vector: [1.0, 1.0, 1.0, 1.0] +Initial objective value: 242059.14387186477 + +Backend: nlopt +Optimizer: LN_BOBYQA +Lower bounds: [0.0, 0.0, 0.0, 0.0] +ftol_rel: 1.0e-12 +ftol_abs: 1.0e-8 +xtol_rel: 0.0 
+xtol_abs: [1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10] +initial_step: [0.75, 0.75, 0.75, 0.75] +maxfeval: -1 +maxtime: -1.0 + +Function evaluations: 151 +xtol_zero_abs: 0.001 +ftol_zero_abs: 1.0e-5 +Final parameter vector: [0.27572075999408585, 0.43529059514658697, 0.043166525487089165, 0.12972357349206423] +Final objective value: 237648.60172719814 +Return code: FTOL_REACHED + + +objective(updateL!(setθ!(m, thetaOB1))) = 237648.6016478031 + +Sample(time=0.007513658, allocs=58, bytes=1504) diff --git a/blockedcholesky/replication/evaluation.jl b/blockedcholesky/replication/evaluation.jl new file mode 100644 index 0000000..2f6bace --- /dev/null +++ b/blockedcholesky/replication/evaluation.jl @@ -0,0 +1,28 @@ +using MKL_jll +using MixedModels # want this to be after PRIMA +import CSV + +const dat = MixedModels.dataset(:insteval); +const form = @formula(y ~ 1 + service + (1|s) + (1|d) + (1|dept) + (0 + service|dept)); +const m1 = LinearMixedModel(form, dat) +@assert m1.optsum.initial == ones(4) + +const progress = false +pltfrmtags = Base.BinaryPlatforms.HostPlatform().tags +const arch = pltfrmtags["arch"] +const os = pltfrmtags["os"] +obj(θ::Vector{Float64}) = objective(updateL!(setθ!(m1, θ))) +BLAS = "OpenBLAS" +tbl = [(; os, arch, BLAS, initial=obj(ones(4)))] + +# install accelerated BLAS +@static if Sys.isapple() && arch == "aarch64" + using AppleAccelerate + BLAS = "AppleAccelerate" +elseif MKL_jll.is_available() + using MKL + BLAS = "MKL" +end + +push!(tbl, (; os, arch, BLAS, initial=obj(ones(4)))) +CSV.write("../data/evaluation.csv", tbl) \ No newline at end of file