Description
There is an inconsistency in how split_group_term() behaves in case of add_lower_terms = FALSE (before #532, add_lower_terms was called add_main_effects, but I'll refer to the state of #532 here). This inconsistency occurs between a group-level term that has a group-level intercept and a group-level term that does not have a group-level intercept. This is relevant only for project() because only there, we have add_lower_terms = FALSE.
Reprex:
devtools::load_all("<path_to_projpred>")
#> ℹ Loading projpred
#> This is projpred version 2.9.0.9000.
split_formula(y ~ (x + z + x:z | g), add_lower_terms = FALSE)
#> [1] "1" "(1 | g)" "(x | g)" "(z | g)" "(x:z | g)"
split_formula(y ~ (0 + x + z + x:z | g), add_lower_terms = FALSE)
#> [1] "1" "x + (0 + x | g)" "z + (0 + z | g)"
#> [4] "x:z + (0 + x:z | g)"
Created on 2025-08-23 with reprex v2.1.1
I would have expected the same output for the first split_formula() call, but different output for the second split_formula() call (namely, without the population-level terms x, z, and x:z added):
devtools::load_all("<path_to_projpred>")
#> ℹ Loading projpred
#> This is projpred version 2.9.0.9000.
split_formula(y ~ (x + z + x:z | g), add_lower_terms = FALSE)
#> [1] "1" "(1 | g)" "(x | g)" "(z | g)" "(x:z | g)"
split_formula(y ~ (0 + x + z + x:z | g), add_lower_terms = FALSE)
#> [1] "1" "(0 + x | g)" "(0 + z | g)"
#> [4] "(0 + x:z | g)"
The reason I expect it this way (rather than expecting the first split_formula() call to add the population-level terms x, z, and x:z) is the way split_group_term() behaves in case of group_intercept == TRUE (the object group_intercept is created inside of split_group_term()):
devtools::load_all("<path_to_projpred>")
#> ℹ Loading projpred
#> This is projpred version 2.9.0.9000.
split_formula(y ~ (x + z + x:z | g))
#> [1] "1" "(1 | g)"
#> [3] "x + (x | g)" "z + (z | g)"
#> [5] "x + z + x:z + (x + z + x:z | g)" "x + (1 | g)"
#> [7] "z + (1 | g)" "x + z + x:z + (1 | g)"
split_formula(y ~ (x + z + x:z | g), add_lower_terms = FALSE)
#> [1] "1" "(1 | g)" "(x | g)" "(z | g)" "(x:z | g)"
Created on 2025-08-23 with reprex v2.1.1
Lines https://github.com/fweber144/projpred/blob/c11bac4145ea4700bf54e774463834f6f90ed76d/R/formula.R#L430-L492 distinguish between the different add_lower_terms cases when group_intercept == TRUE. No such distinction exists for group_intercept == FALSE (lines https://github.com/fweber144/projpred/blob/c11bac4145ea4700bf54e774463834f6f90ed76d/R/formula.R#L493-L508), so I think it was simply forgotten to handle the different add_lower_terms cases there. I'll open a PR with the fix I have in mind.