Description
Even when controlling for every factor, it appears difficult to get the same results when calling xrf::xrf() directly versus calling it via rules::rule_fit().
Example:

``` r
library(tidymodels)
library(rules)
#> 
#> Attaching package: 'rules'
#> The following object is masked from 'package:dials':
#> 
#>     max_rules
library(xrf)

set.seed(1)
ex_data <- modeldata::hpc_data |> slice_sample(n = 100, by = class)

set.seed(2)
xrf_fit <- xrf(class ~ ., data = ex_data, family = "multinomial")
#> Warning in xrf(class ~ ., data = ex_data, family = "multinomial"): Detected 4
#> classes to set num_class xgb_control parameter

set.seed(2)
rules_fit <-
  rule_fit() |>
  set_engine("xrf", seed = 0) |>
  set_mode("classification") |>
  fit(class ~ ., data = ex_data)

xrf_fit$xgb |> xgboost::xgb.dump() |> head()
#> [1] "booster[0]"
#> [2] "0:[compounds<197] yes=1,no=2,missing=2"
#> [3] "1:[iterations<50] yes=3,no=4,missing=4"
#> [4] "3:[protocolM<2.00001001] yes=7,no=8,missing=8"
#> [5] "7:leaf=-0.00722891558"
#> [6] "8:leaf=0.352313161"

rules_fit$fit$xgb |> xgboost::xgb.dump() |> head()
#> [1] "booster[0]"
#> [2] "0:[compounds<197] yes=1,no=2,missing=2"
#> [3] "1:[iterations<50] yes=3,no=4,missing=4"
#> [4] "3:[protocolM<1] yes=7,no=8,missing=8"
#> [5] "7:[protocolH<1] yes=13,no=14,missing=14"
#> [6] "13:[protocolO<1] yes=23,no=24,missing=24"
```

Created on 2026-01-13 with reprex v2.1.1
It turns out that one possible source of the discrepancy is the objective function argument: for multinomial data, xrf uses `"multi:softmax"` while tidymodels uses `"multi:softprob"`.
xrf::xrf() hard-codes `objective = "multi:softmax"` in its internal xrf::get_xgboost_objective() function; there is no way to pass a different value.

rules passes things off to parsnip::xgb_train() but explicitly sets `objective = NULL`, so the user has no way to override the objective there either. For multinomial data, tidymodels then automatically sets `objective = "multi:softprob"`.
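For context, the two objectives differ in what `predict()` returns from the fitted booster. A minimal sketch calling xgboost directly (toy data chosen only for illustration; independent of xrf/rules):

``` r
library(xgboost)

# Toy 3-class data, 0-based labels as xgboost expects
x <- as.matrix(iris[, 1:4])
y <- as.integer(iris$Species) - 1

# "multi:softmax": predict() returns one class index per row
fit_softmax <- xgboost(
  data = x, label = y, nrounds = 5, verbose = 0,
  objective = "multi:softmax", num_class = 3
)
head(predict(fit_softmax, x))

# "multi:softprob": predict() returns per-class probabilities,
# which tidymodels needs for predict(type = "prob")
fit_softprob <- xgboost(
  data = x, label = y, nrounds = 5, verbose = 0,
  objective = "multi:softprob", num_class = 3
)
head(predict(fit_softprob, x, reshape = TRUE))  # n x 3 probability matrix
```

Beyond the prediction format, the two objectives also produce different trees here, which is consistent with the diverging xgb.dump() output above.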
The use of parsnip::xgb_train() (instead of just passing everything to xrf::xrf()) came about in #60.
One possible short-term solution is to enable passing the objective function through to parsnip::xgb_train(). We'll need to ensure that it doesn't break the early stopping feature implemented in #60.

We might also want xrf itself to allow modifying the objective function, or to change its default (since likelihood-based gradient boosting is the norm for these models).
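If the objective were exposed as a pass-through engine argument (a hypothetical API, not currently supported; `objective` here is the assumed argument name), usage might look like:

``` r
# Hypothetical: objective forwarded via set_engine() to parsnip::xgb_train(),
# restoring parity with xrf::xrf()'s default of "multi:softmax"
rules_fit <-
  rule_fit() |>
  set_engine("xrf", seed = 0, objective = "multi:softmax") |>
  set_mode("classification") |>
  fit(class ~ ., data = ex_data)
```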