Description
A great feature of tune_bayes() and tune_sim_anneal() is the initial argument, which can be the output from a previous tune_grid() call.
Unfortunately, this feature becomes quite cumbersome for workflow_set objects. This stack post shows that passing workflow_map(initial = tune_grid) won't work, and the suggested answer is a rather verbose and manual series of repeated option_add() calls, one per model:
workflow_set() %>%
  ## add for one model
  option_add(
    id = "recipe_lasso",
    initial = tune_grid_res %>% extract_workflow_set_result("recipe_lasso")
  ) %>%
  ## repeat for every model....
  option_add(..) %>%
  option_add(..) %>%
  workflow_map()
This is also slightly counter-intuitive, as option_add() is applied to the initial workflow_set, not to the workflow_map() call.
Would it be possible to have an explicit argument initial in workflow_map() that would take a previous workflow_set as input and map each of its results to the underlying modelling function?
Thanks!
Below is an example, together with a manual solution function option_add_initial(), which is quite... untidy!
library(tidymodels)

# Load and prepare data: keep only the integer and numeric columns
ames_data <- ames[, sapply(ames, class) %in% c("integer", "numeric")]

# Define a recipe
recipe <- recipe(Sale_Price ~ ., data = ames_data) %>%
  step_normalize(all_predictors())

# Define models
lasso_model <- linear_reg(penalty = tune(), mixture = 1) %>%
  set_engine("glmnet")
rf_model <- rand_forest(min_n = tune(), trees = 100) %>%
  set_engine("ranger") %>%
  set_mode("regression")

# Create workflows
lasso_wf <- workflow() %>%
  add_model(lasso_model) %>%
  add_recipe(recipe)
rf_wf <- workflow() %>%
  add_model(rf_model) %>%
  add_recipe(recipe)

cross_val <- vfold_cv(ames_data, v = 5)

# Initial grid search over every workflow in the set
tune_grid_res <-
  workflow_set(
    preproc = list(recipe),
    models = list(lasso = lasso_model, rf = rf_model)
  ) %>%
  workflow_map("tune_grid", resamples = cross_val, grid = 10)
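
# Manual workaround: attach each grid result as the `initial` option of the
# workflow with the same id
option_add_initial <- function(workflow_set, workflow_with_grid_results) {
  res <- workflow_set
  # seq_len() avoids iterating over c(1, 0) when there are no rows
  for (i in seq_len(nrow(workflow_with_grid_results))) {
    id_i <- workflow_with_grid_results$wflow_id[[i]]
    res <- res %>%
      option_add(
        id = id_i,
        initial = workflow_with_grid_results %>% extract_workflow_set_result(id_i)
      )
  }
  res
}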

tune_bayes_res <- workflow_set(
  preproc = list(recipe),
  models = list(lasso = lasso_model, rf = rf_model)
) %>%
  option_add_initial(workflow_with_grid_results = tune_grid_res) %>%
  workflow_map("tune_bayes", resamples = cross_val)
#> ! All of the rmse values were identical. The Gaussian process model cannot be
#> fit to the data. Try expanding the range of the tuning parameters.
#> → A | error: Infinite values of the Deviance Function,
#> unable to find optimum parameters
#>
#> There were issues with some computations A: x1
#> ✖ Optimization stopped prematurely; returning current results.
#> There were issues with some computations A: x1
tune_bayes_res
#> # A workflow set/tibble: 2 × 4
#>   wflow_id     info             option    result
#>   <chr>        <list>           <list>    <list>
#> 1 recipe_lasso <tibble [1 × 4]> <opts[2]> <tune[+]>
#> 2 recipe_rf    <tibble [1 × 4]> <opts[2]> <tune[+]>
Created on 2025-03-21 with reprex v2.1.1
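
For illustration, here is a minimal sketch of how the requested behaviour could be emulated today with a small wrapper. Both add_initial_from() and workflow_map_initial() are hypothetical names, not part of workflowsets. The sketch assumes the two workflow sets share wflow_id values, and only uses option_add(), extract_workflow_set_result() and workflow_map() as they appear above.

```r
library(tidymodels)

# Hypothetical helper (not part of workflowsets): for every workflow id that
# appears in both sets, store the earlier result as that workflow's `initial`
# option. Ids found in only one of the two sets are left untouched.
add_initial_from <- function(wf_set, previous) {
  ids <- intersect(wf_set$wflow_id, previous$wflow_id)
  purrr::reduce(
    ids,
    function(acc, id) {
      option_add(
        acc,
        id = id,
        initial = extract_workflow_set_result(previous, id)
      )
    },
    .init = wf_set
  )
}

# Hypothetical wrapper emulating an `initial` argument for workflow_map():
# it attaches the per-workflow options first, then maps as usual.
workflow_map_initial <- function(object, fn = "tune_bayes", initial, ...) {
  object %>%
    add_initial_from(initial) %>%
    workflow_map(fn, ...)
}

# Usage with the objects from the reprex above (not run here):
# tune_bayes_res <- workflow_set(
#   preproc = list(recipe),
#   models = list(lasso = lasso_model, rf = rf_model)
# ) %>%
#   workflow_map_initial("tune_bayes", initial = tune_grid_res, resamples = cross_val)
```

Using purrr::reduce() in place of the explicit loop keeps the helper pipe-friendly, but it is still a workaround on top of option_add() rather than a substitute for a built-in argument.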