
Commit affb52a

Edit Vignettes
1 parent 56a0df9 commit affb52a

7 files changed: 49 additions, 47 deletions

R/BayesianOptimization.R

Lines changed: 1 addition & 1 deletion
@@ -191,7 +191,7 @@ BayesianOptimization <- function(
  if (!initialize & nrow(leftOff) == 0) stop("initialize cannot be FALSE if leftOff is not provided. Set initialize to TRUE and provide either initGrid or initPoints. You can provide leftOff AND initialize if you want.\n")
  if (initialize & nrow(initGrid) == 0 & initPoints <= 0) stop("initialize is TRUE but neither initGrid or initPoints were provided")
  if (initPoints > 0 & nrow(initGrid)>0) stop("initGrid and initPoints are specified, choose one.")
- if (initPoints <= 0 & nrow(initGrid)==0) stop("neither initGrid or initPoints are specified, choose one or provide leftOff")
+ if (initPoints <= 0 & nrow(initGrid)==0 & nrow(leftOff) == 0) stop("neither initGrid or initPoints are specified, choose one or provide leftOff")
  if (parallel & (Workers == 1)) stop("parallel is set to TRUE but no back end is registered.\n")
  if (!parallel & Workers > 1 & verbose > 0) cat("parallel back end is registered, but parallel is set to false. Process will not be run in parallel.\n")
  if (nrow(initGrid)>0) {
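
Note on this hunk: the added `nrow(leftOff) == 0` term appears to let a run proceed from a `leftOff` table alone. The following is a minimal sketch of the old and new conditions, assuming `leftOff` and `initGrid` are tables with zero rows when they are not supplied. The objects are hypothetical stand-ins, not the package's internals.

```r
# Hypothetical stand-ins for the arguments checked on line 194.
leftOff    <- data.frame(max_depth = 5L, Score = 0.9)  # results from a previous run
initGrid   <- data.frame()                             # not supplied
initPoints <- 0                                        # not supplied

# Condition before the commit: fires even though leftOff was provided.
old_stop <- initPoints <= 0 & nrow(initGrid) == 0

# Condition after the commit: fires only when leftOff is empty as well.
new_stop <- initPoints <= 0 & nrow(initGrid) == 0 & nrow(leftOff) == 0

print(c(old = old_stop, new = new_stop))
```

With these inputs the old condition is `TRUE` (the call would stop) and the new one is `FALSE`, which matches the error message's own advice to "provide leftOff".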

R/applyCluster.R

Lines changed: 9 additions & 0 deletions
@@ -55,11 +55,20 @@ applyCluster <- function(e = parent.frame()) {
  , scaled = TRUE
  )

+ # Named vectors cannot be directly coerced to data.table
+ if (e$runNew-newPoints < 2) {
+ noisyP <- as.data.table(as.list(noisyP))
+ } else{
+ noisyP <- data.table(noisyP)
+ }
+
  if (nrow(fintersect(ScaleDT,data.table(noisyP))) == 0) {
  newSet <- rbind(clusterPoints[,(drop) := NULL],noisyP)
  break
  }

+ unique(rbind(ScaleDT,noisyP))
+
  tries <- tries + 1

  }
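
Note on this hunk: the new comment says named vectors cannot be coerced to a data.table directly. The sketch below shows the shape difference that the `as.list()` detour works around, assuming a single candidate point arrives as a named numeric vector. The parameter values are made up for illustration, and the example requires the data.table package.

```r
library(data.table)

# A single candidate point as a named numeric vector (illustrative values).
noisyP <- c(max_depth = 0.4, min_child_weight = 0.7, subsample = 0.1)

# data.table() treats the vector as one column, so each parameter becomes a row.
asColumn <- data.table(noisyP)

# Converting to a list first yields one row with one column per parameter.
asRow <- as.data.table(as.list(noisyP))

dim(asColumn)
dim(asRow)
names(asRow)
```

Only the one-row form has the same columns as a table of parameter sets, which is presumably why the single-point case is routed through `as.list()` before the `fintersect()` and `rbind()` calls.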

man/BayesianOptimization.Rd

Lines changed: 4 additions & 3 deletions
Some generated files are not rendered by default.

vignettes/Distributions.png

6.65 KB

vignettes/Optimums.png

8.36 KB

vignettes/advancedFeatures.Rmd

Lines changed: 13 additions & 21 deletions
@@ -58,7 +58,7 @@ scoringFunction <- function(max_depth, min_child_weight, subsample) {
  }

  bounds <- list( max_depth = c(2L, 10L)
- , min_child_weight = c(1L, 100L)
+ , min_child_weight = c(1, 100)
  , subsample = c(0.25, 1))

  kern <- "Matern52"
@@ -131,33 +131,25 @@ When a leftOff table is provided, depending on how the experiment is set up, one
  Keep in mind, if you change your bounds, you will need to delete any rows from your leftOff table that fall outside the bounds.

  ********
- ### Adjusting the noiseAdd parameter
+ ### Adjusting noiseAdd and minClusterUtility

- Once we have extracted the next expected optimal parameter set from the Gaussian process, we have several decisions to make. We can run 1 new scoringFunction at the new parameter, or we can run the scoring function n times in parallel at n different parameter sets. If we run several scoringFunctions in parallel, we need to decide where the other n-1 parameter sets come from. For the sake of decreasing uncertainty around the estimated optimal parameter, this process pulls the other n-1 parameter sets from a shape(4,4) beta distribution centered at the estimated optimal parameter.
+ If we want to run n scoring functions in parallel, an important decision we need to make is how to choose the next n parameter candidate sets. One logical choice is the parameter set which maximizes our acquisition function. However, we still need to decide on the other n-1 sets. There are two good choices:

- As an example, let's say our min_child_weight is bounded between [0,10] and the Gaussian process says that our acquisition function is maximized at min_child_weight = 6. We can control how the process randomly samples around this point by using the noiseAdd parameter, which tells the process the percentage of the range specified by ```bounds``` to sample:
+ 1. Add noise to the global optimum to sample nearby points.
+ 2. Determine if there are other local optimums which may be nearly as good
+
+ This package allows you to do both. Using the ```minClusterUtility``` parameter, you can specify the minimum percentage utility of the global optimum required for a different local optimum to be considered. As an example, let's say we are optimizing 1 hyperparameter ```min_child_weight```, which is bounded between [0,10]. Our acquisition function may look like the following:

  ```{r eval = TRUE, echo=FALSE}
- library(ggplot2)
-
- y1 <- function(x) {
- (x-7)^3*(5-(x))^3/15
- }
+ knitr::include_graphics("Optimums.png")
+ ```

- y2 <- function(x) {
- (x-8)^3*(4-(x))^3/1700
- }
+ In this case, we may want to run our scoring function on both the global and local maximum. If ```minClusterUtility``` is set to be at least 1.83/2.14 ~ 0.855, the process would use both the local and global maximums as candidate parameter sets in the next round.

- ggplot(data.frame(x=c(0,10)), aes(x)) +
- stat_function(fun = y1, geom = "line", aes(colour = "red"), xlim = c(5,7)) +
- stat_function(fun = y2, geom = "line", aes(colour = "blue"), xlim = c(4,8)) +
- scale_x_continuous(name = "min_child_weight", breaks = seq(0,10,1), limits = c(0,10)) +
- scale_y_continuous(limits=c(0,0.075)) +
- scale_color_discrete(name = "noiseAdd", labels = c("0.4", "0.2")) +
- ylab("Density") +
- ggtitle("Distributions Sampled for Different noiseAdd Values") +
- theme(plot.margin=unit(c(0.5,1,0.5,0.5),"cm"))
+ However, this doesn't fully solve our problem. In the example above, we had 2 local maximums, but what if we want to run 10 instances of our scoring function in parallel? We would need to come up with 8 other sets of parameters. For the sake of decreasing uncertainty around the most promising parameter sets, this process samples from a shape(4,4) beta distribution centered at the estimated optimal parameters. In the example above, our acquisition function was maximized at ```min_child_weight = 4```. The figure below shows the effect that adjusting the ```noiseAdd``` parameter has on how we draw the other 8 candidate parameter sets:

+ ```{r eval = TRUE, echo=FALSE}
+ knitr::include_graphics("Distributions.png")
  ```

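
Note on this hunk: the new vignette text describes two mechanisms, a utility threshold for local optimums and beta-distributed noise around an optimum. The sketch below illustrates both under stated assumptions and is not the package's actual implementation. It assumes `minClusterUtility` is a lower bound on the ratio of local to global utility (as the "minimum percentage utility" wording suggests) and that `noiseAdd` scales the support of a recentred beta(4, 4) draw to a fraction of the bounded range. The helpers `keepsLocal` and `sampleNear` are hypothetical names.

```r
# Utilities quoted in the vignette text for the example figure.
globalUtility <- 2.14
localUtility  <- 1.83
ratio <- localUtility / globalUtility   # about 0.855

# Assumed rule: keep a local optimum when its share of the global
# utility is at least minClusterUtility.
keepsLocal <- function(minClusterUtility) ratio >= minClusterUtility

# Assumed sampling scheme: a beta(4, 4) draw, recentred on the optimum and
# scaled so that its support spans noiseAdd * (upper - lower).
sampleNear <- function(n, optimum, lower, upper, noiseAdd) {
  width <- noiseAdd * (upper - lower)
  draws <- optimum + (rbeta(n, shape1 = 4, shape2 = 4) - 0.5) * width
  pmin(pmax(draws, lower), upper)  # keep candidates inside the bounds
}

set.seed(1)
narrow <- sampleNear(8, optimum = 4, lower = 0, upper = 10, noiseAdd = 0.2)
wide   <- sampleNear(8, optimum = 4, lower = 0, upper = 10, noiseAdd = 0.4)
range(narrow)
range(wide)
```

Under these assumptions, the 8 extra candidates fall within [3, 5] for `noiseAdd = 0.2` and within [2, 6] for `noiseAdd = 0.4`. The local maximum qualifies only while `minClusterUtility` is no larger than about 0.855.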

vignettes/standardFeatures.Rmd

Lines changed: 22 additions & 22 deletions
@@ -40,9 +40,9 @@ library("ParBayesianOptimization")

  data(agaricus.train, package = "xgboost")

- Folds <- list( Fold1 = as.integer(seq(1,nrow(agaricus.train$data),by = 3))
- , Fold2 = as.integer(seq(2,nrow(agaricus.train$data),by = 3))
- , Fold3 = as.integer(seq(3,nrow(agaricus.train$data),by = 3)))
+ Folds <- list(Fold1 = as.integer(seq(1,nrow(agaricus.train$data),by = 3))
+ , Fold2 = as.integer(seq(2,nrow(agaricus.train$data),by = 3))
+ , Fold3 = as.integer(seq(3,nrow(agaricus.train$data),by = 3)))
  ```

  Now we need to define the scoring function. This function should, at a minimum, return a list with a ```Score``` element, which is the model evaluation metric we want to maximize. We can also retain other pieces of information created by the scoring function by including them as named elements of the returned list. In this case, we want to retain the optimal number of rounds determined by the ```xgb.cv```:
@@ -53,24 +53,24 @@ scoringFunction <- function(max_depth, min_child_weight, subsample) {
  dtrain <- xgb.DMatrix(agaricus.train$data,label = agaricus.train$label)

  Pars <- list( booster = "gbtree"
- , eta = 0.01
- , max_depth = max_depth
- , min_child_weight = min_child_weight
- , subsample = subsample
- , objective = "binary:logistic"
- , eval_metric = "auc")
-
- xgbcv <- xgb.cv(params = Pars,
- data = dtrain
- , nround = 100
- , folds = Folds
- , prediction = TRUE
- , showsd = TRUE
- , early_stopping_rounds = 5
- , maximize = TRUE
- , verbose = 0)
-
- return(list(Score = max(xgbcv$evaluation_log$test_auc_mean)
+ , eta = 0.01
+ , max_depth = max_depth
+ , min_child_weight = min_child_weight
+ , subsample = subsample
+ , objective = "binary:logistic"
+ , eval_metric = "auc")
+
+ xgbcv <- xgb.cv(params = Pars
+ , data = dtrain
+ , nround = 100
+ , folds = Folds
+ , prediction = TRUE
+ , showsd = TRUE
+ , early_stopping_rounds = 5
+ , maximize = TRUE
+ , verbose = 0)
+
+ return(list( Score = max(xgbcv$evaluation_log$test_auc_mean)
  , nrounds = xgbcv$best_iteration
  )
  )
@@ -86,7 +86,7 @@ Some other objects we need to define are the bounds, GP kernel and acquisition f

  ```{r eval = TRUE}
  bounds <- list( max_depth = c(2L, 10L)
- , min_child_weight = c(1L, 100L)
+ , min_child_weight = c(1, 100)
  , subsample = c(0.25, 1))

  kern <- "Matern52"
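
Note on this hunk: most of the changes here are whitespace only. The one substantive edit drops the `L` suffix from the `min_child_weight` bounds, which changes the vector's type from integer to double. The sketch below checks the stored types, assuming, as the retained `2L, 10L` bounds for `max_depth` suggest, that the suffix is how a hyperparameter is marked as integer-valued.

```r
# Bounds as they read after this commit.
bounds <- list( max_depth = c(2L, 10L)
              , min_child_weight = c(1, 100)
              , subsample = c(0.25, 1))

# TRUE only for bounds written with the L suffix.
isIntegerBound <- vapply(bounds, is.integer, logical(1))
print(isIntegerBound)
```

Only `max_depth` reports `TRUE`, so under that assumption `min_child_weight` would now be searched as a continuous value instead of being restricted to whole numbers.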
