
Commit d795787

Fixes several bugs with example scripts and notebooks.

Adds random sampling to acquisition maximization. This allows brute-force sampling and clever optimization to work together, and is more robust in situations where the acquisition function is very flat, with a hard-to-find maximum.

1 parent c52a0ac

File tree

9 files changed: +280 −244 lines


README.md

Lines changed: 28 additions & 10 deletions
```diff
@@ -1,31 +1,51 @@
 # Bayesian Optimization
 
-Pure Python implementation of bayesian global optimization with gaussian processes.
+Pure Python implementation of bayesian global optimization with gaussian
+processes.
 
     pip install git+https://github.com/fmfn/BayesianOptimization.git
 
-This is a constrained global optimization package built upon bayesian inference and gaussian process, that attempts to find the maximum value of an unknown function in as few iterations as possible. This technique is particularly suited for optimization of high cost functions, situations where the balance between exploration and exploitation is important.
+This is a constrained global optimization package built upon bayesian inference
+and gaussian process, that attempts to find the maximum value of an unknown
+function in as few iterations as possible. This technique is particularly
+suited for optimization of high cost functions, situations where the balance
+between exploration and exploitation is important.
+
+To get a grip of how this method and package works in the [examples](https://github.com/fmfn/BayesianOptimization/tree/master/examples)
+folder you can:
+- Checkout this [notebook](https://github.com/fmfn/BayesianOptimization/blob/master/examples/visualization.ipynb)
+with a step by step visualization of how this method works.
+- Go over this [script](https://github.com/fmfn/BayesianOptimization/blob/master/examples/usage.py)
+to become familiar with this packages basic functionalities.
+- Explore this [notebook](https://github.com/fmfn/BayesianOptimization/blob/master/examples/exploitation%20vs%20exploration.ipynb)
+exemplifying the balance between exploration and exploitation and how to
+control it.
+- Checkout these scripts ([sklearn](https://github.com/fmfn/BayesianOptimization/blob/master/examples/sklearn_example.py),
+[xgboost](https://github.com/fmfn/BayesianOptimization/blob/master/examples/xgboost_example.py))
+for examples of how to use this package to tune parameters of ML estimators
+using cross validation and bayesian optimization
 
-Checkout this [notebook](https://github.com/fmfn/BayesianOptimization/blob/master/examples/visualization.ipynb) with a step by step visualization of how this method works.
 
 ![BayesianOptimization in action](https://github.com/fmfn/BayesianOptimization/blob/master/examples/bo_example.png)
 
-Checkout the [examples](https://github.com/fmfn/BayesianOptimization/tree/master/examples) folder for more scripts with examples of how to use this package.
-
 ![BayesianOptimization in action](https://github.com/fmfn/BayesianOptimization/blob/master/examples/bayesian_optimization.gif)
 
+
+This project is under active development, if you find a bug, or anything that
+needs correction, please let me know.
+
 Installation
 ============
 
 ### Installation
 
-BayesianOptimization is not currently available on the PyPi's reporitories,
+BayesianOptimization is not currently available on the PyPi's reporitories,
 however you can install it via `pip`:
 
     pip install git+https://github.com/fmfn/BayesianOptimization.git
 
-If you prefer, you can clone it and run the setup.py file. Use the following commands to get a
-copy from Github and install all dependencies:
+If you prefer, you can clone it and run the setup.py file. Use the following
+commands to get a copy from Github and install all dependencies:
 
     git clone https://github.com/fmfn/BayesianOptimization.git
     cd BayesianOptimization
@@ -36,8 +56,6 @@ copy from Github and install all dependencies:
 * Scipy
 * Scikit-learn
 
-Disclaimer: This project is under active development, if you find a bug, or anything that needs correction, please let me know.
-
 ### References:
 * http://papers.nips.cc/paper/4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf
 * http://arxiv.org/pdf/1012.2599v1.pdf
```

bayes_opt/bayesian_optimization.py

Lines changed: 7 additions & 4 deletions
```diff
@@ -5,7 +5,6 @@
 from sklearn.gaussian_process import GaussianProcessRegressor
 from sklearn.gaussian_process.kernels import Matern
 from .helpers import UtilityFunction, unique_rows, PrintLog, acq_max
-__author__ = 'fmfn'
 
 
 class BayesianOptimization(object):
@@ -88,7 +87,8 @@ def init(self, init_points):
         """
 
         # Generate random points
-        l = [np.random.uniform(x[0], x[1], size=init_points) for x in self.bounds]
+        l = [np.random.uniform(x[0], x[1], size=init_points)
+             for x in self.bounds]
 
         # Concatenate new random points to possible existing
         # points from self.explore method.
@@ -172,7 +172,9 @@ def initialize_df(self, points_df):
         Method to introduce point for which the target function
         value is known from pandas dataframe file
 
-        :param points_df: pandas dataframe with columns (target, {list of columns matching self.keys})
+        :param points_df:
+            pandas dataframe with columns (target, {list of columns matching
+            self.keys})
 
         ex:
             target  alpha  colsample_bytree  gamma
@@ -333,7 +335,8 @@ def points_to_csv(self, file_name):
         After training all points for which we know target variable
         (both from initialization and optimization) are saved
 
-        :param file_name: name of the file where points will be saved in the csv format
+        :param file_name: name of the file where points will be saved in the csv
+            format
 
         :return: None
         """
```

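The `init` hunk above draws one array of `init_points` uniform samples per bound and later stacks them into candidate points. A minimal standalone sketch of that sampling pattern (the `bounds` values and variable names here are illustrative, not the package's API):

```python
import numpy as np

# Two hypothetical parameter bounds, one (lower, upper) pair per dimension.
bounds = [(-2.0, 2.0), (0.0, 10.0)]
init_points = 5

# As in the diff: one uniform draw of `init_points` values per bound.
l = [np.random.uniform(lo, hi, size=init_points) for lo, hi in bounds]

# Stacking the per-dimension columns gives one candidate point per row.
points = np.asarray(l).T  # shape (init_points, n_dimensions)
```

Each row of `points` is then a random starting point at which the target function can be evaluated before the Gaussian process is fit.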
bayes_opt/helpers.py

Lines changed: 25 additions & 16 deletions
```diff
@@ -8,8 +8,11 @@
 
 def acq_max(ac, gp, y_max, bounds):
     """
-    A function to find the maximum of the acquisition function using
-    the 'L-BFGS-B' method.
+    A function to find the maximum of the acquisition function
+
+    It uses a combination of random sampling (cheap) and the 'L-BFGS-B'
+    optimization method. First by sampling 1e5 points at random, and then
+    running L-BFGS-B from 250 random starting points.
 
     Parameters
     ----------
@@ -31,24 +34,27 @@ def acq_max(ac, gp, y_max, bounds):
     :return: x_max, The arg max of the acquisition function.
     """
 
-    # Start with the lower bound as the argmax
-    x_max = bounds[:, 0]
-    max_acq = None
-
+    # Warm up with random points
     x_tries = np.random.uniform(bounds[:, 0], bounds[:, 1],
-                                size=(100, bounds.shape[0]))
-
-    for x_try in x_tries:
+                                size=(100000, bounds.shape[0]))
+    ys = ac(x_tries, gp=gp, y_max=y_max)
+    x_max = x_tries[ys.argmax()]
+    max_acq = ys.max()
+
+    # Explore the parameter space more throughly
+    x_seeds = np.random.uniform(bounds[:, 0], bounds[:, 1],
+                                size=(250, bounds.shape[0]))
+    for x_try in x_seeds:
         # Find the minimum of minus the acquisition function
         res = minimize(lambda x: -ac(x.reshape(1, -1), gp=gp, y_max=y_max),
                        x_try.reshape(1, -1),
                        bounds=bounds,
                        method="L-BFGS-B")
 
         # Store it if better than previous minimum(maximum).
-        if max_acq is None or -res.fun >= max_acq:
+        if max_acq is None or -res.fun[0] >= max_acq:
             x_max = res.x
-            max_acq = -res.fun
+            max_acq = -res.fun[0]
 
     # Clip output to make sure it lies within the bounds. Due to floating
     # point technicalities this is not always the case.
@@ -165,7 +171,8 @@ def print_header(self, initialization=True):
         print("{}Bayesian Optimization{}".format(BColours.RED,
                                                  BColours.ENDC))
 
-        print(BColours.BLUE + "-" * (29 + sum([s + 5 for s in self.sizes])) + BColours.ENDC)
+        print(BColours.BLUE + "-" * (29 + sum([s + 5 for s in self.sizes])) +
+              BColours.ENDC)
 
         print("{0:>{1}}".format("Step", 5), end=" | ")
         print("{0:>{1}}".format("Time", 6), end=" | ")
@@ -193,10 +200,12 @@ def print_step(self, x, y, warning=False):
                   end=" | ")
 
         for index in self.sorti:
-            print("{0}{2: >{3}.{4}f}{1}".format(BColours.GREEN, BColours.ENDC,
-                                                x[index],
-                                                self.sizes[index] + 2,
-                                                min(self.sizes[index] - 3, 6 - 2)),
+            print("{0}{2: >{3}.{4}f}{1}".format(
+                BColours.GREEN, BColours.ENDC,
+                x[index],
+                self.sizes[index] + 2,
+                min(self.sizes[index] - 3, 6 - 2)
+            ),
                   end=" | ")
         else:
             print("{: >10.5f}".format(y), end=" | ")
```

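The new `acq_max` strategy above, a cheap vectorized random warm-up followed by L-BFGS-B refinement from a few random seeds, can be sketched standalone. This is a simplified illustration, not the package's exact code: the GP-backed acquisition is replaced by a toy vectorized function, and names such as `maximize_acq`, `n_warmup`, and `n_seeds` are invented for the sketch.

```python
import numpy as np
from scipy.optimize import minimize


def maximize_acq(ac, bounds, n_warmup=10000, n_seeds=25, seed=0):
    """Maximize `ac` over a box via random warm-up plus L-BFGS-B polishing.

    ac     : vectorized function taking an (n, d) array, returning (n,) values
    bounds : (d, 2) array of (lower, upper) per dimension
    """
    rng = np.random.RandomState(seed)
    d = bounds.shape[0]

    # Stage 1: evaluate many random points in one vectorized call (cheap).
    # This keeps a sensible incumbent even when the surface is very flat.
    x_tries = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_warmup, d))
    ys = ac(x_tries)
    x_max, max_acq = x_tries[ys.argmax()], ys.max()

    # Stage 2: refine with L-BFGS-B from a handful of random seed points.
    # (The commit indexes res.fun[0]; recent SciPy returns a scalar here.)
    x_seeds = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_seeds, d))
    for x_try in x_seeds:
        res = minimize(lambda x: -ac(x.reshape(1, -1))[0],
                       x_try, bounds=bounds, method="L-BFGS-B")
        if -res.fun >= max_acq:
            x_max, max_acq = res.x, -res.fun

    # Clip to guard against tiny floating-point excursions past the bounds.
    return np.clip(x_max, bounds[:, 0], bounds[:, 1])


# Toy acquisition: a narrow bump at x = 0.3, easy to miss with few samples.
acq = lambda X: np.exp(-200.0 * (X[:, 0] - 0.3) ** 2)
x_best = maximize_acq(acq, np.array([[0.0, 1.0]]))
```

The warm-up supplies a good incumbent and starting region for flat acquisition surfaces, while the gradient-based polish recovers the precision that pure random sampling lacks; this mirrors the robustness argument in the commit message.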