
Commit 1c51f43

Adds favicon, writes the docs for cma-es
1 parent 8cc52bf commit 1c51f43

7 files changed (+64, -10 lines)

docs/protein-optimization/_config.yml

Lines changed: 4 additions & 1 deletion
@@ -1,7 +1,7 @@
# Book settings
# Learn more at https://jupyterbook.org/customize/config.html

- title: Protein optimization
+ title: Documentation for poli and poli-baselines
author: Center for Basic Machine Learning Research in Life Science
logo: MLLS_concept_color.png

@@ -21,6 +21,8 @@ bibtex_bibfiles:

sphinx:
  config:
+     latex_elements:
+       preamble: "\\usepackage{bm}"
    bibtex_reference_style: author_year
    autodoc_mock_imports:
      - Bio

@@ -62,3 +64,4 @@ repository:

html:
  use_issues_button: true
  use_repository_button: true
+   favicon: "favicon.png"

docs/protein-optimization/favicon.png

Binary file added: 610 Bytes

docs/protein-optimization/index.md

Lines changed: 5 additions & 4 deletions
@@ -5,6 +5,11 @@ This page contains documentation on how to use `poli`, a library of discrete obj
A core feature of `poli` is isolating calls to complicated objective functions which might, for example, depend on simulators, binaries, and highly specific package requirements.
Our promise is: if you can run your objective function reliably in a `conda` environment, then you can register it and call it from other projects and environments without having to worry about re-installing all the dependencies.

+ ## Get started!
+
+ A good place to start is the next chapter! [Go to Getting Started](./getting_started/getting_started.md).
+
+
## Black-box objective functions

[For a full list, click here](./using_poli/objective_repository/all_objectives.md).
@@ -142,10 +147,6 @@ Learning continuous representations and optimizing in latent space. [WIP]

::::

- ## Get started!
-
- A good place to start is the next chapter! [Go to Getting Started](./getting_started/getting_started.md).
-

## Contribute problems or solvers

docs/protein-optimization/references.bib

Lines changed: 10 additions & 1 deletion
@@ -105,7 +105,7 @@ @article{Shahriari:BOReview:2016
@inproceedings{Kirschner:LineBO:2019, title={Adaptive and Safe Bayesian Optimization in High Dimensions via One-Dimensional Subspaces}, ISSN={2640-3498}, url={https://proceedings.mlr.press/v97/kirschner19a.html}, abstractNote={Bayesian optimization is known to be difficult to scale to high dimensions, because the acquisition step requires solving a non-convex optimization problem in the same search space. In order to scale the method and keep its benefits, we propose an algorithm (LineBO) that restricts the problem to a sequence of iteratively chosen one-dimensional sub-problems that can be solved efficiently. We show that our algorithm converges globally and obtains a fast local rate when the function is strongly convex. Further, if the objective has an invariant subspace, our method automatically adapts to the effective dimension without changing the algorithm. When combined with the SafeOpt algorithm to solve the sub-problems, we obtain the first safe Bayesian optimization algorithm with theoretical guarantees applicable in high-dimensional settings. We evaluate our method on multiple synthetic benchmarks, where we obtain competitive performance. Further, we deploy our algorithm to optimize the beam intensity of the Swiss Free Electron Laser with up to 40 parameters while satisfying safe operation constraints.}, booktitle={Proceedings of the 36th International Conference on Machine Learning}, publisher={PMLR}, author={Kirschner, Johannes and Mutny, Mojmir and Hiller, Nicole and Ischebeck, Rasmus and Krause, Andreas}, year={2019}, month=may, pages={3429–3438}, language={en} }


- @article{Balandat:botorch:2020, title={BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization}, url={http://arxiv.org/abs/1910.06403}, abstractNote={Bayesian optimization provides sample-efficient global optimization for a broad range of applications, including automatic machine learning, engineering, physics, and experimental design. We introduce BoTorch, a modern programming framework for Bayesian optimization that combines Monte-Carlo (MC) acquisition functions, a novel sample average approximation optimization approach, auto-differentiation, and variance reduction techniques. BoTorch’s modular design facilitates flexible specification and optimization of probabilistic models written in PyTorch, simplifying implementation of new acquisition functions. Our approach is backed by novel theoretical convergence results and made practical by a distinctive algorithmic foundation that leverages fast predictive distributions, hardware acceleration, and deterministic optimization. We also propose a novel “one-shot” formulation of the Knowledge Gradient, enabled by a combination of our theoretical and software contributions. In experiments, we demonstrate the improved sample efficiency of BoTorch relative to other popular libraries.}, note={arXiv:1910.06403 [cs, math, stat]}, number={arXiv:1910.06403}, publisher={arXiv}, author={Balandat, Maximilian and Karrer, Brian and Jiang, Daniel R. and Daulton, Samuel and Letham, Benjamin and Wilson, Andrew Gordon and Bakshy, Eytan}, year={2020}, month=dec }
+ @article{Balandat:botorch:2020, title={BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization}, url={http://arxiv.org/abs/1910.06403}, abstractNote={Bayesian optimization provides sample-efficient global optimization for a broad range of applications, including automatic machine learning, engineering, physics, and experimental design. We introduce BoTorch, a modern programming framework for Bayesian optimization that combines Monte-Carlo (MC) acquisition functions, a novel sample average approximation optimization approach, auto-differentiation, and variance reduction techniques. BoTorch’s modular design facilitates flexible specification and optimization of probabilistic models written in PyTorch, simplifying implementation of new acquisition functions. Our approach is backed by novel theoretical convergence results and made practical by a distinctive algorithmic foundation that leverages fast predictive distributions, hardware acceleration, and deterministic optimization. We also propose a novel “one-shot” formulation of the Knowledge Gradient, enabled by a combination of our theoretical and software contributions. In experiments, we demonstrate the improved sample efficiency of BoTorch relative to other popular libraries.}, note={arXiv:1910.06403 [cs, math, stat]}, number={arXiv:1910.06403}, publisher={arXiv}, journal={arXiv}, author={Balandat, Maximilian and Karrer, Brian and Jiang, Daniel R. and Daulton, Samuel and Letham, Benjamin and Wilson, Andrew Gordon and Bakshy, Eytan}, year={2020}, month=dec }

@inproceedings{gardner:gpytorch:2018,
  title={GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration},
@@ -116,3 +116,12 @@ @inproceedings{gardner:gpytorch:2018

@article{GomezBombarelli:VAEsAndOpt:2018, title={Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules}, volume={4}, ISSN={2374-7943}, DOI={10.1021/acscentsci.7b00572}, abstractNote={We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration and optimization through open-ended spaces of chemical compounds. A deep neural network was trained on hundreds of thousands of existing chemical structures to construct three coupled functions: an encoder, a decoder, and a predictor. The encoder converts the discrete representation of a molecule into a real-valued continuous vector, and the decoder converts these continuous vectors back to discrete molecular representations. The predictor estimates chemical properties from the latent continuous vector representation of the molecule. Continuous representations of molecules allow us to automatically generate novel chemical structures by performing simple operations in the latent space, such as decoding random vectors, perturbing known chemical structures, or interpolating between molecules. Continuous representations also allow the use of powerful gradient-based optimization to efficiently guide the search for optimized functional compounds. We demonstrate our method in the domain of drug-like molecules and also in a set of molecules with fewer that nine heavy atoms.}, number={2}, journal={ACS Central Science}, publisher={American Chemical Society}, author={Gómez-Bombarelli, Rafael and Wei, Jennifer N. and Duvenaud, David and Hernández-Lobato, José Miguel and Sánchez-Lengeling, Benjamín and Sheberla, Dennis and Aguilera-Iparraguirre, Jorge and Hirzel, Timothy D. and Adams, Ryan P. and Aspuru-Guzik, Alán}, year={2018}, month=feb, pages={268–276} }

+ @inproceedings{Hansen:CMA-ES:1996,
+   author={Hansen, N. and Ostermeier, A.},
+   booktitle={Proceedings of IEEE International Conference on Evolutionary Computation},
+   title={Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation},
+   year={1996},
+   pages={312--317},
+   doi={10.1109/ICEC.1996.542381}}

docs/protein-optimization/using_poli_baselines/bayesian_optimization.md

Lines changed: 2 additions & 2 deletions
@@ -4,9 +4,9 @@

## About

- Bayesian Optimization is a sample-efficient black box optimization algorithm which uses an uncertainty-aware approximation $\tilde{f}(\bm{x})$ of the objective function $f$. This surrogate model $\tilde{f}$ is usually a Gaussian Process, whose predictions and uncertainties are used to build an _acquisition function_ $\alpha(\bm{x})$. Optimizing $\alpha$ renders points that are _likely_ to perform well for $f$. By smartly including uncertainties in $\alpha$, Bayesian Optimization balances exploration and exploitation.
+ Bayesian Optimization is a sample-efficient black-box optimization algorithm which uses an uncertainty-aware approximation $\tilde{f}(\boldsymbol{x})$ of the objective function $f$. This surrogate model $\tilde{f}$ is usually a Gaussian Process, whose predictions and uncertainties are used to build an _acquisition function_ $\alpha(\boldsymbol{x})$. Optimizing $\alpha$ yields points that are _likely_ to perform well for $f$. By incorporating uncertainty into $\alpha$, Bayesian Optimization balances exploration and exploitation.

- Our implementation uses `gpytorch` and `botorch` as the engines for Bayesian Optimization {cite+p}`Balandat:botorch:2020,gardner:gpytorch:2018`. We use the default `botorch` single-task Gaussian Process, and we optimize the acquisition function using grid-search for 1 and 2 dimensions, and using `botorch`'s utilities from 3 onwards.
+ Our implementation uses `gpytorch` and `botorch` as the engines for Bayesian Optimization {cite:p}`Balandat:botorch:2020,gardner:gpytorch:2018`. We use the default `botorch` single-task Gaussian Process, and we optimize the acquisition function using grid search in 1 and 2 dimensions, and `botorch`'s utilities from 3 dimensions onwards.

## How to run
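Note: the loop described in the updated paragraph can be sketched with `botorch` directly. The following is a minimal illustration with a toy objective and default settings, not `poli-baselines`' exact implementation (which, per the text above, switches between grid search and `botorch`'s utilities depending on dimension):

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy data: 10 observations of a quadratic on [0, 1]^2.
train_x = torch.rand(10, 2, dtype=torch.float64)
train_y = -((train_x - 0.5) ** 2).sum(dim=-1, keepdim=True)

# Surrogate model: botorch's default single-task Gaussian Process.
gp = SingleTaskGP(train_x, train_y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))

# Acquisition function built from the GP's predictions and uncertainties.
acq = ExpectedImprovement(gp, best_f=train_y.max())

# Optimizing the acquisition function proposes the next point to evaluate.
bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.float64)
candidate, _ = optimize_acqf(acq, bounds=bounds, q=1, num_restarts=8, raw_samples=64)
```

(`fit_gpytorch_mll` is the fitting entry point in recent `botorch` releases; older versions expose `fit_gpytorch_model` instead.)
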
docs/protein-optimization/using_poli_baselines/cma_es.md

Lines changed: 42 additions & 1 deletion
@@ -1,3 +1,44 @@
# CMA-ES

- [TODO: write]
+ ![Type of optimizer algorithm: continuous inputs](https://img.shields.io/badge/Type-continuous_inputs-cyan)
+
+ ## About
+
+ Covariance Matrix Adaptation Evolution Strategy (CMA-ES) maintains the mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$ of a Normal distribution, updating them at each iteration using a subset of the best-performing population members {cite:p}`Hansen:CMA-ES:1996`.
+
+ For an introduction to evolution strategies, we recommend [this blogpost by David Ha](https://blog.otoro.net/2017/10/29/visual-evolution-strategies/).
+
+ In our implementation, we use `pycma`.
+
+ ## How to run
+
+ ```python
+ import numpy as np
+
+ from poli_baselines.solvers import CMA_ES
+ from poli.objective_repository import ToyContinuousProblemFactory
+
+ n_dimensions = 3
+ population_size = 10
+
+ problem_factory = ToyContinuousProblemFactory()
+
+ f, _, _ = problem_factory.create(
+     name="toy_continuous_problem",
+     function_name="ackley_function_01",
+     n_dimensions=n_dimensions,
+ )
+
+ # An initial population and its evaluations.
+ x0 = np.random.normal(size=(population_size, n_dimensions))
+ y0 = f(x0)
+
+ # Start the search from a random mean with unit step size.
+ initial_mean = np.random.normal(size=n_dimensions)
+ solver = CMA_ES(
+     black_box=f,
+     x0=x0,
+     y0=y0,
+     initial_mean=initial_mean,
+     initial_sigma=1.0,
+     population_size=population_size,
+ )
+
+ solver.solve(max_iter=50)
+ ```
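
Note: since the page states the implementation uses `pycma`, here is the underlying ask/tell loop sketched with `pycma` itself. It illustrates the mechanism from the About section (sample a population from the current Normal distribution, evaluate it, update the mean and covariance from the best performers) and is only a sketch, not the `CMA_ES` solver above. Keep in mind that `pycma` minimizes by default, so objectives meant for maximization need a sign flip:

```python
import numpy as np
import cma

def sphere(x: np.ndarray) -> float:
    # Stand-in objective; pycma minimizes by default.
    return float(np.sum(x**2))

# Random initial mean and unit step size, mirroring the example above.
es = cma.CMAEvolutionStrategy(
    x0=np.random.normal(size=3),
    sigma0=1.0,
    inopts={"popsize": 10, "maxiter": 50},
)

while not es.stop():
    population = es.ask()                        # sample from N(mean, sigma^2 * C)
    fitnesses = [sphere(x) for x in population]  # evaluate the population
    es.tell(population, fitnesses)               # update mean, step size, covariance

print(es.result.xbest, es.result.fbest)
```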

docs/protein-optimization/using_poli_baselines/line_bayesian_optimization.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@

Line Bayesian Optimization (LineBO) is a version of [Bayesian Optimization](./bayesian_optimization.md) that restricts the optimization of the acquisition function to a single line in input space {cite:p}`Kirschner:LineBO:2019`. This line can either be selected at random, or can follow one of the coordinate directions.

- By default, we use `botorch`'s `SingleTaskGP` {cite+p}`Balandat:botorch:2020`.
+ By default, we use `botorch`'s `SingleTaskGP` {cite:p}`Balandat:botorch:2020`.

## How to run
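Note: the line restriction that LineBO applies can be sketched in a few lines of `numpy`. This is illustrative only; the actual solver optimizes a `botorch` acquisition function rather than the stand-in below:

```python
import numpy as np

rng = np.random.default_rng(0)

def alpha(x: np.ndarray) -> float:
    # Stand-in acquisition function; in practice this comes from the GP.
    return -float(np.sum((x - 0.3) ** 2))

x_incumbent = np.zeros(4)

# A random line through the incumbent: x(t) = x_incumbent + t * d.
d = rng.normal(size=x_incumbent.shape)
d /= np.linalg.norm(d)
# (The coordinate variant instead takes d to be a standard basis vector.)

# Optimizing the acquisition is now a one-dimensional problem in t.
ts = np.linspace(-1.0, 1.0, 201)
best_t = max(ts, key=lambda t: alpha(x_incumbent + t * d))
x_next = x_incumbent + best_t * d
```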