Commit d28b728

Adds an intro to poli, cleans the registration chapter
1 parent 6361948 commit d28b728

7 files changed: +133 -15 lines changed

docs/protein-optimization/_toc.yml

Lines changed: 5 additions & 4 deletions
@@ -7,10 +7,11 @@ parts:
77
- caption: Using poli - the basics
88
numbered: true
99
chapters:
10-
- file: using_poli/desired_design_patterns/intro_to_poli.md
11-
- file: using_poli/desired_design_patterns/registering_an_objective_function.md
12-
- file: using_poli/desired_design_patterns/defining_a_problem_solver.md
13-
- file: using_poli/desired_design_patterns/optimizing_an_objective_function.md
10+
- file: using_poli/the_basics/intro_to_poli.md
11+
- file: using_poli/the_basics/registering_an_objective_function.md
12+
- file: using_poli/the_basics/defining_a_problem_solver.md
13+
- file: using_poli/the_basics/optimizing_an_objective_function.md
14+
- file: using_poli/the_basics/diving_deeper.md
1415
- caption: Using poli - examples
1516
numbered: true
1617
chapters:

docs/protein-optimization/understanding_foldx/01-single-mutation-using-foldx/tmp/Raw_101m_Repair.fxout

Lines changed: 2 additions & 0 deletions
@@ -9,3 +9,5 @@ Output type: BuildModel
99
Pdb total energy Backbone Hbond Sidechain Hbond Van der Waals Electrostatics Solvation Polar Solvation Hydrophobic Van der Waals clashes entropy sidechain entropy mainchain sloop_entropy mloop_entropy cis_bond torsional clash backbone clash helix dipole water bridge disulfide electrostatic kon partial covalent bonds energy Ionisation Entropy Complex
1010
101m_Repair_1.pdb -31.7457 -141.841 -48.2413 -177.827 -8.5183 243.998 -235.896 3.3294 104.051 231.196 0 0 0 5.25497 157.841 -8.81857 0 0 0 0 1.5666 0
1111
WT_101m_Repair_1.pdb -34.3436 -141.831 -47.9784 -179.662 -8.13848 243.99 -239.232 3.40664 105.266 231.722 0 0 0 5.28162 157.882 -8.73035 0 0 0 0 1.56224 0
12+
101m_Repair_1.pdb -31.7457 -141.841 -48.2413 -177.827 -8.5183 243.998 -235.896 3.3294 104.051 231.196 0 0 0 5.25497 157.841 -8.81857 0 0 0 0 1.5666 0
13+
WT_101m_Repair_1.pdb -34.3436 -141.831 -47.9784 -179.662 -8.13848 243.99 -239.232 3.40664 105.266 231.722 0 0 0 5.28162 157.882 -8.73035 0 0 0 0 1.56224 0

docs/protein-optimization/using_poli/the_basics/defining_a_problem_solver.md

Lines changed: 3 additions & 1 deletion
@@ -114,4 +114,6 @@ class RandomMutation(AbstractSolver):
114114
return next_x
115115
```
116116

117-
Pretty lean! Notice how **the `next_candidate` method could perform all sorts of complicated logic** like latent space Bayesian Optimization, evolutionary algorithms...
117+
Pretty lean! Notice how **the `next_candidate` method could perform all sorts of complicated logic**, like latent space Bayesian Optimization or evolutionary algorithms. Moreover, the conda environment where you run the optimization has nothing to do with the environment where the objective function was defined: `poli` is set up so that you can query objective functions without worrying about their dependencies.
118+
119+
In the next chapter, we apply this solver to the `aloha` problem we defined in [the first chapter](./registering_an_objective_function.md).
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
# Diving deeper: how does poli work under the hood?
2+
3+
```{contents}
4+
```
5+
6+
TODO: write.
Lines changed: 6 additions & 1 deletion
@@ -1,4 +1,9 @@
11
# What is poli?
22

3-
[TODO: write an intro to the library, and discuss problems, how to register them, and how to call them.]
3+
`poli` is a library for registering black box optimization functions, with a special focus on *discrete* sequence optimization. It stands for *Protein Optimization Library*, since some of the work done on drug design is done through representing proteins as discrete sequences, or sentences of amino acids.
44

5+
We also build `poli-baselines` on top, allowing you to define black box optimization algorithms for discrete sequences.
6+
7+
These next chapters detail a basic example of how to use `poli` and `poli-baselines`. Continue to [the next chapter](./registering_an_objective_function.md)!
8+
9+
After these, feel free to dive deeper into how `poli` works underneath in [the chapter about the details](./diving_deeper.md).
Lines changed: 97 additions & 3 deletions
@@ -1,7 +1,101 @@
11
# Optimizing an objective function
22

3-
In this chapter, we discuss how an objective function should be optimized in POLi _ideally_. The current implementation might diverge for now, but this is what we aim for.
3+
```{contents}
4+
```
45

5-
## Getting a list of registered objective functions
6+
In this chapter, we combine what we have discussed in the previous two chapters to optimize a black box objective function using `poli` and its baselines. In particular, we'll solve the `aloha` problem using discrete random mutations.
67

7-
[TODO: finish this]
8+
> This tutorial follows `optimizing_aloha.py` in `poli-baselines/examples`.
9+
10+
## Prerequisites
11+
12+
Before running this, you need to make sure you have
13+
14+
- both `poli` and `poli_baselines` installed.
15+
- run [the first chapter on registering black box functions](./registering_an_objective_function.md).
16+
- read [the second chapter on implementing solvers](./defining_a_problem_solver.md)
17+
18+
Once you have gone through these, the `aloha` problem should be registered.
19+
20+
## Is aloha registered?
21+
22+
We can start by checking that the `aloha` problem is indeed among the registered objectives:
23+
24+
```python
25+
# optimizing_aloha.py
26+
from poli.core.registry import get_problems
27+
28+
if __name__ == "__main__":
29+
    assert "aloha" in get_problems()
30+
```
31+
32+
This script should run without raising any errors.
33+
34+
:::{admonition} Is aloha not registered?
35+
:class: dropdown
36+
37+
If the previous snippet fails with an `AssertionError`, you most likely haven't registered `aloha` as a problem yet. Check [the first chapter for the process of registering this problem](./registering_an_objective_function.md).
38+
39+
:::
40+
41+
## Instantiating the problem and solver
42+
43+
Since the problem is registered, optimizing it is really easy!
44+
45+
```python
46+
# optimizing_aloha.py
47+
from poli import objective_factory
48+
from poli.core.registry import get_problems
49+
from poli_baselines.solvers.simple.random_mutation import RandomMutation
50+
51+
if __name__ == "__main__":
52+
    assert "aloha" in get_problems()
53+
54+
    # Creating an instance of the problem
55+
    problem_info, f, x0, y0, run_info = objective_factory.create(
56+
        name="aloha", caller_info=None, observer=None
57+
    )
58+
59+
    # Creating an instance of the solver
60+
    solver = RandomMutation(
61+
        black_box=f,
62+
        x0=x0,
63+
        y0=y0,
64+
        alphabet=problem_info.get_alphabet(),
65+
    )
66+
```
67+
68+
## Optimizing
69+
70+
Solvers in `poli_baselines` have a `solve` method. Its only required argument is the number of iterations we want to run the optimization for (`max_iter: int`). Other keyword arguments include e.g. `break_at_performance: float = None`, or `verbose: bool = False`.
71+
72+
Once instantiated, the solver can optimize our `aloha` problem easily:
73+
74+
```python
75+
# optimizing_aloha.py
76+
77+
...
78+
79+
if __name__ == "__main__":
80+
    ...
81+
82+
    # Running the optimization for 1000 steps,
83+
    # breaking if we reach a performance of 5.0,
84+
    # and printing a small summary at each step.
85+
    solver.solve(max_iter=1000, break_at_performance=5.0, verbose=True)
86+
    print(solver.get_best_solution())
87+
```
88+
89+
Just by running random mutations, you can usually find the "ALOHA" string in fewer than 1000 iterations.
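To build some intuition for why so few iterations are enough, here is a minimal, self-contained sketch of a random-mutation loop on an `aloha`-like objective. It does not use `poli` or `poli_baselines`: the names `aloha_score` and `random_mutation_solve` are hypothetical, and we assume (for illustration only) that the solver mutates one random position of the best word found so far and keeps the mutant only if it scores higher. The keyword arguments mirror the ones described above.

```python
import random
from string import ascii_uppercase

TARGET = "ALOHA"


def aloha_score(word):
    """Number of positions where `word` matches the target word."""
    return sum(a == b for a, b in zip(word, TARGET))


def random_mutation_solve(x0, max_iter, break_at_performance=None, verbose=False, seed=0):
    """Hypothetical stand-in for a solver's `solve` method (not poli_baselines' code)."""
    rng = random.Random(seed)
    best_x, best_y = list(x0), aloha_score(x0)
    for step in range(max_iter):
        # Stop early once the requested performance has been reached.
        if break_at_performance is not None and best_y >= break_at_performance:
            break
        # Mutate a single random position of the best word so far.
        candidate = list(best_x)
        position = rng.randrange(len(candidate))
        candidate[position] = rng.choice(ascii_uppercase)
        y = aloha_score(candidate)
        if verbose:
            print(step, "".join(candidate), y)
        # Keep the mutant only if it is strictly better.
        if y > best_y:
            best_x, best_y = candidate, y
    return "".join(best_x), best_y


best_word, best_score = random_mutation_solve(
    "ZZZZZ", max_iter=1000, break_at_performance=5.0
)
print(best_word, best_score)
```

In this sketch, a mutation improves the score only when it hits a wrong position (out of 5) with the right letter (out of 26), so the search is slow at the very end but still short on average for a 5-letter word.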
90+
91+
## Conclusion
92+
93+
In this tutorial we used `RandomMutation` to optimize a toy example: the `aloha` problem described in [the first chapter](./registering_an_objective_function.md).
94+
95+
This concludes the "Getting Started" section of this tutorial. The key takeaways are these:
96+
97+
1. With `poli`, you can register black box objective functions which, when instantiated, will run on an independent process in a custom environment you specify at registration.
98+
2. `poli_baselines` allows you to define black box optimization algorithms that operate well with `poli`'s registered problems.
99+
3. `poli_baselines` also comes with several solvers, including `RandomMutation`.
100+
101+
The next chapter discusses a more advanced set-up: registering a black-box objective function with Java dependencies, as well as `torch`, and loading up certain autoencoders.

docs/protein-optimization/using_poli/the_basics/registering_an_objective_function.md

Lines changed: 14 additions & 6 deletions
@@ -37,7 +37,7 @@ class AlohaBlackBox(AbstractBlackBox):
3737
super().__init__(L=L)
3838

3939
# The only method you have to define
40-
def _black_box(self, x: np.ndarray) -> np.ndarray:
40+
def _black_box(self, x: np.ndarray, context: dict = None) -> np.ndarray:
4141
"""
4242
A function that takes 5-letter words
4343
in numpy arrays (as letters, not as
@@ -48,7 +48,7 @@ class AlohaBlackBox(AbstractBlackBox):
4848
return np.sum(matches, axis=1, keepdims=True)
4949
```
5050

51-
As the code says, the only method you need to define is `_black_box`, returning a numpy array of size `[1, 1]`. `AbstractBlackBox` takes it from there, making sure that the length of the inputs is correct and matches `L`. You can opt-out of length-checking by saying `L=np.inf` in the `__init__`.[^details-on-black-box]
51+
As the code says, the only method you need to define is `_black_box(x: np.ndarray, context: dict = None)`, returning a numpy array of size `[1, 1]`. `AbstractBlackBox` takes it from there, making sure that the length of the inputs is correct and matches `L`. You can opt out of length-checking by passing `L=np.inf` in the `__init__`.[^details-on-black-box]
5252

5353
[^details-on-black-box]: You can check the exact implementation in [TODOADD]().
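To make the division of labor concrete, here is a rough sketch of the pattern described above: a base class that checks input lengths and then delegates to `_black_box`. This is not `poli`'s implementation. The classes `SketchBlackBox` and `SketchAloha` are hypothetical, and we use plain Python lists and `math.inf` instead of numpy so that the snippet runs with the standard library alone.

```python
import math
from typing import List, Sequence


class SketchBlackBox:
    """Hypothetical length-checking base class (not poli's AbstractBlackBox)."""

    def __init__(self, L: float = math.inf):
        self.L = L

    def __call__(self, x: Sequence[Sequence[str]], context: dict = None) -> List[List[int]]:
        # Reject sequences whose length differs from L, unless L is infinite.
        if self.L != math.inf:
            for sequence in x:
                if len(sequence) != self.L:
                    raise ValueError(
                        f"Expected sequences of length {self.L}, got {len(sequence)}."
                    )
        return self._black_box(x, context=context)

    def _black_box(self, x, context=None):
        raise NotImplementedError


class SketchAloha(SketchBlackBox):
    def __init__(self):
        super().__init__(L=5)

    # The only method the subclass has to define.
    def _black_box(self, x, context=None):
        # One row per input word, one column: the number of matches with "ALOHA".
        return [[sum(a == b for a, b in zip(word, "ALOHA"))] for word in x]


f = SketchAloha()
scores = f([list("ALOOF"), list("ALOHA")])
print(scores)
```

Keeping the checks in the base class means that every objective gets the same input validation without repeating it in each `_black_box`.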
5454

@@ -65,6 +65,7 @@ Let's build a problem factory for the `AlohaBlackBox`:
6565
```python
6666
# registering_aloha.py
6767
from typing import Tuple
68+
from string import ascii_uppercase
6869

6970
import numpy as np
7071

@@ -113,7 +114,7 @@ class AlohaProblemFactory(AbstractProblemFactory):
113114

114115
The first step is always **creating a conda environment for your problem**. In this case, we could make do with just the base environment. However, for completeness, we will create a conda environment called `poli_aloha`. This is the environment description (which can be found under `environment.yml` in the examples folder for `aloha`):
115116

116-
TODO: remove the dependency on click, and move the dependency to our github after merging.
117+
TODO: move the dependency to our github after merging.
117118

118119
```yml
119120
# environment.yml
@@ -125,7 +126,6 @@ dependencies:
125126
- pip
126127
- pip:
127128
- numpy
128-
- click
129129
- "git+https://github.com/miguelgondu/poli.git"
130130
```
131131
@@ -176,6 +176,14 @@ if __name__ == "__main__":
176176

177177
:::
178178

179+
:::{admonition} Where is this problem registered?
180+
181+
`poli` registers this objective as a shell script (a `.sh` file) inside `~/.poli_objectives`.
182+
183+
As you can check, this script runs `poli/objective.py` inside the conda environment you specified on registration. `objective.py` is the main workhorse of `poli`: it starts a process in which the objective function waits for next inputs.
184+
185+
:::
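To illustrate the idea of an objective function that lives in its own process and waits for inputs, here is a toy sketch using only the standard library. It is not how `poli` communicates internally: the class `ExternalObjective`, the `WORKER_SOURCE` script, and the line-based protocol over stdin/stdout are all assumptions made for this example. We also launch the worker with the current interpreter, whereas `poli` runs its script inside the conda environment specified at registration.

```python
import subprocess
import sys

# Source of a tiny worker process: it waits for one word per line on stdin
# and answers with the number of letters matching "ALOHA".
WORKER_SOURCE = """
import sys
for line in iter(sys.stdin.readline, ""):
    word = line.strip()
    score = sum(a == b for a, b in zip(word, "ALOHA"))
    print(score, flush=True)
"""


class ExternalObjective:
    """Hypothetical wrapper that queries an objective running in another process."""

    def __init__(self, python_executable=sys.executable):
        # A different interpreter (e.g. from another environment) could be passed here.
        self.process = subprocess.Popen(
            [python_executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def __call__(self, word):
        # Send one input and block until the worker answers.
        self.process.stdin.write(word + "\n")
        self.process.stdin.flush()
        return int(self.process.stdout.readline())

    def close(self):
        # Closing stdin ends the worker's loop, so the process can exit.
        self.process.stdin.close()
        self.process.wait()
        self.process.stdout.close()


f = ExternalObjective()
score_aloof = f("ALOOF")
score_aloha = f("ALOHA")
f.close()
print(score_aloof, score_aloha)
```

The caller only needs to know how to send inputs and read scores, which is why the querying side does not need the objective's dependencies installed.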
186+
179187
## Calling the registered problem
180188

181189
To check that we can indeed call the problem from somewhere else, let's write a second file called `querying_aloha.py` where we instantiate the objective function and query it. We emphasize that this second file can run on **any other conda environment** (as long as you have `poli` installed, and the problem registered).
@@ -197,7 +205,7 @@ If you are running this script from an environment that has `poli`, and that has
197205
True
198206
```
199207

200-
Amazing. Let's remove this print. Now we can create an instance of the problem, making absolutely sure the problem is registered:
208+
Amazing. Let's remove this print. Now we can create an instance of the problem:
201209

202210
```python
203211
# querying_aloha.py
@@ -236,4 +244,4 @@ In this tutorial you
236244
237245
This is a trivial example, since the only dependency is numpy. In other examples you will see problems with more subtle dependencies (e.g. Java runtimes, torch, cheminformatics tools like `FoldX`, `RDKit`, or the therapeutics data commons...).
238246
239-
A good next step: checking how to define problem solvers (i.e. black box optimization algorithms).
247+
In the next chapter, we will define a simple "Problem Solver" (i.e. a black box optimization algorithm), and in the one after that we will apply it to solve this `aloha` problem.
