
Commit 6dd95e3

Adds a chapter on how to add objective functions to the repository
1 parent 28a941f commit 6dd95e3

4 files changed

+231
-1
lines changed


docs/protein-optimization/_toc.yml

Lines changed: 4 additions & 0 deletions
@@ -23,6 +23,10 @@ parts:
2323
- file: using_poli/objective_repository/foldx_stability.md
2424
- file: using_poli/objective_repository/super_mario_bros.md
2525
- file: using_poli/objective_repository/small_molecule.md
26+
- caption: "Contributing"
27+
chapters:
28+
- file: contributing/a_new_problem.md
29+
- file: contributing/a_new_solver.md
2630
- caption: "Appendix: Understanding FoldX"
2731
chapters:
2832
- file: understanding_foldx/00-installing-foldx.md
Lines changed: 201 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,201 @@
1+
# Contributing a new problem to the repository
2+
3+
This tutorial covers how problems are structured in the repository, and what it takes to add a new one.
4+
5+
## The structure of a problem
6+
7+
If you take a look at the source code of `poli`, you will find a folder called `objective_repository`. This is where all the objective functions in the repository live. A generic problem is structured as follows:
8+
9+
```bash
poli/objective_repository
├── problem_name         # Has the name of the registered problem (exactly)
│   ├── environment.yml  # The env. where ./register.py and the problem will run
│   └── register.py      # Definition and registration of the problem
```
15+
16+
You can also have as many other files as you want. Think of the folder `.../problem_name` as a small project in itself: it can contain its own Python modules, which will get appended to the path at runtime.
17+
18+
**For example**, let's take a look at the problem folder of `super_mario_bros`:
19+
20+
```bash
├── super_mario_bros
│   ├── environment.yml
│   ├── register.py
│   ├── example.pt      # <--
│   ├── level_utils.py  # <--
│   ├── model.py        # <-- extra files needed
│   ├── readme.md       # <-- for the simulation
│   ├── simulator.jar   # <-- to run.
│   └── simulator.py    # <--
```
31+
32+
:::{warning}
33+
As a general rule: **don't assume that your files will be there after `pip install git+...`**. Files with extensions other than `.py` and `.yml` will be ignored by `pip` during installation. An alternative, then, is to download them in your `register.py`.
34+
:::
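
The download alternative mentioned in the warning could look roughly like the sketch below. It is not taken from `poli`: the helper name `ensure_asset`, its `download` parameter, and the URL in the usage comment are all hypothetical, and only the standard library is used.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlretrieve


def ensure_asset(
    path: Path,
    url: str,
    download: Callable[[str, Path], object] = urlretrieve,
) -> Path:
    """Return `path`, downloading it from `url` first if it is missing.

    `download` is injectable so the helper can be tested without a network.
    """
    if not path.exists():
        # Create the parent folder in case pip did not ship it at all.
        path.parent.mkdir(parents=True, exist_ok=True)
        download(url, path)
    return path


# Hypothetical usage inside register.py (the URL is a placeholder):
# THIS_DIR = Path(__file__).parent.resolve()
# jar = ensure_asset(THIS_DIR / "simulator.jar", "https://example.com/simulator.jar")
```

Because the file is only fetched when it is absent, a local checkout that already contains the asset never touches the network.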
35+
36+
## A generic `register.py`
37+
38+
A typical `register.py` has the following structure:
39+
40+
```python
# An average register.py
from typing import Tuple

import numpy as np

from poli.core.abstract_black_box import AbstractBlackBox
from poli.core.abstract_problem_factory import AbstractProblemFactory
from poli.core.problem_setup_information import ProblemSetupInformation


class YourBlackBox(AbstractBlackBox):
    def __init__(self, L: int = np.inf):
        super().__init__(L=L)

    # The only method you have to define
    def _black_box(self, x: np.ndarray, context: dict = None) -> np.ndarray:
        return ...


class YourProblemFactory(AbstractProblemFactory):
    def get_setup_information(self) -> ProblemSetupInformation:
        # The tokens of your alphabet
        alphabet_symbols = [...]

        # The encoding
        alphabet = {symbol: i for i, symbol in enumerate(alphabet_symbols)}

        # A description of the problem
        # See more in the chapter about defining
        # problem factories.
        return ProblemSetupInformation(
            name="your_problem",  # HAS to be the same name as the parent folder.
            max_sequence_length=...,
            aligned=...,
            alphabet=alphabet,
        )

    def create(
        self, seed: int = 0, keyword_1=..., keyword_2=...
    ) -> Tuple[AbstractBlackBox, np.ndarray, np.ndarray]:
        # Manipulate keywords you might need at creation time...
        ...

        # The maximum length you defined above
        L = self.get_setup_information().get_max_sequence_length()

        # Creating your black box function
        f = YourBlackBox(L=L)

        # Your first input (an np.array)
        x0 = ...

        return f, x0, f(x0)


if __name__ == "__main__":
    from poli.core.registry import register_problem

    # Once we have created a simple conda environment
    # (see the environment.yml file in this folder),
    # we can register our problem s.t. it uses
    # said conda environment.
    your_problem_factory = YourProblemFactory()
    register_problem(
        your_problem_factory,
        conda_environment_name="your_env",  # This is the env specified
                                            # by your environment.yml
    )
```
109+
110+
That is, **the script creates and registers** your problem factory.
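
To make the encoding step in `get_setup_information` concrete, here is a small standalone sketch that does not depend on `poli`. The four-symbol alphabet and the `encode` helper are hypothetical and only illustrate what the dictionary comprehension produces.

```python
from typing import Dict, List

# A hypothetical four-symbol alphabet, standing in for `alphabet_symbols = [...]`.
alphabet_symbols = ["A", "C", "G", "T"]

# Same comprehension as in the register.py template: symbol -> integer index.
alphabet: Dict[str, int] = {symbol: i for i, symbol in enumerate(alphabet_symbols)}


def encode(sequence: str, alphabet: Dict[str, int]) -> List[int]:
    """Map each symbol of `sequence` to its integer index (illustrative helper)."""
    return [alphabet[symbol] for symbol in sequence]


print(alphabet)                     # {'A': 0, 'C': 1, 'G': 2, 'T': 3}
print(encode("GATTACA", alphabet))  # [2, 0, 3, 3, 0, 1, 0]
```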
111+
112+
:::{warning}
113+
The name of your problem must match the name of the folder that contains it, exactly. (We advise using `snake_case`.)
114+
:::
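
One way to catch a mismatch early is a small check like the one sketched below. The helper `folder_matches_problem_name` is hypothetical (it is not part of `poli`); it only compares the parent folder of a `register.py` path against the name you pass to `ProblemSetupInformation`.

```python
from pathlib import Path


def folder_matches_problem_name(register_file: str, problem_name: str) -> bool:
    """Check that register.py sits in a folder named exactly like the problem."""
    # resolve() turns a relative path such as "register.py" into an absolute one,
    # so that the parent folder's name is always available.
    return Path(register_file).resolve().parent.name == problem_name


# Hypothetical usage at the top of your register.py:
# assert folder_matches_problem_name(__file__, "your_problem")
```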
115+
116+
## A generic `environment.yml`
117+
118+
You will usually develop your black-box objective function inside an environment, say `your_env`. You need to specify all of its requirements in `environment.yml`. Generically:
119+
120+
```yml
name: your_env
channels:
  - defaults
dependencies:
  - python=3.9
  - pip
  - pip:
    - numpy
    - "git+https://github.com/miguelgondu/poli.git"
    - YOUR OTHER DEPENDENCIES
```
132+
133+
This environment will be created (if it doesn't exist yet), and will be used to run `register.py`.
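
If you want to check by hand whether the environment already exists, something like the following sketch could be used. It is an assumption about how one might do this, not a description of what `poli` does internally: it shells out to `conda env list --json` and reads the `envs` list of environment paths. The function names are hypothetical.

```python
import json
import subprocess
from pathlib import Path
from typing import List


def env_names_from_json(conda_json: str) -> List[str]:
    """Extract environment names from the output of `conda env list --json`.

    Each entry in "envs" is a path; its last component is the environment name
    (for the base environment this is the install folder, not "base").
    """
    return [Path(env_path).name for env_path in json.loads(conda_json)["envs"]]


def conda_env_exists(name: str) -> bool:
    """Ask conda for its environments and look for `name` among them."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return name in env_names_from_json(result.stdout)
```

If the environment turns out to be missing, it can be created manually with `conda env create -f environment.yml`.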
134+
135+
:::{admonition} Why `conda`?
136+
Conda environments can hold more than Python packages. For example, the `super_mario_bros` environment contains a Java runtime. This is the `environment.yml` for said problem:
137+
138+
```yml
name: poli__mario
channels:
  - defaults
  - conda-forge
  - pytorch
dependencies:
  - python=3.9
  - conda-forge::openjdk
  - cpuonly
  - pytorch
  - pip
  - pip:
    - numpy
    - click
    - "git+https://github.com/miguelgondu/poli.git"
```
156+
157+
It installs an `openjdk` that will be added to the path when the environment is active. Moreover, with some workarounds, it is also possible to install conda and create conda environments inside Colab.
158+
159+
:::
160+
161+
## Testing your installation
162+
163+
If you
164+
1. have put your new problem inside `poli/objective_repository`,
165+
2. have a `register.py` that creates and registers your problem factory,
166+
3. have an `environment.yml` that describes the environment you use,
167+
168+
then you should be set!
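
The file-related items of this checklist can be verified with a few lines of Python. This is a minimal sketch under the assumption that `register.py` and `environment.yml` are the only required files, as described above; `REQUIRED_FILES` and `missing_files` are hypothetical names, not part of `poli`.

```python
from pathlib import Path
from typing import List

# The two files every problem folder is expected to contain.
REQUIRED_FILES = ("register.py", "environment.yml")


def missing_files(problem_dir: Path) -> List[str]:
    """List the required files that are absent from `problem_dir`."""
    return [name for name in REQUIRED_FILES if not (problem_dir / name).is_file()]


# Hypothetical usage, run from the root of the poli source tree:
# print(missing_files(Path("poli/objective_repository/your_problem")))
```

An empty list means both files are in place. Whether `register.py` actually registers the factory still has to be checked by running it.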
169+
170+
You can test that your problem can be registered by creating a fresh environment that includes `poli`, and running:
171+
172+
```bash
$ python -c "from poli.core.registry import get_problems; print(get_problems())"
[...]  # A list, without your problem in it.
```
176+
177+
Your problem is not registered yet, so don't fret. You can check _whether_ it can be registered by running:
178+
179+
```bash
$ python -c "from poli.objective_repository import AVAILABLE_OBJECTIVES; print(AVAILABLE_OBJECTIVES)"
[..., "your_problem", ...]  # If all goes well, you should see "your_problem" here.
```
183+
184+
If you can find your problem in this list, then you're set! You should be able to run:
185+
186+
```python
from poli import objective_factory

problem_info, f, x0, y0, _ = objective_factory.create(
    name="your_problem",
    # ...any other arguments...
    keyword_1=...,  # <-- Keywords you (maybe) needed
    keyword_2=...,  # <-- at your_factory.create(...)
)
```
196+
197+
`poli` will ask you to confirm that you want to register your problem (you can force the registration by passing `force_register=True` to `objective_factory.create`).
198+
199+
## Submitting a pull request
200+
201+
If you want to share your problem with us, feel free to create a pull request in our repository: https://github.com/MachineLearningLifeScience/poli
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
# Contributing a new black box optimization algorithm
2+
3+
[TODO: write] For now, check [the chapter on creating solvers](../using_poli/the_basics/defining_a_problem_solver.md).

docs/protein-optimization/index.md

Lines changed: 23 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -103,4 +103,26 @@ A good place to start is the next chapter! [Go to Getting Started](./getting_sta
103103
<!-- ## Contents of the book
104104
105105
```{tableofcontents}
106-
``` -->
106+
``` -->
107+
108+
## Contribute problems or solvers
109+
110+
The following guides explain how to contribute a new problem factory (i.e. a black-box objective function) or a new optimization algorithm.
111+
112+
::::{grid}
113+
:gutter: 3
114+
115+
:::{grid-item-card} Contribute a new problem
116+
:link: ./contributing/a_new_problem.html
117+
:columns: 6
118+
A guide to contributing a new problem to the repository.
119+
:::
120+
121+
:::{grid-item-card} Contribute a new solver
122+
:link: ./contributing/a_new_solver.html
123+
:columns: 6
124+
How to contribute a new black-box optimization algorithm.
125+
:::
126+
127+
128+
::::
