Commit 2714dc7

committed
Update docs to current state
1 parent 5d5c5a7 commit 2714dc7

6 files changed

+91
-31
lines changed

docs/protein-optimization/contributing/a_new_problem.md

Lines changed: 60 additions & 8 deletions
@@ -14,7 +14,7 @@ poli/objective_repository
1414
│   └── register.py # Definition and registration of the problem
1515
```
1616

17-
You can also have as many other files as you want. Think of the folder `.../problem_name` as a small project as of itself: you can have Python dependencies that will get appended to the path at runtime.
17+
You can also have as many other files as you want. Think of the folder `.../problem_name` as a small project in and of itself: you can use any internal code you write here, since it'll be carried with `poli` at installation time.
1818

1919
**For example:** let's take a look at the problem folder of `super_mario_bros`
2020

@@ -26,7 +26,7 @@ You can also have as many other files as you want. Think of the folder `.../prob
2626
│   ├── example.pt # < --
2727
│   ├── level_utils.py # < --
2828
│   ├── model.py # < -- extra files needed
29-
│   ├── readme.md # < -- for the simulation
29+
│   ├── readme.md # < -- for the black box
3030
│   ├── simulator.jar # < -- to run.
3131
│   └── simulator.py # < --
3232
```
@@ -41,7 +41,7 @@ The average `register.py` has the following structure
4141

4242
```python
4343
# An average register.py
44-
from typing import Tuple
44+
from typing import Tuple, List
4545

4646
import numpy as np
4747

@@ -52,12 +52,25 @@ from poli.core.problem_setup_information import ProblemSetupInformation
5252
# Files that are in the same folder as
5353
# register will get added to the
5454
# PYTHONPATH at runtime.
55+
# Imagining you have your_local_dependency.py
56+
# in the same folder as register.py...
5557
from your_local_dependency import ...
5658

5759

5860
class YourBlackBox(AbstractBlackBox):
59-
def __init__(self, info: ProblemSetupInformation, batch_size: int = None):
60-
super().__init__(info=info, batch_size=batch_size)
61+
def __init__(
62+
self,
63+
info: ProblemSetupInformation,
64+
batch_size: int = None,
65+
parallelize: bool = False,
66+
num_workers: int = None
67+
):
68+
super().__init__(
69+
info=info,
70+
batch_size=batch_size,
71+
parallelize=parallelize,
72+
num_workers=num_workers,
73+
)
6174

6275
# The only method you have to define
6376
def _black_box(self, x: np.ndarray, context: dict = None) -> np.ndarray:
@@ -82,17 +95,26 @@ class YourProblemFactory(AbstractProblemFactory):
8295
def create(
8396
self,
8497
seed: int = None,
98+
batch_size: int = None,
99+
parallelize: bool = False,
100+
num_workers: int = None,
85101
your_keyword_1: str = ...,
86-
your_keyword_2: str = ...,
102+
your_keyword_2: int = ...,
103+
your_keyword_3: List[float] = ...,
87104
) -> Tuple[AbstractBlackBox, np.ndarray, np.ndarray]:
88105
# Manipulate keywords you might need at creation time...
89106
...
90107

91-
# The maximum length you defined above
108+
# Getting the problem information
92109
problem_info = self.get_setup_information()
93110

94111
# Creating your black box function
95-
f = YourBlackBox(info=problem_info)
112+
f = YourBlackBox(
113+
info=problem_info,
114+
batch_size=batch_size,
115+
parallelize=parallelize,
116+
num_workers=num_workers,
117+
)
96118

97119
# Your first input (an np.array[str])
98120
x0 = ...
@@ -122,6 +144,14 @@ That is, **the script creates and registers** your problem factory.
122144
It is important that the name of your problem is **exactly** the name of the folder that contains it. (We advise using `snake_case`.)
123145
:::
124146

147+
:::{warning}
148+
149+
`poli` is experimental. The input kwargs to the abstract black box
150+
and to the create method are under active development. Your IDE should
151+
show you the current signatures automatically, though!
152+
153+
:::
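
Since the keyword arguments are still changing, one defensive pattern is to let `create` accept `**kwargs`, so that arguments it does not know about are ignored instead of raising a `TypeError`. The following is a minimal sketch of that idea only: `ToyBlackBox` and `ToyProblemFactory` are hypothetical stand-ins written for illustration, not `poli`'s `AbstractBlackBox` or `AbstractProblemFactory`.

```python
from typing import Tuple

import numpy as np


class ToyBlackBox:
    """Hypothetical stand-in for a black box (not poli's AbstractBlackBox)."""

    def __init__(self, batch_size: int = None):
        self.batch_size = batch_size

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # Score each row by its length, returned as a column vector.
        return np.array([[float(len(row))] for row in x])


class ToyProblemFactory:
    """Hypothetical stand-in for a problem factory."""

    def create(
        self, seed: int = None, batch_size: int = None, **kwargs
    ) -> Tuple[ToyBlackBox, np.ndarray, np.ndarray]:
        # Keyword arguments this factory does not know about end up
        # in kwargs instead of raising a TypeError.
        f = ToyBlackBox(batch_size=batch_size)
        x0 = np.array([["A", "L", "O", "H", "A"]])
        y0 = f(x0)
        return f, x0, y0


# The caller passes arguments this factory never declared; create still works.
f, x0, y0 = ToyProblemFactory().create(seed=0, parallelize=True, num_workers=2)
```

The trade-off is that a misspelled keyword is silently swallowed too, so explicit arguments remain preferable once the signatures settle.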
154+
125155
## A generic `environment.yml`
126156

127157
You will usually develop your black-box objective function inside an environment, say `your_env`. You need to specify all these requirements in the `environment.yml`, generically:
@@ -209,6 +239,28 @@ problem_info, f, x0, y0, _ = objective_factory.create(
209239

210240
`poli` will ask you to confirm that you want to register your problem (you can force the registration by passing `force_register=True` to `objective_factory.create`).
211241

242+
## (Optional) Making your problem available if dependencies are met
243+
244+
At this point, you can run your objective function in an isolated process (which will literally import the factory and the black box function from the `register.py` you wrote). A better alternative is to get direct access to the object itself. Having access to the actual class makes your life easy, especially when it comes to using debugging tools like the ones in VSCode.
245+
246+
If you want to make your problem available if it can be imported, take a look at `src/poli/objective_repository/__init__.py`. Add a block like this one at the end of it:
247+
248+
```python
249+
#... the rest of poli/objective_repository/__init__.py
250+
251+
# A block you could add:
252+
try:
253+
from .your_problem.register import YourProblemFactory
254+
255+
AVAILABLE_PROBLEM_FACTORIES["your_problem"] = YourProblemFactory
256+
except ImportError: # Maybe you'll need to check for other errors.
257+
pass
258+
259+
```
260+
261+
`AVAILABLE_PROBLEM_FACTORIES` keeps track of all the problem factories we can import without needing to isolate the process. If a problem is in this dict, it will appear when querying `get_problems()`, and it will be passed at `objective_factory.create` time.
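
The same "register only if it imports" idea can be tried outside of `poli`. The sketch below is an illustration under stated assumptions: `AVAILABLE` and `register_if_importable` are hypothetical names, and the standard-library `json` module stands in for a problem's `register.py`.

```python
import importlib

# Hypothetical registry, playing the role of AVAILABLE_PROBLEM_FACTORIES.
AVAILABLE = {}


def register_if_importable(name: str, module_path: str, attribute: str) -> bool:
    """Add module_path.attribute to AVAILABLE if the module can be imported."""
    try:
        module = importlib.import_module(module_path)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError.
        return False
    AVAILABLE[name] = getattr(module, attribute)
    return True


# An importable module gets registered...
register_if_importable("json_decoder", "json", "JSONDecoder")
# ...while a missing one is skipped without raising.
register_if_importable("missing", "a_module_that_does_not_exist_xyz", "Anything")

print(sorted(AVAILABLE))
```

Catching only `ImportError` keeps other failures visible, such as a bug inside the imported module. Widen the `except` clause only if your dependencies are known to fail in other ways.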
262+
263+
212264
## Submitting a pull request
213265

214266
If you want to share your problem with us, feel free to create a pull request in our repository: https://github.com/MachineLearningLifeScience/poli

docs/protein-optimization/getting_started/getting_started.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,7 @@ $ conda activate poli-base
2929
$ pip install numpy
3030
```
3131

32-
Right now, we only support two ways of installing `poli`: by cloning the repo and installing, or using `pip` and `git+`. [TODO: change from my fork to MLLS repo after merging]
32+
Right now, we only support two ways of installing `poli`: by cloning the repo and installing, or using `pip` and `git+`.
3333

3434
::::{tab-set}
3535

docs/protein-optimization/using_poli/optimization_examples/protein-stability-foldx/optimizing_protein_stability.ipynb

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -141,7 +141,7 @@
141141
"\n",
142142
"In general, it is a good idea to check how to create instances of individual problems in their documentation, since they might need extra inputs.\n",
143143
"\n",
144-
"For example, [`foldx_stability` only needs one extra keyword argument: a `wildtype_pdb_path`](../../objective_repository/foldx_stability.md). `poli` will hopefully remind you what you forgot with its error messages.\n",
144+
"For example, [`foldx_stability` only needs one extra keyword argument: a single `wildtype_pdb_path` or a list of them](../../objective_repository/foldx_stability.md). `poli` will hopefully remind you what you forgot with its error messages.\n",
145145
"\n",
146146
":::"
147147
]
@@ -229,11 +229,11 @@
229229
"source": [
230230
"`objective_factory.create` returns five things:\n",
231231
"\n",
232-
"1. a `problem_info` with a description of the problem, including useful attributes like `alphabet` or `max_sequence_length`. (See more [here (TODO: ADD)]()).\n",
232+
"1. a `problem_info` with a description of the problem, including useful attributes like `alphabet` or `max_sequence_length`. (See more [here (TODO: ADD API REFERENCE)]()).\n",
233233
"2. a black-box function `f: AbstractBlackBox` from `poli`.\n",
234234
"3. an initial design `x0: np.ndarray`, and\n",
235235
"4. an initial evaluation `y0: np.ndarray`.\n",
236-
"5. `run_info`, or the output of the observer (?)."
236+
"5. `run_info`, or the output of the observer's initialization (see more [in our chapter about making observers](../../the_basics/defining_an_observer.ipynb))."
237237
]
238238
},
239239
{

docs/protein-optimization/using_poli/the_basics/defining_an_observer.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -177,7 +177,7 @@
177177
"\n",
178178
":::{warning}\n",
179179
"\n",
180-
"Remember that this is a simple example! We are essentially re-inventing the wheel. You should write more complex logic for logging, or use libraries like `tensorflow`, `mlflow` or `wandb`.\n",
180+
"Remember that this is a simple example! We are essentially re-inventing the wheel. You should write more complex logic for logging, or use libraries like `tensorboard`, `mlflow` or `wandb`.\n",
181181
"\n",
182182
":::"
183183
]

docs/protein-optimization/using_poli/the_basics/intro_to_poli.ipynb

Lines changed: 21 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -36,7 +36,7 @@
3636
"cell_type": "markdown",
3737
"metadata": {},
3838
"source": [
39-
"Black-box optimization algorithms inside `poli-baselines` are treated as **solvers** of **problems** defined using `poli`.\n",
39+
"Black-box optimization algorithms inside `poli-baselines` are treated as **solvers**, and the discrete objective functions of `poli` are described as **problems**.\n",
4040
"\n",
4141
"We propose to you the following process for using `poli-baselines`' optimizers, or developing your own:"
4242
]
@@ -66,7 +66,7 @@
6666
"name": "stdout",
6767
"output_type": "stream",
6868
"text": [
69-
"['aloha', 'white_noise']\n"
69+
"['aloha', 'gfp_select', 'white_noise']\n"
7070
]
7171
}
7272
],
@@ -91,7 +91,7 @@
9191
"name": "stdout",
9292
"output_type": "stream",
9393
"text": [
94-
"['aloha', 'foldx_rfp_lambo', 'foldx_sasa', 'foldx_stability', 'foldx_stability_and_sasa', 'rdkit_logp', 'rdkit_qed', 'super_mario_bros', 'white_noise']\n"
94+
"['aloha', 'drd3_docking', 'foldx_rfp_lambo', 'foldx_sasa', 'foldx_stability', 'foldx_stability_and_sasa', 'gfp_select', 'penalized_logp_lambo', 'rdkit_logp', 'rdkit_qed', 'sa_tdc', 'super_mario_bros', 'white_noise']\n"
9595
]
9696
}
9797
],
@@ -103,7 +103,13 @@
103103
"cell_type": "markdown",
104104
"metadata": {},
105105
"source": [
106-
"Each one of these objective functions can be run without modifying your environment, but you might need to check their prerequisites. We do our best to keep the list updated in [the introduction page](../../index.md), where you can find links to the requirements and installation descriptions for each one of these.\n",
106+
"Each one of these objective functions can be run without modifying your environment, but you might need to check their prerequisites. We do our best to keep the list updated in [the page on all objective functions](../objective_repository/all_objectives.md), where you can find links to the requirements and installation descriptions for each one of these.\n",
107+
"\n",
108+
":::{note}\n",
109+
"\n",
110+
"Some objective functions have more requirements, like installing external dependencies. Check [the page on all objective functions](../objective_repository/all_objectives.md) and click on the objective function you are interested in to get a detailed set of instructions on how to install and run it.\n",
111+
"\n",
112+
":::\n",
107113
"\n",
108114
"If the function still isn't there, **implement it yourself!** An example of how to do this can be found in `poli_baselines/examples/00_a_simple_objective_function_registration`, or in our chapter on [registering optimization functions](./registering_an_objective_function.md).\n",
109115
"\n",
@@ -112,7 +118,7 @@
112118
},
113119
{
114120
"cell_type": "code",
115-
"execution_count": 5,
121+
"execution_count": 7,
116122
"metadata": {},
117123
"outputs": [],
118124
"source": [
@@ -146,15 +152,15 @@
146152
},
147153
{
148154
"cell_type": "code",
149-
"execution_count": 6,
155+
"execution_count": 8,
150156
"metadata": {},
151157
"outputs": [
152158
{
153159
"name": "stdout",
154160
"output_type": "stream",
155161
"text": [
156162
"x0: [['1' '2' '3']]\n",
157-
"y0: [[-0.70206057]]\n"
163+
"y0: [[-0.85037523]]\n"
158164
]
159165
}
160166
],
@@ -180,16 +186,16 @@
180186
},
181187
{
182188
"cell_type": "code",
183-
"execution_count": 7,
189+
"execution_count": 9,
184190
"metadata": {},
185191
"outputs": [
186192
{
187193
"data": {
188194
"text/plain": [
189-
"array([['4', '2', '3']], dtype='<U1')"
195+
"array([['1', '7', '3']], dtype='<U1')"
190196
]
191197
},
192-
"execution_count": 7,
198+
"execution_count": 9,
193199
"metadata": {},
194200
"output_type": "execute_result"
195201
}
@@ -207,7 +213,7 @@
207213
},
208214
{
209215
"cell_type": "code",
210-
"execution_count": 11,
216+
"execution_count": 10,
211217
"metadata": {},
212218
"outputs": [
213219
{
@@ -216,7 +222,7 @@
216222
"['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']"
217223
]
218224
},
219-
"execution_count": 11,
225+
"execution_count": 10,
220226
"metadata": {},
221227
"output_type": "execute_result"
222228
}
@@ -248,7 +254,7 @@
248254
},
249255
{
250256
"cell_type": "code",
251-
"execution_count": 12,
257+
"execution_count": 11,
252258
"metadata": {},
253259
"outputs": [],
254260
"source": [
@@ -257,14 +263,14 @@
257263
},
258264
{
259265
"cell_type": "code",
260-
"execution_count": 13,
266+
"execution_count": 12,
261267
"metadata": {},
262268
"outputs": [
263269
{
264270
"name": "stdout",
265271
"output_type": "stream",
266272
"text": [
267-
"[['1' '5' '6']]\n"
273+
"[['1' '2' '5']]\n"
268274
]
269275
}
270276
],

docs/protein-optimization/using_poli/the_basics/registering_an_objective_function.md

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -80,7 +80,11 @@ class OurAlohaProblemFactory(AbstractProblemFactory):
8080
alphabet=alphabet,
8181
)
8282

83-
def create(self, seed: int = None) -> Tuple[AbstractBlackBox, np.ndarray, np.ndarray]:
83+
# Adding **kwargs is necessary, since several things usually
84+
# get passed to the create method at initialization.
85+
def create(
86+
self, seed: int = None, **kwargs
87+
) -> Tuple[AbstractBlackBox, np.ndarray, np.ndarray]:
8488
problem_info = self.get_setup_information()
8589
f = OurAlohaBlackBox(info=problem_info)
8690
x0 = np.array([["A", "L", "O", "O", "F"]])
@@ -102,8 +106,6 @@ Check the exact implementation on [`poli/objective_repository/aloha/register.py`
102106

103107
The first step is always **creating a conda environment for your problem**. In this case, we could do with just the base environment. However, for completeness, we will create a conda environment called `poli_aloha_problem`. This is the environment description (which can be found under `environment.yml` in the examples folder for `aloha`):
104108

105-
TODO: move the dependency to our github after merging.
106-
107109
```yml
108110
# environment.yml
109111
name: poli_aloha_problem
