| 1 | +# Contributing a new problem to the repository |
| 2 | + |
| 3 | +This tutorial covers how problems are structured in the repository, and what it takes to add a new one. |
| 4 | + |
| 5 | +## The structure of a problem |
| 6 | + |
| 7 | +If you take a look at the source code of `poli`, you will find a folder called `objective_repository`. This is where all the objective functions in the repository live. The structure of a generic problem is as follows: |
| 8 | + |
| 9 | +```bash |
| 10 | +poli/objective_repository |
| 11 | +├── problem_name # Has the name of the registered problem (exactly) |
| 12 | +│ ├── environment.yml # The env. where ./register.py and the problem will run |
| 13 | +│ └── register.py # Definition and registration of the problem |
| 14 | +``` |
| 15 | + |
| 16 | +You can also have as many other files as you want. Think of the folder `.../problem_name` as a small project in itself: it can contain Python dependencies, which will get appended to the path at runtime. |
| 17 | + |
| 18 | +**For example**, let's take a look at the problem folder of `super_mario_bros`: |
| 19 | + |
| 20 | +```bash |
| 21 | +├── super_mario_bros |
| 22 | +│ ├── environment.yml |
| 23 | +│ ├── register.py |
| 24 | +│ ├── example.pt # < -- |
| 25 | +│ ├── level_utils.py # < -- |
| 26 | +│ ├── model.py # < -- extra files needed |
| 27 | +│ ├── readme.md # < -- for the simulation |
| 28 | +│ ├── simulator.jar # < -- to run. |
| 29 | +│ └── simulator.py # < -- |
| 30 | +``` |
| 31 | + |
| 32 | +:::{warning} |
| 33 | +As a general rule: **don't assume that your files will be there after `pip install git+...`**. Files with extensions other than `.py` and `.yml` will be ignored by `pip` during installation. One alternative is to download them from within your `register.py`. |
| 34 | +::: |
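
One way to follow the advice above is a small helper in `register.py` that fetches a missing file the first time it is needed. This is only a sketch under stated assumptions: `ensure_asset`, its `fetch` parameter, and the example URL are hypothetical names for illustration and are not part of `poli`.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlretrieve


def ensure_asset(path: Path, url: str, fetch: Callable = urlretrieve) -> Path:
    """Download `url` to `path` unless the file is already there."""
    path = Path(path)
    if not path.exists():
        # Create the parent folder in case it is missing after installation.
        path.parent.mkdir(parents=True, exist_ok=True)
        fetch(url, str(path))
    return path


# Hypothetical usage inside register.py (placeholder URL, not a real asset):
# THIS_DIR = Path(__file__).parent.resolve()
# ensure_asset(THIS_DIR / "simulator.jar", "https://example.com/simulator.jar")
```

Passing the download function as a parameter keeps the helper easy to test without network access; where you actually host the extra files is up to you.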
| 35 | + |
| 36 | +## A generic `register.py` |
| 37 | + |
| 38 | +A typical `register.py` has the following structure: |
| 39 | + |
| 40 | +```python |
| 41 | +# An average register.py |
| 42 | +from typing import Tuple |
| 43 | + |
| 44 | +import numpy as np |
| 45 | + |
| 46 | +from poli.core.abstract_black_box import AbstractBlackBox |
| 47 | +from poli.core.abstract_problem_factory import AbstractProblemFactory |
| 48 | +from poli.core.problem_setup_information import ProblemSetupInformation |
| 49 | + |
| 50 | + |
| 51 | +class YourBlackBox(AbstractBlackBox): |
| 52 | + def __init__(self, L: int = np.inf): |
| 53 | + super().__init__(L=L) |
| 54 | + |
| 55 | + # The only method you have to define |
| 56 | + def _black_box(self, x: np.ndarray, context: dict = None) -> np.ndarray: |
| 57 | + return ... |
| 58 | + |
| 59 | + |
| 60 | +class YourProblemFactory(AbstractProblemFactory): |
| 61 | + def get_setup_information(self) -> ProblemSetupInformation: |
| 62 | + # The tokens of your alphabet |
| 63 | + alphabet_symbols = [...] |
| 64 | + |
| 65 | + # The encoding |
| 66 | + alphabet = {symbol: i for i, symbol in enumerate(alphabet_symbols)} |
| 67 | + |
| 68 | + # A description of the problem |
| 69 | + # See more in the chapter about defining |
| 70 | + # problem factories. |
| 71 | + return ProblemSetupInformation( |
| 72 | + name="your_problem", # HAS to be the same name as the parent folder. |
| 73 | + max_sequence_length=..., |
| 74 | + aligned=..., |
| 75 | + alphabet=alphabet, |
| 76 | + ) |
| 77 | + |
| 78 | + def create(self, seed: int = 0, keyword_1 = ..., keyword_2 = ...) -> Tuple[AbstractBlackBox, np.ndarray, np.ndarray]: |
| 79 | + # Manipulate keywords you might need at creation time... |
| 80 | + ... |
| 81 | + |
| 82 | + # The maximum length you defined above |
| 83 | + L = self.get_setup_information().get_max_sequence_length() |
| 84 | + |
| 85 | + # Creating your black box function |
| 86 | + f = YourBlackBox(L=L) |
| 87 | + |
| 88 | + # Your first input (an np.array) |
| 89 | + x0 = ... |
| 90 | + |
| 91 | + return f, x0, f(x0) |
| 92 | + |
| 93 | + |
| 94 | +if __name__ == "__main__": |
| 95 | + from poli.core.registry import register_problem |
| 96 | + |
| 97 | + # Once we have created a simple conda environment |
| 98 | + # (see the environment.yml file in this folder), |
| 99 | + # we can register our problem s.t. it uses |
| 100 | + # said conda environment. |
| 101 | + your_problem_factory = YourProblemFactory() |
| 102 | + register_problem( |
| 103 | + your_problem_factory, |
| 104 | + conda_environment_name="your_env", # This is the env specified |
| 105 | + # by your environment.yml |
| 106 | + ) |
| 107 | + |
| 108 | +``` |
| 109 | + |
| 110 | +That is, **the script creates and registers** your problem factory. |
| 111 | + |
| 112 | +:::{warning} |
| 113 | +The name of your problem must match the name of the folder that contains it, exactly. (We advise using `snake_case`.) |
| 114 | +::: |
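
A quick way to catch a mismatch early is to compare the registered name against the folder that holds `register.py`. The helper below is a hypothetical sketch (`name_matches_folder` is not a `poli` function); it only relies on the standard library.

```python
from pathlib import Path


def name_matches_folder(problem_name: str, register_file: str) -> bool:
    """True if `problem_name` equals the folder that contains `register_file`."""
    return Path(register_file).resolve().parent.name == problem_name


# Hypothetical usage at the bottom of register.py:
# assert name_matches_folder("your_problem", __file__)
```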
| 115 | + |
| 116 | +## A generic `environment.yml` |
| 117 | + |
| 118 | +You will usually develop your black-box objective function inside an environment, say `your_env`. List all of its requirements in `environment.yml`. A generic file looks like this: |
| 119 | + |
| 120 | +```yml |
| 121 | +name: your_env |
| 122 | +channels: |
| 123 | + - defaults |
| 124 | +dependencies: |
| 125 | + - python=3.9 |
| 126 | + - pip |
| 127 | + - pip: |
| 128 | + - numpy |
| 129 | + - "git+https://github.com/miguelgondu/poli.git" |
| 130 | + - YOUR OTHER DEPENDENCIES |
| 131 | +``` |
| 132 | + |
| 133 | +This environment will be created (if it doesn't exist yet), and will be used to run `register.py`. |
| 134 | + |
| 135 | +:::{admonition} Why `conda`? |
| 136 | +Conda environments can ship non-Python dependencies alongside Python ones. For example, the `super_mario_bros` environment contains a Java runtime. This is the `environment.yml` for said problem: |
| 137 | + |
| 138 | +```yml |
| 139 | +name: poli__mario |
| 140 | +channels: |
| 141 | + - defaults |
| 142 | + - conda-forge |
| 143 | + - pytorch |
| 144 | +dependencies: |
| 145 | + - python=3.9 |
| 146 | + - conda-forge::openjdk |
| 147 | + - cpuonly |
| 148 | + - pytorch |
| 149 | + - pip |
| 150 | + - pip: |
| 151 | + - numpy |
| 152 | + - click |
| 153 | + - "git+https://github.com/miguelgondu/poli.git" |
| 154 | + |
| 155 | +``` |
| 156 | + |
| 157 | +It installs an `openjdk` that will be added to the path when the environment is active. Moreover, with some workarounds you can install conda and create conda environments inside Colab. |
| 158 | + |
| 159 | +::: |
| 160 | + |
| 161 | +## Testing your installation |
| 162 | + |
| 163 | +If you |
| 164 | +1. have put your new problem inside `poli/objective_repository`, |
| 165 | +2. have a `register.py` that creates and registers your problem factory, |
| 166 | +3. have an `environment.yml` that describes the environment you use, |
| 167 | + |
| 168 | +then you should be set! |
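
The file-related items of this checklist can be verified mechanically before you try to register anything. The sketch below assumes only the two required files shown in the folder structure earlier; `REQUIRED_FILES` and `missing_files` are hypothetical names, not part of `poli`.

```python
from pathlib import Path
from typing import List

# The two files every problem folder needs, per the structure above.
REQUIRED_FILES = ("environment.yml", "register.py")


def missing_files(problem_dir: Path) -> List[str]:
    """Return the required files that are absent from `problem_dir`."""
    problem_dir = Path(problem_dir)
    return [name for name in REQUIRED_FILES if not (problem_dir / name).is_file()]


# Hypothetical usage, run from the root of the repository:
# print(missing_files(Path("poli/objective_repository/your_problem")))
```

An empty list means both files are in place; it says nothing about whether their contents are valid.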
| 169 | + |
| 170 | +You can test that your problem can be registered by creating a fresh environment that includes `poli`, and running |
| 171 | + |
| 172 | +```bash |
| 173 | +$ python -c "from poli.core.registry import get_problems; print(get_problems())" |
| 174 | +[...] # A list, without your problem in it. |
| 175 | +``` |
| 176 | + |
| 177 | +Your problem is not registered yet, so don't fret. You can check _if_ you can register it by running |
| 178 | + |
| 179 | +```bash |
| 180 | +$ python -c "from poli.objective_repository import AVAILABLE_OBJECTIVES; print(AVAILABLE_OBJECTIVES)" |
| 181 | +[..., "your_problem", ...] # If all goes well, you should see "your_problem" here. |
| 182 | +``` |
| 183 | + |
| 184 | +If you can find your problem in this list, then you're set! You should be able to run |
| 185 | + |
| 186 | +```python |
| 187 | +from poli import objective_factory |
| 188 | + |
| 189 | +problem_info, f, x0, y0, _ = objective_factory.create( |
| 190 | + name="your_problem", |
| 191 | + ..., |
| 192 | + keyword_1=..., # <-- Keywords you (maybe) needed |
| 193 | + keyword_2=... # <-- at your_factory.create(...) |
| 194 | +) |
| 195 | +``` |
| 196 | + |
| 197 | +`poli` will ask you to confirm that you want to register your problem (you can force the registration by passing `force_register=True` to `objective_factory.create`). |
| 198 | + |
| 199 | +## Submitting a pull request |
| 200 | + |
| 201 | +If you want to share your problem with us, feel free to create a pull request in our repository: https://github.com/MachineLearningLifeScience/poli |