Updates the installation from dev to master

miguelgondu · miguelgondu · commit da7a7ef8fbe7 · 2023-08-30T09:42:30.000+02:00
diff --git a/docs/protein-optimization/contributing/a_new_problem.md b/docs/protein-optimization/contributing/a_new_problem.md
@@ -138,7 +138,7 @@ dependencies:
   - pip
   - pip:
     - numpy
-    - "git+https://github.com/MachineLearningLifeScience/poli.git"
+    - "git+https://github.com/MachineLearningLifeScience/poli.git@master"
     - YOUR OTHER DEPENDENCIES
 ```
 
@@ -162,7 +162,7 @@ dependencies:
   - pip:
     - numpy
     - click
-    - "git+https://github.com/MachineLearningLifeScience/poli.git"
+    - "git+https://github.com/MachineLearningLifeScience/poli.git@master"
 
 ```
 
diff --git a/docs/protein-optimization/getting_started/getting_started.md b/docs/protein-optimization/getting_started/getting_started.md
@@ -39,7 +39,7 @@ If you are not interested in debugging, you can simply run
 
 ```bash
 # in the poli-base env
-pip install git+https://github.com/MachineLearningLifeScience/poli.git
+pip install git+https://github.com/MachineLearningLifeScience/poli.git@master
 ```
 
 :::
@@ -55,6 +55,8 @@ $ cd ./poli
 $ pip install -e .
 ```
 
+A stable version can be found on the `master` branch; the bleeding-edge version is on `dev`.
+
 :::
 
 ::::
@@ -71,10 +73,9 @@ To make sure everything went well, you can test your `poli` installation by runn
 
 ```bash
 $ python -c "from poli.core.registry import get_problems ; print(get_problems())"
-[]
+['aloha', 'white_noise']
 ```
-
-If the installation isn't fresh/the only one in your system, you might actually get some registered problems.
+In general: **all problems available will appear**. These two (`aloha` and `white_noise`) are available by default since their only requirements are `numpy` and `poli`. If the installation isn't fresh/the only one in your system, you might actually get more problems.
 
 ## Running `poli` on Colab
 
@@ -83,12 +84,7 @@ With a little effort, you can run `poli` on Colab. [Check this example](https://
 
 ## Your first `poli` script
 
-As you might have noticed, you can get a list of the registered problems using the `get_problems` method inside `poli.core.registry`. You can also get a list of objective functions available for installing/registration using `from poli.objective_repository import AVAILABLE_OBJECTIVES`:
-
-```bash
-$ python -c "from poli.objective_repository import AVAILABLE_OBJECTIVES ; print(AVAILABLE_OBJECTIVES)"
-[..., 'white_noise']
-```
+As you might have noticed, you can get a list of the registered problems using the `get_problems` method inside `poli.core.registry`.
 
 Let's write a small script that installs `white_noise` from the repository:
 
@@ -104,13 +100,9 @@ for _ in range(5):
     print(f"f(x) = {f(x)}")
 ```
 
-If we run this script, `poli` will ask us to confirm that we want to register/install `"white_noise"` as an objective function (you can deactivate this confirmation step by passing the flag `force_register=True` to `.create`). Afterwards, it will print 5 evaluations of the objective function on the same input.
+If we run this script, it will print 5 evaluations of the objective function on the same input.
 
-:::{warning}
-
-In the registration process, `poli` creates a `conda` environment, and **executes a shell script**. Be wary of objective functions you find in the wild.
-
-:::
+`white_noise` is a trivial example. We include plenty of examples on how to register objective functions that are more complex, including e.g. [computing the Quantitative Estimate of Druglikeness of a small molecule](../using_poli/objective_repository/rdkit_qed.md) or, if you have the `foldx` simulator installed, [how to compute the stability of a protein given a `.pdb` file](../using_poli/optimization_examples/protein-stability-foldx/optimizing_protein_stability.ipynb).
 
 ## Conclusion
 
diff --git a/docs/protein-optimization/understanding_foldx/01-single-mutation-using-foldx/index.ipynb b/docs/protein-optimization/understanding_foldx/01-single-mutation-using-foldx/index.ipynb
@@ -899,7 +899,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.16"
+   "version": "3.9.17"
   },
   "orig_nbformat": 4
  },
diff --git a/docs/protein-optimization/using_poli/objective_repository/rdkit_logp.md b/docs/protein-optimization/using_poli/objective_repository/rdkit_logp.md
@@ -56,7 +56,7 @@ problem_info, f, x0, y0, run_info = objective_factory.create(
 x = np.array([[1]])
 
 # Querying:
-print(f(x))  # Should be close to 0.35978
+print(f(x))  # Should be close to 0.6361
 ```
 
 :::
diff --git a/docs/protein-optimization/using_poli/the_basics/intro_to_poli.ipynb b/docs/protein-optimization/using_poli/the_basics/intro_to_poli.ipynb
@@ -60,17 +60,13 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "metadata": {
-    "vscode": {
-     "languageId": "plaintext"
-    }
-   },
+   "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "['white_noise']\n"
+      "['aloha', 'white_noise']\n"
      ]
     }
    ],
@@ -83,7 +79,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The output is a list of registered problems. If the function that you're interested in is not registered, you can check whether we have it in `poli`'s internal repository:"
+    "The output is a list of problems you can run without installing anything further. If the function that you're interested in is not in this list, you can check whether we have it in `poli`'s internal repository:"
    ]
   },
   {
@@ -95,20 +91,21 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "['aloha', 'super_mario_bros', 'white_noise']\n"
+      "['aloha', 'foldx_rfp_lambo', 'foldx_sasa', 'foldx_stability', 'foldx_stability_and_sasa', 'rdkit_logp', 'rdkit_qed', 'super_mario_bros', 'white_noise']\n"
      ]
     }
    ],
    "source": [
-    "from poli.objective_repository import AVAILABLE_OBJECTIVES\n",
-    "print(AVAILABLE_OBJECTIVES)"
+    "print(get_problems(include_repository=True))"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "If the function isn't there, **implement it yourself!** An example of how to do this can be found in `poli_baselines/examples/00_a_simple_objective_function_registration`, or in our chapter on [registering optimization functions](./registering_an_objective_function.md).\n",
+    "Each one of these objective functions can be run without modifying your environment, but you might need to check their prerequisites. We do our best to keep the list updated in [the introduction page](../../index.md), where you can find links to the requirements and installation descriptions for each one of these.\n",
+    "\n",
+    "If the function still isn't there, **implement it yourself!** An example of how to do this can be found in `poli_baselines/examples/00_a_simple_objective_function_registration`, or in our chapter on [registering optimization functions](./registering_an_objective_function.md).\n",
     "\n",
     "In what follows, we will use the `white_noise` objective function. You could drop-in your own function if desired."
    ]
@@ -121,10 +118,7 @@
    "source": [
     "from poli import objective_factory\n",
     "\n",
-    "problem_info, f, x0, y0, _ = objective_factory.create(\n",
-    "    name=\"white_noise\",\n",
-    "    force_register=True,\n",
-    ")"
+    "problem_info, f, x0, y0, _ = objective_factory.create(name=\"white_noise\")"
    ]
   },
   {
@@ -152,15 +146,15 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
       "x0: [['1' '2' '3']]\n",
-      "y0: [[1.58015034]]\n"
+      "y0: [[-0.12847371]]\n"
      ]
     }
    ],
@@ -187,23 +181,23 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "{'x': [array([['1', '2', '3']], dtype='<U1')], 'y': [array([[1.58015034]])]}\n"
+      "{'x': [array([['1', '2', '3']], dtype='<U1')], 'y': [array([[-0.12847371]])]}\n"
      ]
     },
     {
      "data": {
       "text/plain": [
-       "array([['1', '3', '3']], dtype='<U1')"
+       "array([['1', '8', '3']], dtype='<U1')"
       ]
      },
-     "execution_count": 14,
+     "execution_count": 6,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -222,25 +216,16 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "{'0': 0,\n",
-       " '1': 1,\n",
-       " '2': 2,\n",
-       " '3': 3,\n",
-       " '4': 4,\n",
-       " '5': 5,\n",
-       " '6': 6,\n",
-       " '7': 7,\n",
-       " '8': 8,\n",
-       " '9': 9}"
+       "['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']"
       ]
      },
-     "execution_count": 16,
+     "execution_count": 7,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -272,7 +257,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -281,14 +266,14 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 9,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "[['0' '2' '3']]\n"
+      "[['0' '2' '8']]\n"
      ]
     }
    ],
diff --git a/docs/protein-optimization/using_poli/the_basics/optimizing_an_objective_function.md b/docs/protein-optimization/using_poli/the_basics/optimizing_an_objective_function.md
@@ -19,7 +19,7 @@ By the end, you should have registered the `aloha` problem.
 
 ## Is aloha registered?
 
-We can start by checking that the `aloha` problem is indeed among the registered objectives:
+We can start by checking that the `aloha` problem is indeed among the available objectives:
 
 ```python
 # optimizing_aloha.py
@@ -34,7 +34,7 @@ This script should run without raising any problems.
 :::{admonition} Is aloha not registered?
 :class: dropdown
 
-If the past snippet fails and raises an `AssertionError`, then it's likely you haven't registered `aloha` as a problem. Check [the first chapter for the process of registering this problem](./registering_an_objective_function.md).
+If the past snippet fails and raises an `AssertionError`, then it's likely you haven't registered `aloha` as a problem, or that you don't have `numpy` installed. Check [the first chapter for the process of registering this problem](./registering_an_objective_function.md).
 
 :::
 
@@ -74,9 +74,10 @@ Once instantiated, the solver can optimize our `aloha` problem easily:
 ```python
 # optimizing_aloha.py
 
+# What we discuss above
 ...
 
-if __name__ == "__main__"
+if __name__ == "__main__":
     ...
     
     # Running the optimization for 1000 steps,
diff --git a/docs/protein-optimization/using_poli/the_basics/registering_an_objective_function.md b/docs/protein-optimization/using_poli/the_basics/registering_an_objective_function.md
@@ -3,7 +3,7 @@
 ```{contents}
 ```
 
-With `poli`, you can define and register black box objective functions. This page shows you how. For the entire script, check [`registering_aloha.py` TODO:ADD]() in the examples.
+With `poli`, you can define and register black box objective functions. This page shows you how. For the entire script, check [`registering_aloha.py`](https://github.com/MachineLearningLifeScience/poli/blob/master/examples/a_simple_objective_function_registration/registering_aloha.py) in the examples of `poli`.
 
 ## An example of a discrete black box function
 
@@ -48,9 +48,7 @@ class AlohaBlackBox(AbstractBlackBox):
         return np.sum(matches, axis=1, keepdims=True)
 ```
 
-As the code says, the only method you need to define is `_black_box(x: np.ndarray, context: dict = None)`, returning a numpy array of size `[1, 1]`. `AbstractBlackBox` takes it from there, making sure that the length of the inputs is correct and matches `L`. You can opt-out of length-checking by saying `L=np.inf` in the `__init__`.[^details-on-black-box]
-
-[^details-on-black-box]: You can check the exact implementation in [TODOADD]().
+As the code says, the only method you need to define is `_black_box(x: np.ndarray, context: dict = None)`, returning a numpy array of size `[1, 1]`. `AbstractBlackBox` takes it from there, making sure that the length of the inputs is correct and matches `L`. You can opt-out of length-checking by saying `L=np.inf` in the `__init__`.
 
 Black-box functions are wrapped around **problems**. A problem contains not only a black-box objective function, but also the relevant information for the discrete problem: the alphabet, maximum sequence length, whether the sequences are aligned... This next section discusses how to define problem factories, which create instances of the problem.
 
@@ -86,7 +84,7 @@ class AlohaProblemFactory(AbstractProblemFactory):
         alphabet = {symbol: i for i, symbol in enumerate(alphabet_symbols)}
 
         return ProblemSetupInformation(
-            name="aloha",
+            name="our_aloha",  # To separate it from the "aloha" problem
             max_sequence_length=5,
             aligned=True,
             alphabet=alphabet,
@@ -108,6 +106,12 @@ class AlohaProblemFactory(AbstractProblemFactory):
 
 **and that's it!** Once you have defined your problem factory, you need to register it to be able to call it on-the-go.
 
+:::{note}
+The exact implementation of the `aloha` problem is slightly different: we allow users to e.g. query either integers or strings. Integers are interpreted as the token ids according to the alphabet.
+
+Check the exact implementation on [`poli/objective_repository/aloha/register.py`](https://github.com/MachineLearningLifeScience/poli/blob/master/src/poli/objective_repository/aloha/register.py).
+:::
+
 ## Registering the problem factory
 
 ### Creating a conda environment for your problem
@@ -126,15 +130,15 @@ dependencies:
   - pip
   - pip:
     - numpy
-    - "git+https://github.com/MachineLearningLifeScience/poli.git"
+    - "git+https://github.com/MachineLearningLifeScience/poli.git@master"
 ```
 
 :::{admonition} Why conda? Why an entire environment?
 :class: dropdown
 
 Using `conda` allows us to package more than Python dependencies. In some examples you might find yourself needing e.g. a Java runtime. With `conda`, we can create environments that *include* these dependencies.
 
-For an example, [check the chapter on registering *Super Mario Bros* as a problem factory TODO:ADD]().
+For an example, [check the script that registers *Super Mario Bros* as a problem factory](https://github.com/MachineLearningLifeScience/poli/blob/master/src/poli/objective_repository/super_mario_bros/register.py).
 
 :::
 
@@ -195,7 +199,7 @@ Let's make sure that the problem is registered. The list of registered problems
 from poli.core.registry import get_problems
 
 if __name__ == "__main__":
-    print("aloha" in get_problems())
+    print("our_aloha" in get_problems())
 
 ```
 
@@ -216,7 +220,7 @@ from poli import objective_factory
 if __name__ == "__main__":
     # Creating an instance of the problem
     problem_info, f, x0, y0, run_info = objective_factory.create(
-        name="aloha", caller_info=None, observer=None
+        name="our_aloha", caller_info=None, observer=None
     )
     print(x0, y0)
 
