Commit 47eaa2e

Adds references to some objective function pages, and includes a bibliography
1 parent 7ba4b09 commit 47eaa2e

9 files changed: +81 -31 lines changed

docs/protein-optimization/_config.yml

Lines changed: 4 additions & 0 deletions
@@ -19,6 +19,10 @@ latex:
 bibtex_bibfiles:
 - references.bib

+sphinx:
+  config:
+    bibtex_reference_style: author_year
+
 # Information about where the book exists on the web
 # repository:
 # url: https://github.com/executablebooks/jupyter-book # Online location of your book
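
For orientation, Jupyter Book forwards everything under `sphinx: config:` to the Sphinx configuration it generates, so the added lines correspond roughly to the settings below. This is only a sketch, assuming a hand-written `conf.py` for a plain Sphinx project (Jupyter Book itself does not need one); the `extensions` entry is an assumption for that setting, while `bibtex_bibfiles` and `bibtex_reference_style` are the sphinxcontrib-bibtex options named in the diff.

```python
# Hypothetical plain-Sphinx conf.py mirroring the _config.yml change above.
# Assumption: sphinxcontrib-bibtex is installed; Jupyter Book normally
# enables it on its own when `bibtex_bibfiles` is set.
extensions = ["sphinxcontrib.bibtex"]

# Bibliography database(s), as listed under `bibtex_bibfiles` in _config.yml.
bibtex_bibfiles = ["references.bib"]

# Render in-text citations as author-year instead of the default label style.
bibtex_reference_style = "author_year"
```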

docs/protein-optimization/_toc.yml

Lines changed: 6 additions & 0 deletions
@@ -19,17 +19,23 @@ parts:
 - file: using_poli/optimization_examples/protein-stability-foldx/optimizing_protein_stability.ipynb
 - caption: Some objective functions available
 chapters:
+- file: using_poli/objective_repository/all_objectives.md
 - file: using_poli/objective_repository/white_noise.md
 - file: using_poli/objective_repository/aloha.md
 - file: using_poli/objective_repository/rdkit_qed.md
 - file: using_poli/objective_repository/rdkit_logp.md
+- file: using_poli/objective_repository/ddr3_docking.md
+- file: using_poli/objective_repository/penalized_logp_lambo.md
 - file: using_poli/objective_repository/foldx_stability.md
 - file: using_poli/objective_repository/foldx_sasa.md
 - file: using_poli/objective_repository/super_mario_bros.md
 - caption: "Contributing"
 chapters:
 - file: contributing/a_new_problem.md
 - file: contributing/a_new_solver.md
+- caption: "Bibliography"
+chapters:
+- file: bibliography.md
 - caption: "Appendix: Understanding foldx"
 chapters:
 - file: understanding_foldx/00-installing-foldx.md

docs/protein-optimization/index.md

Lines changed: 17 additions & 3 deletions
@@ -48,10 +48,16 @@ Computing the QED using `RDKit`.
 Computing the log-quotient of solubilities using `RDKit`.
 :::

-:::{grid-item-card} TDC oracles [WIP]
-:link: ./using_poli/objective_repository/tdc_oracles.html
+:::{grid-item-card} Penalized Log-solubility (LogP, using `lambo`)
+:link: ./using_poli/objective_repository/penalized_logp_lambo.html
 :columns: 6
-Some of the oracles provided by the Therapeutics Data Commons. [WIP]
+Computing the penalized log-quotient of solubilities using `lambo`'s implementation.
+:::
+
+:::{grid-item-card} DDR3 (or 3pbl) docking (using `tdc`)
+:link: ./using_poli/objective_repository/ddr3_docking.html
+:columns: 6
+A wrapper around the Therapeutics Data Commons implementation of `3pbl` docking.
 :::

 ::::
@@ -153,3 +159,11 @@ How to contribute a new black-box optimization algorithm.


 ::::
+
+
+## References
+
+:::{bibliography}
+:style: alpha
+
+:::

docs/protein-optimization/references.bib

Lines changed: 25 additions & 5 deletions
@@ -2,10 +2,30 @@
 ---

 @article{stanton2022accelerating,
-title={Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders},
-author={Stanton, Samuel and Maddox, Wesley and Gruver, Nate and Maffettone, Phillip and Delaney, Emily and Greenside, Peyton and Wilson, Andrew Gordon},
-journal={arXiv preprint arXiv:2203.12742},
-year={2022}
+title = {Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders},
+author = {Stanton, Samuel and Maddox, Wesley and Gruver, Nate and Maffettone, Phillip and Delaney, Emily and Greenside, Peyton and Wilson, Andrew Gordon},
+journal = {arXiv preprint arXiv:2203.12742},
+year = {2022}
 }

-@article{ShrakeRupley:SASA:1973, title={Environment and exposure to solvent of protein atoms. Lysozyme and insulin}, volume={79}, ISSN={00222836}, DOI={10.1016/0022-2836(73)90011-9}, number={2}, journal={Journal of Molecular Biology}, author={Shrake, A. and Rupley, J.A.}, year={1973}, month={Sep}, pages={351–371}, language={en} }
+@article{ShrakeRupley:SASA:1973,
+title = {Environment and exposure to solvent of protein atoms. Lysozyme and insulin},
+volume = {79},
+issn = {00222836},
+doi = {10.1016/0022-2836(73)90011-9},
+number = {2},
+journal = {Journal of Molecular Biology},
+author = {Shrake, A. and Rupley, J.A.},
+year = {1973},
+month = {Sep},
+pages = {351–371},
+language = {en}
+}
+
+@inproceedings{huang:TDC:2021,
+title = {Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development},
+author = {Kexin Huang and Tianfan Fu and Wenhao Gao and Yue Zhao and Yusuf H Roohani and Jure Leskovec and Connor W. Coley and Cao Xiao and Jimeng Sun and Marinka Zitnik},
+booktitle = {Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
+year = {2021},
+url = {https://openreview.net/forum?id=8nvgnORnoWr}
+}
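
Because a `{cite:p}` role only resolves when its key exists in `references.bib`, a quick consistency check can catch a mistyped key before the book is built. The sketch below is not part of the commit: the helper names `bib_keys` and `cited_keys` are hypothetical, the regular expressions only handle simple entries like the ones in this file, and the embedded strings are abbreviated stand-ins for the real files.

```python
import re

# Abbreviated stand-in for references.bib (keys and years taken from the diff).
BIB_TEXT = """
@article{stanton2022accelerating,
  year = {2022}
}
@article{ShrakeRupley:SASA:1973,
  year = {1973}
}
@inproceedings{huang:TDC:2021,
  year = {2021}
}
"""

# Stand-in for a Markdown page that cites one of the entries.
PAGE_TEXT = "the TDC docking oracles {cite:p}`huang:TDC:2021`."


def bib_keys(bib_text):
    """Return the citation keys declared in a BibTeX string."""
    return set(re.findall(r"@\w+\s*\{\s*([^,\s]+)\s*,", bib_text))


def cited_keys(page_text):
    """Return the keys used in {cite}, {cite:p}, or {cite:t} roles."""
    keys = set()
    for group in re.findall(r"\{cite(?::\w+)?\}`([^`]+)`", page_text):
        # A single role may list several comma-separated keys.
        keys.update(key.strip() for key in group.split(","))
    return keys


# Keys that are cited but never declared; empty when everything resolves.
missing = cited_keys(PAGE_TEXT) - bib_keys(BIB_TEXT)
print(sorted(missing))
```

In a real checkout, the two strings would be read from `references.bib` and from each Markdown or notebook source instead of being hard-coded.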

docs/protein-optimization/requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
+docutils==0.17.1
 jupyter-book
 matplotlib
 numpy

docs/protein-optimization/understanding_foldx/01-single-mutation-using-foldx/index.ipynb

Lines changed: 0 additions & 17 deletions
@@ -864,23 +864,6 @@
 "\n",
 "These are quantities we want to optimize: the lower the energy, the more stable a protein might be, and higher SASA correlates with e.g. length of fluorescence in certain proteins. Indeed, these are the quantities described and optimized in one of the tasks presented in [LaMBO](https://github.com/samuelstanton/lambo) {cite:p}`stanton2022accelerating`."
 ]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"## Bibliography"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-":::{bibliography}\n",
-":style: alpha\n",
-"\n",
-":::"
-]
 }
 ],
 "metadata": {

docs/protein-optimization/using_poli/objective_repository/all_objectives.md

Lines changed: 12 additions & 0 deletions
@@ -41,6 +41,18 @@ Computing the QED using `RDKit`.
 Computing the log-quotient of solubilities using `RDKit`.
 :::

+:::{grid-item-card} Penalized Log-solubility (LogP, using `lambo`)
+:link: ./using_poli/objective_repository/penalized_logp_lambo.html
+:columns: 6
+Computing the penalized log-quotient of solubilities using `lambo`'s implementation.
+:::
+
+:::{grid-item-card} DDR3 (or 3pbl) docking (using `tdc`)
+:link: ./using_poli/objective_repository/ddr3_docking.html
+:columns: 6
+A wrapper around the Therapeutics Data Commons implementation of 3pbl docking.
+:::
+
 :::{grid-item-card} TDC oracles [WIP]
 :link: ./using_poli/objective_repository/tdc_oracles.html
 :columns: 6

docs/protein-optimization/using_poli/objective_repository/ddr3_docking.md

Lines changed: 7 additions & 4 deletions
@@ -1,12 +1,12 @@
 # DDR3 docking (using TDC)

 ![Type of objective function: discrete](https://img.shields.io/badge/Type-discrete_inputs-blue)
-![Environment to run this objective function: poli lambo](https://img.shields.io/badge/Environment-poli____base-teal
+![Environment to run this objective function: poli lambo](https://img.shields.io/badge/Environment-poli____lambo-teal
 )

 ## About

-This objective function computes the docking score of a small molecule w.r.t. the protein `3pbl`, [which is the canonical example in the Therapeutics Data Commons' docking oracles](https://tdcommons.ai/functions/oracles/#docking-scores). Under the hood, it uses pyscreener, vina and the ADFR suite.
+This objective function computes the docking score of a small molecule w.r.t. the protein `3pbl`, [which is the canonical example in the Therapeutics Data Commons' docking oracles](https://tdcommons.ai/functions/oracles/#docking-scores) {cite:p}`huang:TDC:2021`. Under the hood, it uses pyscreener, vina and the ADFR suite.

 ## Prerequisites

@@ -113,6 +113,9 @@ print(y0) # [[-4.1]]

 ::::

-## See also
+<!-- ## References

-- [an internal link of sorts]()
+:::{bibliography}
+:style: alpha
+
+::: -->

docs/protein-optimization/using_poli/objective_repository/penalized_logp_lambo.md

Lines changed: 9 additions & 2 deletions
@@ -1,12 +1,12 @@
-# Objective function name
+# Penalized logP (using `lambo`)

 ![Type of objective function: discrete](https://img.shields.io/badge/Type-discrete_inputs-blue)
 ![Environment to run this objective function: poli lambo](https://img.shields.io/badge/Environment-poli____lambo-teal
 )

 ## About

-This objective function computes the penalized logP _exactly_ as is done in the `lambo` implementation.[^1]
+This objective function computes the penalized logP _exactly_ as is done in the `lambo` implementation[^1] {cite:p}`stanton2022accelerating`.

 [^1]: If you look closely at their implementation, you will notice that it includes the empirical means and standard deviations of the ZINC dataset for the values it computes.

@@ -77,3 +77,10 @@ f.terminate()

 - `penalized: bool = True`. Whether we are evaluating penalized logP or not.
 - `string_representation: str = "SMILES"`. Can be either `"SMILES"` or `"SELFIES"`.
+
+<!-- ## References
+
+:::{bibliography}
+:style: alpha
+
+::: -->
