
Commit 26c6753

improve doc
1 parent fd645c8 commit 26c6753

2 files changed: +125 -96 lines changed

README.md

Lines changed: 103 additions & 84 deletions
Original file line number | Diff line number | Diff line change
@@ -1,12 +1,26 @@
11
MolecularDiffusion
22
==================
3+
4+
A unified generative-AI framework that streamlines 3D molecular diffusion models from training to deployment in data-driven computational chemistry pipelines.
5+
36
![workflow](./images/overview.png)
4-
A 3D Molecular Generation Framework for Data-driven Molecular Applications.
7+
8+
## Key Features
9+
10+
* **End-to-End 3D Molecular Generation Workflow:** Supports training diffusion and predictive models and using them for various molecular generation tasks, all within a unified framework.
11+
* **Curriculum Learning:** An efficient way to train and fine-tune 3D molecular diffusion models.
12+
* **Guidance Tools:** Generate molecules with specific characteristics:
13+
* **Property-Targeted Generation:** Generate molecules with target physicochemical or electronic properties (e.g., excitation energy, dipole moment).
14+
* **Inpainting:** Systematically explore structural variants around reference molecules
15+
* **Outpainting:** Extend a molecule by generating new parts.
16+
* **Command-Line Interface:** A user-friendly CLI for training, generation, and prediction.
517

618

719
[![arXiv](https://img.shields.io/badge/PDF-arXiv-blue)](https://www.arxiv.org/abs/XXXXXX)
820
[![Code](https://img.shields.io/badge/Code-GitHub-red)](https://github.com/pregHosh/MolCraftDiffusion)
921
[![Weights](https://img.shields.io/badge/Weights-HuggingFace-yellow)](https://huggingface.co/pregH/MolecularDiffusion)
22+
[![Dataset](https://img.shields.io/badge/Dataset-HuggingFace-yellow)](https://huggingface.co/pregH/MolecularDiffusion)
23+
1024

1125

1226
Installation
@@ -53,7 +67,7 @@ Pre-trained diffusion models are available at [Hugging Face](https://huggingface
5367

5468
There are two ways to run experiments: using the `MolCraftDiff` command-line tool (recommended) or by executing the Python scripts directly.
5569

56-
### `MolCraftDiff` CLI (Recommended)
70+
### 1. `MolCraftDiff` CLI (Recommended)
5771

5872
Make sure you have installed the package in editable mode as described above, and that you run the commands from the root of the project directory.
5973

@@ -88,7 +102,7 @@ To get help for a specific command:
88102

89103
MolCraftDiff train --help
90104

91-
### Direct Script Execution
105+
### 2. Direct Script Execution
92106

93107
You can also execute the scripts in the `scripts/` directory directly.
94108

@@ -109,25 +123,28 @@ where INTERFERENCE is one of the following: `gen_cfg`, `gen_cfggg`, `gen_conditi
109123
python scripts/predict.py
110124

111125

112-
We also have scripts in `scripts/applications/utils/` for tasks such as xtb optimization, converting xyz to rdkit mol, assess the quality of 3D geomtry, etc.
113-
114-
115-
Tutorials
116-
---------
117-
118-
A comprehensive set of tutorials is available in the [`tutorials/`](./tutorials/) directory, covering topics from basic model training to advanced generation techniques.
126+
### 3. Post-processing the Generated 3D Molecules
127+
The `scripts/applications/utils/` directory contains various utilities for post-processing generated 3D molecules, including:
128+
* **XTB Optimization:** Optimize molecular geometries using the GFN-xTB method (`xtb_optimization.py`).
129+
* **XYZ to RDKit Conversion:** Convert XYZ coordinate files to RDKit molecular objects (`xyz2mol.py`).
130+
* **Metric Computation:** Compute various quality and diversity metrics for generated molecules (`compute_metrics.py`).
131+
* **RMSD Calculation:** Calculate Root Mean Square Deviation (RMSD) for structural comparison (`compute_rmsd.py`).
132+
* **Molecular Similarity:** Assess molecular similarity using different algorithms (`compute_similarity.py`).
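
As a rough illustration of what the XYZ handling and RMSD comparison listed above involve, the sketch below parses XYZ-format text and computes a coordinate RMSD using only the Python standard library. It is not the repository's `xyz2mol.py` or `compute_rmsd.py`: the names `parse_xyz` and `rmsd_no_alignment` are hypothetical, and the calculation assumes both structures share the same atom ordering and are already superimposed (no Kabsch alignment or atom reordering is performed).

```python
import math


def parse_xyz(text):
    """Parse XYZ-format text into (symbols, coords).

    Hypothetical helper: expects an atom-count line, a comment line,
    then one "symbol x y z" line per atom.
    """
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])
    symbols, coords = [], []
    for line in lines[2:2 + n_atoms]:
        parts = line.split()
        symbols.append(parts[0])
        coords.append(tuple(float(v) for v in parts[1:4]))
    if len(coords) != n_atoms:
        raise ValueError("atom count does not match the header line")
    return symbols, coords


def rmsd_no_alignment(coords_a, coords_b):
    """Root mean square deviation between two equally ordered coordinate sets.

    Assumption: the structures are already superimposed, so no rotation or
    translation is fitted before the deviation is measured.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must be non-empty and equal in size")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


# Illustrative input: a water molecule (coordinates in angstrom).
WATER_XYZ = """3
water
O 0.000 0.000 0.000
H 0.757 0.586 0.000
H -0.757 0.586 0.000
"""

if __name__ == "__main__":
    symbols, reference = parse_xyz(WATER_XYZ)
    # Shift every atom by 0.1 angstrom along x to create a second structure.
    shifted = [(x + 0.1, y, z) for x, y, z in reference]
    print(symbols, rmsd_no_alignment(reference, shifted))
```

For generated molecules, which usually differ in orientation, a superposition step (for example the Kabsch algorithm) would normally precede the deviation calculation; it is omitted here to keep the sketch dependency-free.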
119133

120134

121135
Visualization
122136
-------------
123137

124138
Generated 3D molecules and their properties can be visualized using the [3DMolViewer](https://github.com/pregHosh/3DMolViewer) package.
125139

140+
We also recommend [V](https://github.com/briling/v), our lightweight in-house X11 molecular viewer.
141+
126142

127-
Documentation
128-
------------
143+
Tutorials
144+
---------
145+
146+
A comprehensive set of tutorials is available in the [`tutorials/`](./tutorials/) directory, covering topics from basic model training to advanced generation techniques.
129147

130-
For more information, visit: https://moleculardiffusion.readthedocs.io
131148

132149

133150
Project Structure
@@ -176,77 +193,79 @@ Project Structure
176193
│ └── gradient_guidance
177194
│ ├── scheduler.py
178195
│ └── sf_energy_score.py
179-
├── src
180-
│ └── MolecularDiffusion
181-
│ ├── __init__.py
182-
│ ├── _version.py
183-
│ ├── cli.py
184-
│ ├── callbacks
185-
│ │ ├── __init__.py
186-
│ │ └── train_helper.py
187-
│ ├── core
188-
│ │ ├── __init__.py
189-
│ │ ├── core.py
190-
│ │ ├── engine.py
191-
│ │ ├── logger.py
192-
│ │ └── meter.py
193-
│ ├── data
194-
│ │ ├── __init__.py
195-
│ │ ├── dataloader.py
196-
│ │ ├── dataset.py
197-
│ │ └── component
198-
│ │ ├── __init__.py
199-
│ │ ├── dataset.py
200-
│ │ ├── feature.py
201-
│ │ └── pointcloud.py
202-
│ ├── modules
203-
│ │ ├── layers
204-
│ │ │ ├── common.py
205-
│ │ │ ├── conv.py
206-
│ │ │ └── functional.py
207-
│ │ ├── models
208-
│ │ │ ├── __init__.py
209-
│ │ │ ├── egcl.py
210-
│ │ │ ├── egt.py
211-
│ │ │ ├── en_diffusion.py
212-
│ │ │ └── noisemodel.py
213-
│ │ └── tasks
214-
│ │ ├── __init__.py
215-
│ │ ├── diffusion.py
216-
│ │ ├── metrics.py
217-
│ │ ├── regression.py
218-
│ │ └── task.py
219-
│ ├── runmodes
220-
│ │ ├── __init__.py
221-
│ │ ├── handler.py
222-
│ │ ├── generate
223-
│ │ │ ├── __init__.py
224-
│ │ │ └── tasks_generate.py
225-
│ │ └── train
226-
│ │ ├── __init__.py
227-
│ │ ├── data.py
228-
│ │ ├── eval.py
229-
│ │ ├── logger.py
230-
│ │ ├── tasks_egcl.py
231-
│ │ ├── tasks_egt.py
232-
│ │ └── trainer.py
233-
│ └── utils
234-
│ ├── __init__.py
235-
│ ├── comm.py
236-
│ ├── diffusion_utils.py
237-
│ ├── file.py
238-
│ ├── geom_analyzer.py
239-
│ ├── geom_constant.py
240-
│ ├── geom_constraint.py
241-
│ ├── geom_metrics.py
242-
│ ├── geom_utils.py
243-
│ ├── io.py
244-
│ ├── molgraph_utils.py
245-
│ ├── plot_function.py
246-
│ ├── pretty.py
247-
│ ├── sascore.py
248-
│ ├── smilify.py
249-
│ └── torch.py
196+
└── src
197+
└── MolecularDiffusion
198+
├── __init__.py
199+
├── _version.py
200+
├── cli.py
201+
├── molcraftdiff.py
202+
├── callbacks
203+
│ ├── __init__.py
204+
│ └── train_helper.py
205+
├── core
206+
│ ├── __init__.py
207+
│ ├── core.py
208+
│ ├── engine.py
209+
│ ├── logger.py
210+
│ └── meter.py
211+
├── data
212+
│ ├── __init__.py
213+
│ ├── dataloader.py
214+
│ ├── dataset.py
215+
│ └── component
216+
│ ├── __init__.py
217+
│ ├── dataset.py
218+
│ ├── feature.py
219+
│ └── pointcloud.py
220+
├── modules
221+
│ ├── __init__.py
222+
│ ├── layers
223+
│ │ ├── __init__.py
224+
│ │ ├── common.py
225+
│ │ ├── conv.py
226+
│ │ └── functional.py
227+
│ ├── models
228+
│ │ ├── __init__.py
229+
│ │ ├── egcl.py
230+
│ │ ├── egt.py
231+
│ │ ├── en_diffusion.py
232+
│ │ └── noisemodel.py
233+
│ └── tasks
234+
│ ├── __init__.py
235+
│ ├── diffusion.py
236+
│ ├── metrics.py
237+
│ ├── regression.py
238+
│ └── task.py
239+
├── runmodes
240+
│ ├── __init__.py
241+
│ ├── generate
242+
│ │ ├── __init__.py
243+
│ │ └── tasks_generate.py
244+
│ └── train
245+
│ ├── __init__.py
246+
│ ├── data.py
247+
│ ├── eval.py
248+
│ ├── logger.py
249+
│ ├── tasks_egcl.py
250+
│ ├── tasks_egt.py
251+
│ └── trainer.py
252+
└── utils
253+
├── __init__.py
254+
├── comm.py
255+
├── diffusion_utils.py
256+
├── file.py
257+
├── geom_analyzer.py
258+
├── geom_constant.py
259+
├── geom_constraint.py
260+
├── geom_metrics.py
261+
├── geom_utils.py
262+
├── io.py
263+
├── molgraph_utils.py
264+
├── plot_function.py
265+
├── pretty.py
266+
├── sascore.py
267+
├── smilify.py
268+
└── torch.py
250269
```
251270

252271

pyproject.toml

Lines changed: 22 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -17,18 +17,28 @@ maintainers = [
1717
]
1818

1919
dependencies = [
20-
"torch>=2.0.0",
21-
"numpy>=1.21.0",
22-
"scipy>=1.7.0",
23-
"rdkit-pypi>=2022.9.1",
24-
"networkx>=2.8.0",
25-
"matplotlib>=3.5.0",
26-
"pandas>=1.3.0",
27-
"scikit-learn>=1.0.0",
28-
"tqdm>=4.62.0",
29-
"pyyaml>=6.0",
30-
"omegaconf>=2.3.0",
31-
"fire>=0.5.0",
20+
# "pytorch-cuda=12.1", # Conda-specific package
21+
"openbabel",
22+
"fire",
23+
"decorator",
24+
"numpy==1.26.4",
25+
"scipy",
26+
"rdkit-pypi",
27+
"posebusters",
28+
"networkx",
29+
"matplotlib",
30+
"pandas",
31+
"scikit-learn",
32+
"tqdm",
33+
"pyyaml",
34+
"omegaconf",
35+
"ase",
36+
"morfeus",
37+
"cosymlib",
38+
"morfeus-ml",
39+
"wandb",
40+
"hydra-colorlog",
41+
"rootutils",
3242
]
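
Because this change replaces version ranges with mostly unpinned names plus one exact pin (`numpy==1.26.4`), it can be useful to confirm what is actually installed in an environment. The following is a minimal sketch using the standard-library `importlib.metadata`; the helper name `check_pins` and the `PINNED` mapping are hypothetical and are not part of the project.

```python
from importlib import metadata

# Assumption: only the exact pin visible in the diff above is checked.
PINNED = {"numpy": "1.26.4"}


def check_pins(pins):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed, installed == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_pins(PINNED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, installed {installed} [{status}]")
```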
3343

3444
[project.optional-dependencies]
