
A software package for creating and manipulating graphs of molecular groups.
The fundamental data structure behind this package is based on a port graph. Look here for an excellent description of the data structure.
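To make the idea concrete, here is a minimal sketch of a port graph in plain Python. It is not Grouper's implementation; the class `PortGraph` and its methods are hypothetical names chosen for illustration. The sketch assumes the usual port-graph rule that every edge joins one specific port on one node to one specific port on another, and that a port can hold at most one edge.

```python
from dataclasses import dataclass, field


@dataclass
class PortGraph:
    """Toy port graph: nodes own numbered ports, edges join two ports."""

    n_ports: dict = field(default_factory=dict)  # node id -> number of ports
    edges: list = field(default_factory=list)    # ((node, port), (node, port))

    def add_node(self, n_ports: int) -> int:
        # Node ids are assigned in insertion order, starting at 0
        node_id = len(self.n_ports)
        self.n_ports[node_id] = n_ports
        return node_id

    def used_ports(self) -> set:
        # Every (node, port) pair that already terminates an edge
        return {end for edge in self.edges for end in edge}

    def add_edge(self, src: tuple, dst: tuple) -> None:
        for node, port in (src, dst):
            if node not in self.n_ports or not 0 <= port < self.n_ports[node]:
                raise ValueError(f"no port {port} on node {node}")
            if (node, port) in self.used_ports():
                raise ValueError(f"port {port} on node {node} is already occupied")
        if src == dst:
            raise ValueError("an edge must join two distinct ports")
        self.edges.append((src, dst))

    def free_ports(self, node: int) -> list:
        used = self.used_ports()
        return [p for p in range(self.n_ports[node]) if (node, p) not in used]


g = PortGraph()
a, b = g.add_node(3), g.add_node(3)
g.add_edge((a, 0), (b, 0))
print(g.free_ports(a))  # [1, 2]
```

Tracking which ports are occupied is what separates a port graph from an ordinary graph: two nodes can be connected in several distinct ways depending on which ports are used.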
git clone https://github.com/mosdef-hub/Grouper
cd Grouper
conda env create -f environment.yml # or mamba/micromamba
conda activate grouper-dev
pip install -e .
python -m pytest Grouper/tests
from Grouper import GroupGraph
gG = GroupGraph()
# Adding nodes
gG.add_node(type = 'nitrogen', pattern = 'N', hubs = [0,0,0], is_smarts=False) # default of is_smarts is False
gG.add_node('nitrogen') # Once the type of the node has been specified we can use it again
gG.add_node(type = '', pattern = '[N]', hubs = [0,0,0], is_smarts=True) # Alternatively we can just use smarts
# Adding edges
gG.add_edge(src = (0,0), dst = (1,0), order=1) # In the format ((nodeID, srcPort), (nodeID, dstPort), bondOrder)
gG.add_edge(src = (1,1), dst = (2,0))
gG.add_edge(src = (2,1), dst = (0,1))
# Will make
#   N
#  / \
# N - N
smiles = gG.to_smiles()
atomG = gG.to_atomic_graph()
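The following sketch illustrates how group-level edges could map onto atom-level bonds. It rests on two assumptions that are not confirmed by the text above: that each entry in `hubs` is the index, within the pattern, of the atom that the port attaches to, and that an edge without an explicit `order` is a single bond. The function `group_edges_to_atom_bonds` is a hypothetical helper and does not reflect how `to_atomic_graph` is implemented.

```python
def group_edges_to_atom_bonds(groups, edges):
    """Translate port-level edges into atom-level bonds.

    groups: list of (n_atoms, hubs), one entry per node, in node-id order
    edges:  list of ((node, port), (node, port), order)
    """
    # Global index of the first atom of every group
    offsets = []
    total = 0
    for n_atoms, _hubs in groups:
        offsets.append(total)
        total += n_atoms

    bonds = []
    for (src_node, src_port), (dst_node, dst_port), order in edges:
        # hubs[port] is assumed to be the local atom index behind that port
        src_atom = offsets[src_node] + groups[src_node][1][src_port]
        dst_atom = offsets[dst_node] + groups[dst_node][1][dst_port]
        bonds.append((src_atom, dst_atom, order))
    return bonds


# The three-nitrogen ring from the example: one atom per group, three ports each
nitrogen = (1, [0, 0, 0])
ring_edges = [((0, 0), (1, 0), 1), ((1, 1), (2, 0), 1), ((2, 1), (0, 1), 1)]
bonds = group_edges_to_atom_bonds([nitrogen] * 3, ring_edges)
print(bonds)  # [(0, 1, 1), (1, 2, 1), (2, 0, 1)]
```

Under the same assumption, a group whose ports sit on different atoms would send each port to a different atom index, so the choice of port changes the resulting molecule.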
import Grouper
from Grouper import Group, exhaustive_generate
node_defs = set()
# Define our node types that we will use to build our chemistries
node_defs.add(Group('nitrogen', 'N', [0,0,0]))
node_defs.add(Group('carbon', 'C', [0,0,0,0]))
node_defs.add(Group('oxygen', 'O', [0,0]))
node_defs.add(Group('benzene', 'c1ccccc1', [0,1,2,3,4,5]))
# Call method to enumerate possibilities
exhausted_space = exhaustive_generate(
    n_nodes = 4,
    node_defs = node_defs,
    input_file_path = '',
    nauty_path = '/path/to/nauty_X_X_X',
    num_procs = -1, # -1 utilizes all available CPUs
)
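To give a feel for how quickly the search space grows, the sketch below counts only the unordered choices of node types for a four-node graph built from the four groups defined above. It is an illustration with the standard library and says nothing about how `exhaustive_generate` works internally. Each composition can additionally be wired together through its ports in many ways, many of them isomorphic to one another, which is presumably why a canonical-labelling tool such as nauty is involved.

```python
from itertools import combinations_with_replacement
from math import comb

node_types = ["nitrogen", "carbon", "oxygen", "benzene"]
n_nodes = 4

# Unordered selections of n_nodes groups, repetition allowed
compositions = list(combinations_with_replacement(node_types, n_nodes))
print(len(compositions))  # 35

# Same number from the multiset formula C(k + n - 1, n)
assert len(compositions) == comb(len(node_types) + n_nodes - 1, n_nodes)
```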
from math import isnan

import rdkit.Chem
import torch
from rdkit.Chem import Descriptors

from Grouper.utils import convert_to_nx

def node_descriptor_generator(node_smiles):
    # Compute RDKit descriptors for a single group and return them as a tensor
    mol = rdkit.Chem.MolFromSmiles(node_smiles)
    desc = Descriptors.CalcMolDescriptors(mol)
    # Drop missing values; test for None first because isnan(None) raises TypeError
    desc = {k: v for k, v in desc.items() if v is not None and not isnan(v)}
    desc = list(desc.values()) # flatten descriptors into a single vector
    return torch.tensor(desc, dtype=torch.float64)

nxG = convert_to_nx(gG) # gG is the GroupGraph built in the first example
max_ports = max(len(n.hubs) for n in node_defs) # node_defs as defined in the previous example
data = nxG.to_PyG_Data(node_descriptor_generator, max_ports)
# Here we use RDKit to estimate logP as the target, but it could be obtained another way
mol = rdkit.Chem.MolFromSmiles(gG.to_smiles())
data.y = torch.tensor([Descriptors.MolLogP(mol)])
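The descriptor filter is easy to get subtly wrong, because `math.isnan` raises a `TypeError` when handed `None`, so the `None` test has to come first. The standalone sketch below shows one way to write it; `clean_descriptors` is a hypothetical helper and the descriptor values are made up for illustration.

```python
from math import isnan


def clean_descriptors(desc):
    """Keep only values that are neither None nor NaN, in dict order."""
    return [v for v in desc.values() if v is not None and not isnan(v)]


# Made-up values standing in for an RDKit descriptor dictionary
raw = {"MolWt": 17.03, "BadValue": float("nan"), "Missing": None, "MolLogP": 0.16}
print(clean_descriptors(raw))  # [17.03, 0.16]
```

Note that dropping entries can leave different groups with feature vectors of different lengths. If a fixed-width feature matrix is needed, substituting a placeholder value instead of dropping may be the safer choice.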
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the project
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Test the modified project (pytest)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a pull request
Distributed under the MIT License. See LICENSE.txt for more information.
Kieran Nehil-Puleo
Cal Craven
Project Link: https://github.com/kierannp/Grouper