Skip to content

Failed linear polymerization with Agarose monomer and large, flexible molecules take too long #5

@mattiafelice-palermo

Description

@mattiafelice-palermo

Dear Alejandro,

thank you very much for sharing PySoftK, I found it very useful for easily building polymers.

I have run into some difficulties with some molecules unfortunately, such as Agarose:

from pysoftk.linear_polymer.super_monomer import *
from pysoftk.linear_polymer.linear_polymer import *
from pysoftk.format_printers.format_mol import *

monomer=Chem.MolFromSmiles('OC[C@@H]1O[C@@H](O[C@@H]2[C@H]3CO[C@@H]2[C@H](O)[C@H](Br)O3)[C@@H](O)[C@H](O)[C@@H]1OBr')
AllChem.EmbedMolecule(monomer)
monomer = Chem.AddHs(monomer)
n_monomers:int = 2
polymer=Lp(monomer,"Br",n_monomers,shift=1.0).linear_polymer()
polymer.write("mol", f"/home/mpalermo/tmp/PySoftK/polymer_{n_monomers}.mol") # linear_polymer() outputs an openbabel object

When trying to make a polymer starting from the monomer, you get a BadConformerID error from RDKit. Upon further inspection, I have found this is due to the call to mol.GetConformer() in _copy_mol() from module linear_polymer/linear_polymer.py. I suspect that when no conformers are embedded in the mol object, GetConformer() tries to generate them - I'm not sure using which RDKit method - and fails.

As a workaround, I added an explicit call to generate 1 conformer:

    def _copy_mol(self):
       """Function to replicate super_monomers.

       Parameters
       -----------

       mol : rdkit.Chem.rdchem.Mol
          RDKit Mol object

       Return
       --------

       fragments : `list`
          List of RDKit Mol objects
       """    

       mol=self.mol
       cids = AllChem.EmbedMultipleConfs(mol,1) # <-- HERE - explicit call to generate conforms
       CanonicalizeConformer(mol.GetConformer())
       
       n_copies=self.n_copies
       
       fragments=[mol for _ in range(int(n_copies))]

       return fragments

With this modification, now PySoftK works also with Agarose. Nevertheless, while correctly functioning, it would still take hours to generate the polymer. I identified the source of this in the WeightedRotorSearch() call in function rotor_opt() from module tools/utils_ob.py. I think it may take a very long time for large molecules with lots of rotatable bonds. Commenting this out and adding a call to RDKit make3D() function before running OpenBabel Relaxation and Rotor optimization seems to be returning perfectly reasonable geometries anyway.

Is it ok if I send a pull request so that you can review these modifications and evaluate whether to integrate them into PySoftK?

Thank you!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions