I am trying to make a function to form ester bonds between two molecules, and I am having a problem which is driving me absolutely nuts.
I have a function which identifies the middle carbon atom of a carboxyl site of a molecule, and it works correctly.
Then I use Chem.RemoveAtom() to remove that atom so that I can add the ester bond there later.
However, it reindexes the atoms in the molecule in a haphazard way.
I need to find a way to either a) prevent RDKit from reindexing my molecule until I can add the bond or b) find some way of mapping the old indices to the new indices.
Here is a working example of the phenomenon:
import rdkit.Chem as Chem
from rdkit.Chem import Draw
TMA = Chem.MolFromSmiles('OC(C1=CC(C(O)=O)=CC(C(O)=O)=C1)=O')
TPA = Chem.MolFromSmiles('O=C(O)C1=CC=C(C(O)=O)C=C1')
mol1 = TMA
carboxyl_idx1 = 1
editable_mol1 = Chem.RWMol(mol1)
editable_mol1.RemoveAtom(carboxyl_idx1)
mol2 = TMA
carboxyl_idx2 = 5
editable_mol2 = Chem.RWMol(mol2)
editable_mol2.RemoveAtom(carboxyl_idx2)
mol3 = TPA
carboxyl_idx3 = 7
editable_mol3 = Chem.RWMol(mol3)
editable_mol3.RemoveAtom(carboxyl_idx3)
mols = [mol1, editable_mol1, mol2, editable_mol2, mol3, editable_mol3]
for mol in mols:
for atom in mol.GetAtoms():
atom.SetProp('atomLabel', str(atom.GetIdx()))
img = Draw.MolsToGridImage(mols, molsPerRow=2, subImgSize=(300, 300))
print(carboxyl_idx1, carboxyl_idx2, carboxyl_idx3)
display(img)
The new bond needs to be added to the site in the benzene ring where the carboxyl group was.
So in the first example, to carboxyl_idx, in the second, to carboxyl_idx-1, and in the third, to carboxyl_idx+1. It's always one of these three, but I need a reliable way of figuring out which index in editable_moli used to be the carboxyl_idx in moli before RemoveAtom reindexed everything.
Please help...
This is not an easy one and I think I figured it out for you. Long story short: using 'molAtomMapNumber' instead of 'atomLabel'
from rdkit import Chem
TMA = Chem.MolFromSmiles('OC(C1=CC(C(O)=O)=CC(C(O)=O)=C1)=O')
TPA = Chem.MolFromSmiles('O=C(O)C1=CC=C(C(O)=O)C=C1')
for i, atom in enumerate(TMA.GetAtoms()):
# For each atom, set the property "molAtomMapNumber" to a custom number, let's say, the index of the atom in the molecule
atom.SetProp("molAtomMapNumber", str(atom.GetIdx()+1))
TMA
for i, atom in enumerate(TPA.GetAtoms()):
# For each atom, set the property "molAtomMapNumber" to a custom number, let's say, the index of the atom in the molecule
atom.SetProp("molAtomMapNumber", str(atom.GetIdx()+1))
TPA
mol1 = TMA
carboxyl_idx1 = 1
editable_mol1 = Chem.RWMol(mol1)
editable_mol1.RemoveAtom(carboxyl_idx1)
editable_mol1
mol2 = TMA
carboxyl_idx2 = 5
editable_mol2 = Chem.RWMol(mol2)
editable_mol2.RemoveAtom(carboxyl_idx2)
editable_mol2
mol3 = TPA
carboxyl_idx3 = 1
editable_mol3 = Chem.RWMol(mol3)
editable_mol3.RemoveAtom(carboxyl_idx3)
editable_mol3
mols=[mol1, editable_mol1, mol2, editable_mol2, mol3, editable_mol3]
img = Draw.MolsToGridImage(mols, molsPerRow=2, subImgSize=(300, 300))
img
Please let me know if there is any questions. Cheers!