I have alkene molecules of formula C9H17B. How can I separate these molecules into three classes: one that has C-B-H2, one that has C2-B-H, and one that has C3-B? I've tried working with SMILES strings and with mol objects, but my approaches aren't working.
To find specific substructures, use SMARTS.
https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html
If I understand correctly, these are the three types of boron you are looking for.
from rdkit import Chem
from rdkit.Chem import Draw
smiles = ['CCB', 'CCBC', 'CCB(C)(C)']
mols = [Chem.MolFromSmiles(s) for s in smiles]
Draw.MolsToGridImage(mols)
Write SMARTS for boron with three connections (BX3) and the number of hydrogens (H2, H1, H0).
smarts = ['[BX3;H2]', '[BX3;H1]', '[BX3;H0]']
patts = [Chem.MolFromSmarts(s) for s in smarts]
Now you can check each molecule for a substructure match.
for p in patts:
    for m in mols:
        print(m.HasSubstructMatch(p))
    print()
True
False
False

False
True
False

False
False
True
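To actually sort molecules into the three classes, the same patterns can be wrapped in a small helper. This is only a sketch: the function name `classify_boron`, the class labels, and the example SMILES are made up for illustration (they are not your C9H17B molecules), and it assumes each molecule contains a single boron atom. Note that these SMARTS only count connections and hydrogens on boron; they do not check that the non-hydrogen neighbours are carbon.

```python
from rdkit import Chem

# One SMARTS pattern per boron class (labels are illustrative)
BORON_CLASSES = {
    "C-BH2": Chem.MolFromSmarts("[BX3;H2]"),
    "C2-BH": Chem.MolFromSmarts("[BX3;H1]"),
    "C3-B": Chem.MolFromSmarts("[BX3;H0]"),
}

def classify_boron(smiles):
    """Return the label of the first matching boron class, or None."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # SMILES could not be parsed
        return None
    for label, patt in BORON_CLASSES.items():
        if mol.HasSubstructMatch(patt):
            return label
    return None  # no three-connected boron found

# Hypothetical example molecules, one per class
examples = ["C=CCB", "C=CCBC", "C=CCB(C)C"]

# Group the SMILES by class label
groups = {}
for smi in examples:
    groups.setdefault(classify_boron(smi), []).append(smi)

for label, members in groups.items():
    print(label, members)
```

If a molecule could contain more than one boron, `GetSubstructMatches` on each pattern would give the matching atom indices instead of a single yes/no answer.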