pandas rdkit

RDKit - Export pandas data frame with mol image


Is it possible to export a pandas DataFrame, including molecule images, directly to an Excel file?

Thanks in advance.


Solution

  • RDKit's PandasTools module provides the function SaveXlsxFromFrame.

    http://www.rdkit.org/Python_Docs/rdkit.Chem.PandasTools-module.html#SaveXlsxFromFrame

    XlsxWriter must be installed.

    import pandas as pd
    from rdkit import Chem
    from rdkit.Chem import PandasTools
    
    smiles = ['c1ccccc1', 'c1ccccc1O', 'c1cc(O)ccc1O']
    df = pd.DataFrame({'ID':['Benzene', 'Phenol', 'Hydroquinone'], 'SMILES':smiles})
    
    # Build one RDKit molecule object per SMILES string
    df['Mol Image'] = [Chem.MolFromSmiles(s) for s in df['SMILES']]
    
    # The molecules in the molCol column are rendered as images in the .xlsx file
    PandasTools.SaveXlsxFromFrame(df, 'test.xlsx', molCol='Mol Image')
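
As a possible extension of the answer above, here is a minimal sketch that checks for the required packages before exporting and skips SMILES that RDKit cannot parse. The helper names `missing_packages` and `save_frame_with_images`, the default column names, and the 200x200 image size are assumptions made for illustration; they are not part of RDKit.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


def save_frame_with_images(df, path, smiles_col="SMILES", mol_col="Mol Image"):
    """Hypothetical helper: build molecules from SMILES and write an .xlsx file."""
    missing = missing_packages(["rdkit", "xlsxwriter"])
    if missing:
        raise ImportError("Install before exporting: " + ", ".join(missing))

    # Imported lazily so the dependency check above runs first.
    from rdkit import Chem
    from rdkit.Chem import PandasTools

    out = df.copy()
    out[mol_col] = [Chem.MolFromSmiles(s) for s in out[smiles_col]]
    # MolFromSmiles returns None for SMILES it cannot parse; drop those rows
    # so that only valid molecules are passed to the writer.
    out = out[out[mol_col].notna()]
    # size is the (width, height) of each rendered molecule image in pixels.
    PandasTools.SaveXlsxFromFrame(out, path, molCol=mol_col, size=(200, 200))
    return out
```

With the `df` from the answer, calling `save_frame_with_images(df, 'test.xlsx')` should produce the same kind of file, with smaller images and without rows whose SMILES failed to parse.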