python rdkit

ArgumentError: Python argument types in rdkit.Chem.rdMolDescriptors.GetMorganFingerprintAsBitVect(NoneType, int) did not match C++ signature


I'm using RDKit in Python to convert SMILES strings to ECFP4 fingerprints with the code shown below. It raises an error, even though my code appears to match the answer to a related question I checked. Why am I still getting the error?

Is there an alternative way to code this?

bits = 1024
PandasTools.AddMoleculeColumnToFrame(data, smilesCol='SMILES')
data_ECFP4 = [AllChem.GetMorganFingerprintAsBitVect(x, 3, nBits = bits) for x in data['ROMol']]
data_ecfp4_lists = [list(l) for l in data_ECFP4]
ecfp4_name = [f'B{i+1}' for i in range(1024)]
data_ecfp4_df = pd.DataFrame(data_ecfp4_lists, index = data.TARGET, columns = ecfp4_name)

The error I got is:

ArgumentError: Python argument types in rdkit.Chem.rdMolDescriptors.GetMorganFingerprintAsBitVect(NoneType, int) did not match C++ signature: GetMorganFingerprintAsBitVect(class RDKit::ROMol mol, int radius, unsigned int nBits=2048, class boost::python::api::object invariants=[], class boost::python::api::object fromAtoms=[], bool useChirality=False, bool useBondTypes=True, bool useFeatures=False, class boost::python::api::object bitInfo=None, bool includeRedundantEnvironments=False)


Solution

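  • import pandas as pd
    from rdkit import Chem
    from rdkit.Chem import PandasTools

    df = pd.read_csv('file.csv')
    # SMILES that RDKit cannot parse end up as None in the 'ROMol' column;
    # passing such a None to the fingerprint call raises the ArgumentError above.
    PandasTools.AddMoleculeColumnToFrame(df, smilesCol='SMILES')
    # Keep only the rows whose molecule was built successfully.
    df = df[~df['ROMol'].isnull()]
    df.to_csv('new_file.csv')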
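
The NoneType in the error message means that at least one SMILES string could not be parsed, so a None reached GetMorganFingerprintAsBitVect. If you also want to see which inputs were dropped, here is a minimal sketch that works on a plain list instead of a DataFrame. The helper names split_parsable and ecfp_bits are made up for illustration, and the parser is passed in as an argument (an assumption of this sketch) so the filtering step does not itself depend on RDKit. Note also that ECFP4 normally corresponds to a Morgan radius of 2, while the code in the question passes 3, so the sketch defaults to 2.

```python
from typing import Callable, List, Optional, Tuple


def split_parsable(
    smiles_list: List[object],
    parse: Callable[[str], Optional[object]],
) -> Tuple[List[Tuple[str, object]], List[object]]:
    """Return ([(smiles, mol), ...], [entries that could not be parsed]).

    `parse` is expected to return None for input it cannot handle, which is
    how RDKit's Chem.MolFromSmiles reports an invalid SMILES string.
    """
    parsed, failed = [], []
    for smi in smiles_list:
        # Non-string entries (e.g. NaN from an empty CSV cell) are treated as failures.
        mol = parse(smi) if isinstance(smi, str) else None
        if mol is None:
            failed.append(smi)
        else:
            parsed.append((smi, mol))
    return parsed, failed


def ecfp_bits(mol, radius: int = 2, n_bits: int = 1024) -> List[int]:
    """Morgan bit-vector fingerprint as a list of 0/1 ints (radius 2 ~ ECFP4)."""
    # Imported here so that split_parsable can be used without RDKit installed.
    from rdkit.Chem import AllChem

    return list(AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits))


# Usage with RDKit installed (the input list is made up):
#   from rdkit import Chem
#   parsed, failed = split_parsable(["CCO", "not_a_smiles"], Chem.MolFromSmiles)
#   print("skipped:", failed)
#   fingerprints = [ecfp_bits(mol) for _, mol in parsed]
```

Reporting the skipped entries, rather than silently dropping them, makes it easier to check whether the bad rows are typos in the SMILES column or empty cells.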