postgresql scikit-learn persistence

Save a scikit-learn classifier to a PostgreSQL database


I know that scikit-learn models can be persisted to files using joblib (as described here: http://scikit-learn.org/stable/modules/model_persistence.html). However, since my machine learning procedure runs inside a PostgreSQL plpythonu function, I would rather persist the model inside the PostgreSQL database.

What is the recommended, most convenient way to store a scikit-learn model inside a PostgreSQL database?


Solution

  • Here is some sample Python code for storing a trained model in a Postgres table. Note that you first need to create a table with a column of type "bytea" to hold the pickled sklearn model in binary format.

    import pickle

    import psycopg2
    from sklearn import svm

    # Connect to Postgres (user, password, host, port and database are
    # your own connection settings; they must be passed as keywords)
    connection = psycopg2.connect(
        user=user, password=password, host=host, port=port, dbname=database
    )
    cur = connection.cursor()

    model = svm.OneClassSVM()
    model.fit(features)         # features are some training data
    data = pickle.dumps(model)  # first we should pickle the model

    # Assuming you have a Postgres table with columns epoch and file (bytea).
    # One %s placeholder is needed per inserted value.
    sql = "INSERT INTO sampletable (epoch, file) VALUES (%s, %s)"
    cur.execute(sql, (epoch, psycopg2.Binary(data)))
    connection.commit()
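To use the stored model later, for example from the plpythonu function, the reverse step is to select the bytea value and unpickle it. The following is only a minimal sketch: the helper name `load_model` is hypothetical, and it assumes the same `sampletable` with `epoch` and `file` columns as above and an already open cursor. On Python 3, psycopg2 hands bytea values back as `memoryview` objects, so the value is converted with `bytes()` before unpickling.

```python
import pickle


def load_model(cur, epoch):
    """Return the unpickled model stored for `epoch` (hypothetical helper).

    `cur` is an open DB-API cursor, e.g. one obtained from
    psycopg2's connection.cursor().
    """
    # Assumes the table and column names used in the INSERT above.
    cur.execute("SELECT file FROM sampletable WHERE epoch = %s", (epoch,))
    row = cur.fetchone()
    if row is None:
        raise LookupError("no model stored for epoch %r" % (epoch,))
    # bytea may arrive as a memoryview; bytes() also accepts plain bytes.
    return pickle.loads(bytes(row[0]))
```

With the connection from the answer this would be called as `model = load_model(cur, epoch)`, after which `model.predict(...)` works as usual. Only unpickle data you trust, since unpickling can execute arbitrary code, and load the model with the same scikit-learn version that was used to save it where possible.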