python-3.xpandasschema

How to specify flavor of sql in pd.io.sql.get_schema?


I am trying to use pd.io.sql.get_schema to generate a postgres schema from a dataframe.

There is no documentation for pd.io.sql.get_schema but from this(https://github.com/pandas-dev/pandas/issues/9960) link it says that I can specify a flavor of sql.

However this feature seems to be deprecated instead I can specify a engine like postgresql (Generate SQL statements from a Pandas Dataframe). How do I do this?

Here is my code so far:

pd.io.sql.get_schema(df.reset_index(), 'data')

Open to all suggestions for generating schema.


Solution

  • I believe you create a connection to a postgreSQL database using SQLAlchemy and then pass that connection to the con kwarg. For example:

    import numpy as np
    import pandas as pd
    import sqlalchemy
    
    dates = pd.date_range('20130101',periods=6)
    df = pd.DataFrame(np.random.randn(6,4),index=dates,columns=list('ABCD'))
    
    url = 'postgresql://USER:PASSWORD@HOST:PORT/DATABASE'
    con = sqlalchemy.create_engine(url, client_encoding='utf8')
    print(pd.io.sql.get_schema(df.reset_index(), 'data', con=con))
    CREATE TABLE data (
            index TIMESTAMP WITHOUT TIME ZONE,
            "A" FLOAT(53),
            "B" FLOAT(53),
            "C" FLOAT(53),
            "D" FLOAT(53)
    )