Tags: python, pandas, dataframe

pandas DataFrame to_sql assigns nvarchar(max) by default


So I have this snippet. When the code runs and inserts into a new table, it declares all columns as nvarchar(max). Clearly, this is undesirable behavior. My question is: is there a way to define a length here, so that the columns aren't max?

From my research, I know I have two options:

  1. Use a dict to pre-define all columns with appropriate data types (sketched after this list).
  2. Maintain the staging table and append instead of replacing, which of course requires a truncate first (also sketched below).
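
For reference, option 1 amounts to passing dtype a dict that maps column names to SQLAlchemy types. A minimal sketch, with hypothetical column names and types:

    from sqlalchemy.types import NVARCHAR, Integer, DateTime

    # Hypothetical column-to-type mapping; substitute the real staging columns.
    dtype_map = {
        'customer_name': NVARCHAR(100),
        'order_id': Integer(),
        'loaded_at': DateTime(),
    }

    data.to_sql(
        name=table_name,
        schema='stage',
        con=con,
        if_exists='replace',
        index=False,
        dtype=dtype_map,
    )

Option 2 would keep the existing staging table's schema and reload it, assuming con is a SQLAlchemy engine; the raw TRUNCATE statement here is illustrative only:

    from sqlalchemy import text

    # Empty the staging table, then append into its already-defined schema.
    with con.begin() as conn:
        conn.execute(text(f'TRUNCATE TABLE stage.{table_name}'))

    data.to_sql(
        name=table_name,
        schema='stage',
        con=con,
        if_exists='append',
        index=False,
    )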

Is there a way to do something like this: dtype=NVARCHAR(100)? Or is there some other option I haven't thought of yet?

data.to_sql(
    name=table_name,
    schema='stage',
    con=con,
    if_exists='replace',
    index=False,
    dtype=NVARCHAR,
)

Solution

  • You can pass the dtype and define the length like this:

    import sqlalchemy

    data.to_sql(
        name=table_name,
        schema='stage',
        con=con,
        if_exists='replace',
        index=False,
        dtype=sqlalchemy.types.NVARCHAR(length=100),
    )
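
    A single type like this is applied to every column. If different columns need different lengths, dtype also accepts a dict mapping column names to types, e.g. dtype={'name': sqlalchemy.types.NVARCHAR(100), 'notes': sqlalchemy.types.NVARCHAR(500)} (column names hypothetical).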