I have created a package allowing a user to write data to either a sqlite or Postgres db. I created a module for connecting to the db and a separate module that provides the writing functionality. In the latter module the write is a straightforward pandas internal function call:
indata.to_sql('pay_' + table, con, if_exists='append', index=False)
Writing to an sqlite db (with connection using 'sqlite3') is successful however when writing to a Postgres db I get the following error:
Traceback (most recent call last):
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1778, in execute
ps = cache['ps'][key]
KeyError: ("SELECT name FROM sqlite_master WHERE type='table' AND name=?;", ((705, 0, <function Connection.__init__.<locals>.text_out at 0x7fc3205fb510>),))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
cur.execute(*args)
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 861, in execute
self._c.execute(self, operation, args)
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1837, in execute
self.handle_messages(cursor)
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1976, in handle_messages
raise self.error
pg8000.core.ProgrammingError: {'S': 'ERROR', 'V': 'ERROR', 'C': '42P01', 'M': 'relation "sqlite_master" does not exist', 'P': '18', 'F': 'parse_relation.c', 'L': '1180', 'R': 'parserOpenTable'}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py", line 1610, in execute
raise_with_traceback(ex)
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/compat/__init__.py", line 46, in raise_with_traceback
raise exc.with_traceback(traceback)
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
cur.execute(*args)
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 861, in execute
self._c.execute(self, operation, args)
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1837, in execute
self.handle_messages(cursor)
File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1976, in handle_messages
raise self.error
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': {'S': 'ERROR', 'V': 'ERROR', 'C': '42P01', 'M': 'relation "sqlite_master" does not exist', 'P': '18', 'F': 'parse_relation.c', 'L': '1180', 'R': 'parserOpenTable'}
I traced the error to the following file:
/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py
What seems to be happening is that the '.to_sql' function is configured to try to write to a db named 'sqlite_master' at this point in the 'sql.py' file:
def has_table(self, name, schema=None):
# TODO(wesm): unused?
# escape = _get_valid_sqlite_name
# esc_name = escape(name)
wld = "?"
query = (
"SELECT name FROM sqlite_master " "WHERE type='table' AND name={wld};"
).format(wld=wld)
return len(self.execute(query, [name]).fetchall()) > 0
Looking more closely at the errors you can see that the connection is correctly made to the db but that pandas is looking for an sqlite db:
I know that the db name was one I used several half a year ago when I first started working with sqlite so I'm thinking that somewhere I set a configuration value. So:
Per pandas.DataFrame.to_sql
documentation:
con : sqlalchemy.engine.Engine or sqlite3.Connection
Using SQLAlchemy makes it possible to use any DB supported by that library. Legacy support is provided for sqlite3.Connection objects.
This means only SQLite allows a raw connection for the to_sql
method. All other RDBMs including Postgres must use an SQLAlchemy connection for this method to create structures and append data. Do note: read_sql
does not require SQLAlchemy since it does not make persistent changes.
Therefore, this raw DB-API connection cannot work:
import psycopg2
con = psycopg2.connect(host="localhost", port=5432, dbname="mydb", user="myuser", password="mypwd")
indata.to_sql('pay_' + table, con, if_exists='append', index=False)
However, this SQLAlchemy connection can work:
from sqlalchemy import create_engine
engine = create_engine('postgresql+psycopg2://myuser:mypwd@localhost:5432/mydb')
indata.to_sql('pay_' + table, engine, if_exists='append', index=False)
Better use SQLAlchemy for both databases, here for SQLite:
engine = create_engine("sqlite:///path/to/mydb.db")