I'm using the dataset library to attempt to back up a postgres database into an sqlite file. The code I'm running goes as follows:
local_db = "sqlite:///backup_file.db"
with dataset.connect(local_db) as save_to:
with dataset.connect(postgres_db) as download_from:
for row in download_from['outlook']:
save_to['outlook'].insert(row)
If I print one row of the table, it looks like this:
OrderedDict([
('id', 4400),
('first_sighting', '2014-08-31'),
('route', None),
('sighted_by', None),
('date', None)
])
However, when I get to the line save_to['outlook'].insert(row)
I get an error with the following stack trace:
Traceback (most recent call last):
File "/home/anton/Development/Python/TTC/backup_db.py", line 25, in <module>
save_to['outlook'].insert(dict(row))
File "/home/anton/.virtualenvs/flexity/lib/python3.6/site-packages/dataset/table.py", line 79, in insert
row = self._sync_columns(row, ensure, types=types)
File "/home/anton/.virtualenvs/flexity/lib/python3.6/site-packages/dataset/table.py", line 278, in _sync_columns
self._sync_table(sync_columns)
File "/home/anton/.virtualenvs/flexity/lib/python3.6/site-packages/dataset/table.py", line 245, in _sync_table
self._table.append_column(column)
File "/home/anton/.virtualenvs/flexity/lib/python3.6/site-packages/sqlalchemy/sql/schema.py", line 681, in append_column
column._set_parent_with_dispatch(self)
File "/home/anton/.virtualenvs/flexity/lib/python3.6/site-packages/sqlalchemy/sql/base.py", line 431, in _set_parent_with_dispatch
self._set_parent(parent)
File "/home/anton/.virtualenvs/flexity/lib/python3.6/site-packages/sqlalchemy/sql/schema.py", line 1344, in _set_parent
self.key, table.fullname))
sqlalchemy.exc.ArgumentError: Trying to redefine primary-key column 'id' as a non-primary-key column on table 'outlook'
Any ideas as to what I'm doing wrong? I've tried this in python 2.7.14 and 3.6.3
I figured it out! So, the trick is that by default the database
library makes tables with an auto-incrementing integer primary-key. But, my data already has an 'id' column. In order to avoid this problem, I should define my table before I try to add lines to it, and define it with no primary key as follows:
with dataset.connect(local_db) as save_to:
with dataset.connect(postgres_db) as download_from:
table_to_save_to = save_to.create_table('outlook', primary_id=False)
for row in download_from['outlook']:
table_to_save_to.insert(row)
By doing .create_table(table_name, primary_key=False)
I can make sure that i can insert my own id values into the table.
I found this solution by reading the docs.