Tags: python-3.x, postgresql, database-connection, pyodbc, pg8000

Need an overview of database connectivity using Python


I have been working with Python and PostgreSQL for over a year. I can connect to and work with Postgres databases by blindly using various libraries. But whenever I change platform (most recently from a macOS laptop to a remote Ubuntu server), I spend a day or so trying to get the libraries working. For example, I was using 'pyodbc' in some modules, but when I migrated the code to the server I had to switch to 'pg8000' because the modules, as they were, kept throwing errors.

Can someone explain, or point me to a link explaining, how Python connects to databases? For example, why do I need a Microsoft ODBC driver for 'pyodbc' to connect to Azure SQL or PostgreSQL, while 'pg8000' seems to need nothing at all to connect to PostgreSQL? When I move to an Ubuntu environment and install ODBC drivers, they show up on the root filesystem under /etc and /opt (for the MS ODBC driver) but also in my Conda environment (/anaconda3/envs/), and I don't know which is the correct location for 'odbc.ini'.

Like I say, I can get things working, but I really have no understanding of why they work, which means I waste time experimenting every time the environment changes. I've not yet found an explanation online that covers more than a very specific circumstance, e.g. 'here's how to install our driver ...'. Any help would be appreciated.

Final Update:

Following the responses, especially @Thompson's, the diagram below seems to be the final interpretation, and I now have a better idea of where to look for answers. For the record, pyodbc, SQLAlchemy and pg8000 have been my tools of choice, with no problems except as described in the question.

[Diagram: image not available]


Solution

  • pyodbc is not actually a driver and doesn't contain one; it's a 'module for ODBC databases', so it is more of an interface from Python to an ODBC driver, which in turn talks to the database. That's why you need a separate, actual driver for it to connect through. Azure SQL, being owned by Microsoft, reasonably requires Microsoft's ODBC driver, while Postgres requires a Postgres ODBC driver, and so on.

    The ODBC driver manager is platform-specific (for example, unixODBC on Linux), while the ODBC driver is database-specific. That explains why you need to change drivers when you change platforms or databases.

    As Adrian noted, you don't need ODBC drivers for Postgres; it is more common to use native Postgres drivers for Python (e.g. https://wiki.postgresql.org/wiki/Python).

    psycopg2 is an actual PostgreSQL driver. It serves as a client from Python to Postgres, with no ODBC layer in between. That's why you don't need to install a separate ODBC driver when you use it. I haven't used pg8000, but based on this list it's a driver too, so you won't need anything else.

    EDITED TO ADD: Think of a database as a 'black box' you need to activate, and of its drivers as electrical sockets. An ODBC driver is one specific type of socket (ODBC is a standard developed by Microsoft). If you are using an ODBC plug from Python (like pyodbc), you need to make sure an ODBC socket for that database is installed and activated. But your database can have other sockets too, such as the Python DB-API drivers available for Postgres. In that case you use a different, direct DB-API connector, like psycopg2.
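
To make the two routes concrete, here is a minimal sketch of both side by side. It is an illustration under stated assumptions, not a definitive setup: the helper names are made up for this example, the driver name "PostgreSQL Unicode" is an assumption (it is the name the psqlODBC package commonly registers on Ubuntu, but check what your own system reports), and `pg8000.dbapi` assumes a reasonably recent pg8000 release. The third-party imports are placed inside the functions so the file loads even where only one of the libraries is installed.

```python
"""Two routes from Python to PostgreSQL: through ODBC, or through a native driver."""


def build_odbc_connection_string(driver, server, database, user, password, port=5432):
    """Assemble an ODBC connection string (hypothetical helper for this sketch).

    `driver` must match a name the ODBC driver manager knows about,
    e.g. a section in odbcinst.ini on Linux. Values containing ';' or
    braces would need extra quoting, which this sketch does not handle.
    """
    parts = {
        "DRIVER": "{" + driver + "}",
        "SERVER": server,
        "PORT": str(port),
        "DATABASE": database,
        "UID": user,
        "PWD": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


def list_odbc_drivers():
    """Return the driver names visible to the driver manager pyodbc is linked against."""
    import pyodbc  # requires a system driver manager such as unixODBC

    return pyodbc.drivers()


def connect_via_odbc(connection_string):
    """Route 1: Python -> pyodbc -> driver manager -> ODBC driver -> database."""
    import pyodbc

    return pyodbc.connect(connection_string)


def connect_native(host, database, user, password, port=5432):
    """Route 2: Python -> pg8000 -> database, with no ODBC components involved."""
    import pg8000.dbapi  # pure-Python driver that speaks the PostgreSQL protocol itself

    return pg8000.dbapi.connect(
        user=user, password=password, host=host, port=port, database=database
    )
```

With placeholder values, `build_odbc_connection_string("PostgreSQL Unicode", "db.example.com", "mydb", "alice", "secret")` gives the string you would pass to `connect_via_odbc`. If the driver named there does not appear in `list_odbc_drivers()`, the ODBC route fails before it ever reaches the database, which is the kind of platform-dependent breakage described in the question. On systems using unixODBC, running `odbcinst -j` in the same environment should print which odbcinst.ini and odbc.ini files that particular installation reads; a Conda environment may ship its own copy of unixODBC that looks in different places than the system one.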