I' writing this message to ask help in order to find a solution for this python code. I'm trying to generate a visualization of FAA PMA Parts in comparison with my company's parts database. I've already done it using Power Bi, SQL Query and Excel files, however it is not an online visualization. Every month i have to download FAA PMA Parts from URL (https://rgl.faa.gov/Regulatory_and_Guidance_Library/rgPMA.nsf/f6a80b2e3c91131686257bcf0063a042/bc1b3424cc899bee86257beb006237ae/$FILE/PMA.zip) in order to update this visualization.
That being said, i've tried to do these python codes in order to read Ms Access file from the above mentioned Url and convert it to a Pandas DataFrame.
However i didn't succeed in this task for the following reasons:
First i've tried to use sqlalchemy, but the following errors have shown up.
from zipfile import ZipFile
from io import BytesIO
import urllib
from urllib.request import urlopen
from sqlalchemy import create_engine
import pandas as pd
r = urlopen("https://rgl.faa.gov/Regulatory_and_Guidance_Library/rgPMA.nsf/f6a80b2e3c91131686257bcf0063a042/bc1b3424cc899bee86257beb006237ae/$FILE/PMA.zip").read()
file = ZipFile(BytesIO(r))
pma_accdb = file.open("PMA.accdb")
connection_string = (r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
r'DBQ=pma_accdb;'
r'ExtendedAnsiSQL=1;'
)
connection_url = f"access+pyodbc:///?odbc_connect={urllib.parse.quote_plus(connection_string)}"
engine = create_engine(connection_url)
sql_df = pd.read_sql("SELECT * FROM Parts", con=engine)
print(sql_df.head())
Error:Traceback (most recent call last): File "C:/Users/thiago.ribeiro/.PyCharmCE2019.1/config/scratches/scratch.py", line 17, in engine = create_engine(connection_url) File "C:\Users\thiago.ribeiro\AppData\Local\Programs\Python\Python38\lib\site-packages\sqlalchemy\engine_init_.py", line 500, in create_engine return strategy.create(*args, **kwargs) File "C:\Users\thiago.ribeiro\AppData\Local\Programs\Python\Python38\lib\site-packages\sqlalchemy\engine\strategies.py", line 61, in create entrypoint = u._get_entrypoint() File "C:\Users\thiago.ribeiro\AppData\Local\Programs\Python\Python38\lib\site-packages\sqlalchemy\engine\url.py", line 172, in _get_entrypoint cls = registry.load(name) File "C:\Users\thiago.ribeiro\AppData\Local\Programs\Python\Python38\lib\site-packages\sqlalchemy\util\langhelpers.py", line 267, in load raise exc.NoSuchModuleError( sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:access.pyodbc
After that i've tried to use pyodbc, but the following errors have also shown up.
from zipfile import ZipFile
from io import BytesIO
from urllib.request import urlopen
import pyodbc
import pandas as pd
r = urlopen("https://rgl.faa.gov/Regulatory_and_Guidance_Library/rgPMA.nsf/f6a80b2e3c91131686257bcf0063a042/bc1b3424cc899bee86257beb006237ae/$FILE/PMA.zip").read()
file = ZipFile(BytesIO(r))
pma_accdb = file.open("PMA.accdb")
driver = '{Microsoft Access Driver (*.mdb, *.accdb)}'
filepath = pma_accdb
myDataSources = pyodbc.dataSources()
access_driver = myDataSources['MS Access Database']
cnxn = pyodbc.connect(driver=access_driver, dbq=filepath, autocommit=True)
crsr = cnxn.cursor()
crsr.execute("SELECT * FROM Parts")
sql_df = pd.read_sql("SELECT * FROM Parts", con=engine)
print(sql_df.head())
Error: Traceback (most recent call last): line 17, in cnxn = pyodbc.connect(driver=access_driver, dbq=filepath, autocommit=True) pyodbc.Error: ('HY000', '[HY000] [Microsoft][Driver ODBC Microsoft Access] general error unable to open registry key Temporary (volatile) Ace DSN for process 0x2efc Thread 0x4a20 DBC 0xd39ea788
The above being said, could you please help me understand where my code is wrong and how can i improve it in order to read MS Access from url and convert it to a pandas data frame?
I've got a help from a friend. this was de solution:
from zipfile import ZipFile
from io import BytesIO
import urllib
from urllib.request import urlopen
from sqlalchemy import create_engine
import pandas as pd
resp = urlopen("https://rgl.faa.gov/Regulatory_and_Guidance_Library/rgPMA.nsf/f6a80b2e3c91131686257bcf0063a042/bc1b3424cc899bee86257beb006237ae/$FILE/PMA.zip").read()
zipfile = ZipFile(BytesIO(resp))
connection_string = (r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
r'DBQ=' + zipfile.extract('PMA.accdb') + ';'
r'ExtendedAnsiSQL=1;'
)
connection_url = f"access+pyodbc:///?odbc_connect={urllib.parse.quote_plus(connection_string)}"
engine = create_engine(connection_url)
sql_df = pd.read_sql("SELECT * FROM Parts", con=engine)