I have excel file which has three sheets, using duckdb how to read all sheets into one dataframe?
Normally i'm using spatial extension to read excel files with one sheet and it works perfect, here my code for reading excel.
import duckdb
import polars as pl
# Create a connection to DuckDB
conn = duckdb.connect()
# Install and load the spatial extension
conn.execute("INSTALL spatial;")
conn.execute("LOAD spatial;")
result = conn.execute("""
SELECT * FROM st_read('AccountNumber.xlsx',open_options = ['HEADERS=FORCE']);
""").pl()
result
If you know sheet names you can pass it as layer
parameter and use union all
to combine the sheets (I've added sheet name to result for clarity):
stmt = [f"""
select '{x}' as Sheet, *
from st_read(
'AccountNumber.xlsx',
open_options = ['HEADERS=FORCE'],
layer='{x}'
)""" for x in ["Sheet1","Sheet2"]]
stmt = " union all ".join(stmt)
duckdb.sql(stmt).pl()
As you use it as a Polars dataframe you can also read it with pl.read_excel()
:
pl.concat([
d.with_columns(pl.lit(n).alias("Sheet"))
for n, d in pl.read_excel('AccountNumber.xlsx', sheet_id=0).items()
])