I have multiple zip files, each containing several different txt files, like below:
zip1
- file1.txt
- file2.txt
- file3.txt
How can I use pandas to read in each of those files without extracting them?
I know if they were 1 file per zip I could use the compression method with read_csv like below:
df = pd.read_csv('textfile.zip', compression='zip')
Any help on how to do this would be great.
You can pass ZipFile.open() to pandas.read_csv() to construct a pandas.DataFrame from a csv-file packed into a multi-file zip.
pd.read_csv(zip_file.open('file3.txt'))
To read all .csv files into a dict:
from zipfile import ZipFile
import pandas as pd
zip_file = ZipFile('textfile.zip')
dfs = {text_file.filename: pd.read_csv(zip_file.open(text_file.filename))
       for text_file in zip_file.infolist()
       if text_file.filename.endswith('.csv')}
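For the .txt files in the question, the same idea could be wrapped in a small helper that also closes the archive when it is done. This is only a sketch: the function name read_zip_members and its suffix parameter are made up for illustration, and it assumes every matching member is delimited text that read_csv can parse.

```python
import zipfile

import pandas as pd


def read_zip_members(zip_path, suffix=".txt", **read_csv_kwargs):
    """Return {member name: DataFrame} for each archive member ending in suffix.

    Hypothetical helper, not part of pandas or zipfile.
    """
    frames = {}
    # The context manager closes the archive once all members are parsed.
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith(suffix):
                # archive.open() returns a binary file-like object that
                # read_csv can consume directly; nothing is extracted to disk.
                with archive.open(name) as member:
                    frames[name] = pd.read_csv(member, **read_csv_kwargs)
    return frames
```

Any extra keyword arguments are forwarded to read_csv, so for tab-separated files you would call something like read_zip_members('textfile.zip', sep='\t'). To cover several zip files, call it once per archive and keep the results keyed by archive name.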