The zipfile module is very useful for managing .zip files with Python.
However, if the .zip file was created on a Linux or macOS system, the path separator is of course '/', and working with that file on Windows can be a problem because the separator there is '\'. For example, to determine the root directory stored in the .zip file, we might think of something like:
from zipfile import ZipFile, is_zipfile
import os
if is_zipfile(filename):
    with ZipFile(filename, 'r') as zip_ref:
        packages_name = [member.split(os.sep)[0] for member in zip_ref.namelist()
                         if (len(member.split(os.sep)) == 2 and not
                             member.split(os.sep)[-1])]
But in this case we always get packages_name = [], because os.sep is '\' on Windows whereas, since the compression was done on a Linux system, the paths look like 'foo1/foo2'.
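The empty result can be reproduced on any OS by passing the separator explicitly instead of reading os.sep. This is only a minimal sketch: the helper top_dirs and the variable win_sep are hypothetical names introduced here, and the member names are the example paths mentioned above.

```python
# Example member names, written the way they appear in the archive.
members = ['foo1/', 'foo1/foo2']

# Stand-in for os.sep on Windows (assumption, so the sketch runs anywhere).
win_sep = '\\'


def top_dirs(names, sep):
    """Apply the same filter as the snippet above, with an explicit separator."""
    return [m.split(sep)[0] for m in names
            if len(m.split(sep)) == 2 and not m.split(sep)[-1]]


# Splitting on '\' never cuts 'foo1/' in two, so nothing passes the filter.
with_backslash = top_dirs(members, win_sep)
# Splitting on '/' gives ['foo1', ''] for the directory entry, which passes.
with_slash = top_dirs(members, '/')

print(with_backslash)
print(with_slash)
```

Only the directory entry 'foo1/' can satisfy the filter, and only when the split is done on '/'.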
In order to handle all cases (compression on a Linux system and use on a Windows system, or the opposite), I want to use:
from zipfile import ZipFile, is_zipfile
import os
if is_zipfile(filename):
    with ZipFile(filename, 'r') as zip_ref:
        if all('/' in el for el in zip_ref.namelist()):
            packages_name = [member.split('/')[0] for member in zip_ref.namelist()
                             if (len(member.split('/')) == 2 and not
                                 member.split('/')[-1])]
        else:
            packages_name = [member.split('\\')[0] for member in zip_ref.namelist()
                             if (len(member.split('\\')) == 2 and not
                                 member.split('\\')[-1])]
What do you think of this? Is there a more direct or more pythonic way to do the job?
Thanks to @snakecharmerb's answer and the link he proposed, I now understand. Thank you @snakecharmerb for showing me the way... As described in that link, zipfile internally uses only '/', independently of the OS. As I like to see things concretely, I ran this little test:
On Windows, using the OS's usual graphical tools (not the command line), I created a file testZipWindows.zip containing this tree structure:
testZipWindows/
    foo1.txt
    InFolder/
        foo2.txt
I did the same on Linux (again without using the command line) for the testZipFedora.zip archive:
testZipFedora/
    foo1.txt
    InFolder/
        foo2.txt
This is the result:
$ python3
Python 3.7.9 (default, Aug 19 2020, 17:05:11)
[GCC 9.3.1 20200408 (Red Hat 9.3.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from zipfile import ZipFile
>>> with ZipFile('/home/servoz/Desktop/test/testZipWindows.zip', 'r') as WinZip:
...     WinZip.namelist()
...
['testZipWindows/', 'testZipWindows/foo1.txt', 'testZipWindows/InFolder/', 'testZipWindows/InFolder/foo2.txt']
>>> with ZipFile('/home/servoz/Desktop/test/testZipFedora.zip', 'r') as fedZip:
...     fedZip.namelist()
...
['testZipFedora/', 'testZipFedora/foo1.txt', 'testZipFedora/InFolder/', 'testZipFedora/InFolder/foo2.txt']
So it all becomes clear! We should indeed use os.path.sep for file-system paths in multiplatform code, but when dealing with the zipfile library it is absolutely necessary to use '/' as the separator for archive member names, and not os.sep (or os.path.sep). That was my mistake!
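One way to keep the two kinds of paths apart is to build archive member names with posixpath, which always joins with '/', and to keep os.path for paths on disk. The following is only a sketch under that assumption: it uses an in-memory archive and made-up names (testZip, InFolder, foo2.txt) so it can run without any file on disk.

```python
import io
import posixpath
from zipfile import ZipFile

# Build a small archive in memory; the names are purely illustrative.
buffer = io.BytesIO()
with ZipFile(buffer, 'w') as zf:
    zf.writestr('testZip/InFolder/foo2.txt', 'hello')

# posixpath.join always uses '/', whatever the OS running the script.
member = posixpath.join('testZip', 'InFolder', 'foo2.txt')

with ZipFile(buffer, 'r') as zf:
    found = member in zf.namelist()
    content = zf.read(member).decode()

print(member, found, content)
```

Using os.path.join for the member name would give the same string on Linux, but a backslash-separated one on Windows, which would not match any entry returned by namelist().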
So the multiplatform code for the example in my first post is just:
from zipfile import ZipFile, is_zipfile
import os
if is_zipfile(filename):
    with ZipFile(filename, 'r') as zip_ref:
        packages_name = [member.split('/')[0] for member in zip_ref.namelist()
                         if (len(member.split('/')) == 2 and not
                             member.split('/')[-1])]
And not all the useless things I had imagined...
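As for the "more pythonic" part of the question, one possible variant (a sketch, not the only way) avoids splitting strings by hand: ZipInfo.is_dir() tells whether an entry is a directory, and pathlib.PurePosixPath parses the '/'-separated name on any OS. The helper name top_level_dirs is hypothetical, and the archive is built in memory from the member names listed in the test above.

```python
import io
from pathlib import PurePosixPath
from zipfile import ZipFile


def top_level_dirs(zip_ref):
    """Return the names of the directory entries located at the archive root."""
    roots = []
    for info in zip_ref.infolist():
        # Member names always use '/', so PurePosixPath is the right parser.
        parts = PurePosixPath(info.filename).parts
        if info.is_dir() and len(parts) == 1:
            roots.append(parts[0])
    return roots


# In-memory archive mirroring the testZipFedora.zip listing.
buffer = io.BytesIO()
with ZipFile(buffer, 'w') as zf:
    for name in ('testZipFedora/', 'testZipFedora/foo1.txt',
                 'testZipFedora/InFolder/', 'testZipFedora/InFolder/foo2.txt'):
        zf.writestr(name, '')

with ZipFile(buffer, 'r') as zf:
    packages_name = top_level_dirs(zf)

print(packages_name)
```

Like the list comprehension above, this relies on the archive containing explicit directory entries such as 'testZipFedora/'; an archive that stores only file entries would give an empty list with both versions.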