I would like to list all files recursively in a directory. I currently have a directory structure like this:
src/main.c
src/dir/file1.c
src/another-dir/file2.c
src/another-dir/nested/files/file3.c
I've tried to do the following:
import os
from glob import glob

glob(os.path.join('src', '*.c'))
But this only gets me files directly in the src folder, e.g. I get main.c, but I will not get file1.c, file2.c, etc. I could add a glob for each level of nesting:
import os
from glob import glob

glob(os.path.join('src', '*.c'))
glob(os.path.join('src', '*', '*.c'))
glob(os.path.join('src', '*', '*', '*.c'))
glob(os.path.join('src', '*', '*', '*', '*.c'))
But this is obviously limited and clunky. How can I do this properly?
There are a few ways:
pathlib.Path().rglob()
Use pathlib.Path().rglob() from the pathlib module, which was introduced in Python 3.5.
from pathlib import Path

for path in Path('src').rglob('*.c'):
    print(path.name)
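If you want the full paths rather than just the file names, a minimal variation (assuming you want them as plain strings, e.g. to pass to other APIs) is:

from pathlib import Path

# rglob() yields Path objects that include the 'src' prefix;
# str() converts them to ordinary path strings
paths = [str(path) for path in Path('src').rglob('*.c')]
print(paths)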
glob.glob()
If you don't want to use pathlib, use glob.glob():
from glob import glob

for filename in glob('src/**/*.c', recursive=True):
    print(filename)
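If the tree is large and you only need to iterate over the matches once, glob.iglob() returns a lazy iterator instead of building the whole list up front; a sketch of the same pattern:

from glob import iglob

# iglob() yields matches one at a time instead of materialising a list
for filename in iglob('src/**/*.c', recursive=True):
    print(filename)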
For cases where you need to match files beginning with a dot (.), like files in the current directory or hidden files on a Unix-based system, use the os.walk() solution below.
os.walk()
For older Python versions, use os.walk() to recursively walk a directory and fnmatch.filter() to match against a simple expression:
import fnmatch
import os

matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))
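If you end up reusing this, it can be handy to wrap the walk in a small generator; a sketch, where find_files() is a hypothetical helper name of my own choosing:

import fnmatch
import os

def find_files(root, pattern):
    # Walk the tree under 'root' and yield full paths whose file name matches 'pattern'
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, filename)

for path in find_files('src', '*.c'):
    print(path)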
This version should also be faster depending on how many files you have, as the pathlib module has a bit of overhead over os.walk().
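If performance matters for your tree, you can measure both approaches directly instead of guessing; a rough sketch using timeit (the repetition count of 100 is arbitrary):

import fnmatch
import os
import timeit
from pathlib import Path

def with_pathlib():
    return list(Path('src').rglob('*.c'))

def with_walk():
    return [os.path.join(root, f)
            for root, _dirs, files in os.walk('src')
            for f in fnmatch.filter(files, '*.c')]

# Results depend heavily on the size and depth of the actual tree
print('pathlib:', timeit.timeit(with_pathlib, number=100))
print('os.walk:', timeit.timeit(with_walk, number=100))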