Click with Path type behaves differently once imported

Let's imagine I have a simple Python script do_stuff.py that lists subfolders in a given folder:

import click

@click.command(help="Do stuff.")
@click.argument('datasets', type=click.Path(exists=True, readable=True, writable=True), nargs=-1)
def main(datasets):
    for dataset in datasets:
        print(dataset)

if __name__ == "__main__":
    main()

In my case it returns the expected list of folders when I run python3 do_stuff.py ./s3_data/lsc2/landsat_ot_c2_l2/*:

./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220121
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220206
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220222
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220310

I then try to do the same thing from another script, master.py, which imports do_stuff.py:

from do_stuff import main as ds

ds('./s3_data/lsc2/landsat_ot_c2_l2/*')

When I run python3 master.py it returns:

Usage: master.py [OPTIONS] [DATASETS]...
Try 'master.py --help' for help.

Error: Invalid value for '[DATASETS]...': Path '/' is not writable.

If I change the last line of master.py to ds(['./s3_data/lsc2/landsat_ot_c2_l2/*']), then I get:

Usage: master.py [OPTIONS] [DATASETS]...
Try 'master.py --help' for help.

Error: Invalid value for '[DATASETS]...': Path './s3_data/lsc2/landsat_ot_c2_l2/*' does not exist.

Thanks in advance for any help you could provide.

Solution

The difference you are seeing is not caused by importing: Click's Path type behaves the same way in both cases. To see what is really going on, try the following experiment. In your do_stuff.py, print the arguments that the program actually receives when './s3_data/lsc2/landsat_ot_c2_l2/*' is passed on the command line. You can do this by adding a couple of lines before the call to main():

if __name__ == "__main__":
    import sys
    print(sys.argv)
    main()

Now run it:

$ python do_stuff.py ./s3_data/lsc2/landsat_ot_c2_l2/*
['do_stuff.py', './s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220121', './s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220206', './s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220222', './s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220310']
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220121
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220206
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220222
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220310

Note that sys.argv contains the four paths that matched, and the * is nowhere to be seen inside the program. This is because the shell expands the wildcard before the command is run, so Python (and therefore Click) only ever sees the already expanded paths, and the loop runs happily.
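To see that the expansion really is done by the shell and not by Python, here is a minimal sketch that starts a child Python process without any shell in between. The child script and the variable names are only illustrative; the point is that the wildcard arrives untouched:

```python
import subprocess
import sys

# Child program that only reports the arguments it received.
child = "import sys; print(sys.argv[1:])"
pattern = "./s3_data/lsc2/landsat_ot_c2_l2/*"

# The command is given as a list and no shell is involved,
# so nothing expands the wildcard before the child starts.
result = subprocess.run(
    [sys.executable, "-c", child, pattern],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
print(received)
```

The child prints a one-element list containing the literal pattern, which is the same situation Click is in when the pattern is passed to it from Python code.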

An easy way to reproduce, from the command line, the failure you saw when calling ds(['./s3_data/lsc2/landsat_ot_c2_l2/*']) (without having to disable wildcard expansion in your shell) is to pass a path with a wildcard that does not match any file, for example:

$ python do_stuff.py /tmp/a_dir_does_not_exist/*
Usage: do_stuff.py [OPTIONS] [DATASETS]...
Try 'do_stuff.py --help' for help.

Error: Invalid value for '[DATASETS]...': Path '/tmp/a_dir_does_not_exist/*' does not exist.

Since the shell cannot expand that wildcard, the argument is passed to the program unmodified. This is equivalent to calling ds(['/tmp/a_dir_does_not_exist/*']), just like the second failed example in the question.
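The other error in the question, Path '/' is not writable, is most likely explained by how the arguments are passed as well. A Click command called from Python expects a list of command-line arguments, and as far as I can tell it converts whatever it is given with list(), so a bare string ends up as one argument per character. A minimal sketch of that assumption in plain Python (the variable names are illustrative):

```python
# Assumption: Click turns the args it is given into a list with list(),
# so a bare string becomes one argument per character.
pattern = './s3_data/lsc2/landsat_ot_c2_l2/*'
as_args = list(pattern)

# The first few "arguments" Click would then try to validate as paths.
print(as_args[:4])  # ['.', '/', 's', '3']
```

Under that assumption, '.' exists and passes validation, and '/' is the first argument that fails the writable check, which matches the error message you got.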

Now, if you want to use the same wildcard syntax from within Python, you can use the glob.glob function to replicate the automatic wildcard expansion done by the shell, for example:

>>> from do_stuff import main as ds
>>> from glob import glob
>>> ds(glob('./s3_data/lsc2/landsat_ot_c2_l2/*'))
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220310
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220222
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220206
./s3_data/lsc2/landsat_ot_c2_l2/LC08_L2SP_195027_20220121
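Note that glob returns the matches in arbitrary order (reversed in the output above), while the shell sorts them. If the order matters, something along these lines may help. This is only a sketch: expand_paths is a hypothetical helper, not part of Click or glob, and it is demonstrated against a throwaway directory instead of your data:

```python
import os
import tempfile
from glob import glob


def expand_paths(pattern):
    """Hypothetical helper: expand a wildcard pattern into a sorted list of paths."""
    matches = sorted(glob(pattern))
    if not matches:
        # Unlike the shell, fail early instead of passing the raw pattern on.
        raise FileNotFoundError(f"no paths match {pattern!r}")
    return matches


# Demonstration with two temporary subfolders created out of order.
with tempfile.TemporaryDirectory() as root:
    for name in ("scene_b", "scene_a"):
        os.mkdir(os.path.join(root, name))
    found = [os.path.basename(p) for p in expand_paths(os.path.join(root, "*"))]

print(found)  # ['scene_a', 'scene_b']
```

In master.py this would be used as ds(expand_paths('./s3_data/lsc2/landsat_ot_c2_l2/*')). Also keep in mind that a Click command exits the interpreter when it finishes by default; if master.py needs to continue afterwards, passing standalone_mode=False to the call should prevent that.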