python · palantir-foundry

Can I access directories in Palantir and use a FOR loop to get the names of all tables inside a folder?


I need to simplify the process of downloading datasets from Palantir. My idea was to use it like a directory on a local PC, but the problem is that when I create a Code Workspace to run my own code, it seems to use a virtual Python environment, so I can't access the directories outside of the environment that contain the datasets I want to use.

So the process should be from my perspective:

  1. Get into a directory with datasets
  2. Run some kind of FOR loop based on the logic I need and insert the names of the files into a list
  3. Download all tables from the list

Is there some way to do it?

I tried to access the directory with the datasets, but as I am in a virtual Python environment, I don't know how.

I need to run the script inside Palantir. Right now we download datasets one by one through the Palantir UI, but that consumes a lot of time.


Solution

  • If what you want to do (best guess; I +1 the comment below your post that it would be great if you could clarify what is what exactly - datasets, files, etc.) is: "I have a lot of files on my local laptop, I need to upload them to Foundry, process them, and this will generate another dataset with a lot of files - how can I download them in bulk?"

    Then my guess is:

    1. You create a dataset in Foundry; you can bulk-upload by dragging and dropping all your files from your local laptop into the dataset. A dataset is primarily a "set of files", which can be of any type. A dataset does not need a schema to be processed.
    2. You pick the app of your choice (Code Workspace for a Jupyter-like experience, Code Repo for pro-code, Pipeline Builder for no-code/low-code). My preference is Code Repo, but Code Workspace is likely a good option as well, given it generates small code snippets for you.
    3. You process the files one by one. Here is a typical example: https://www.palantir.com/docs/foundry/transforms-python/unstructured-files/ - below is a transform that simply copies the content of each file from the input dataset to the output dataset:
    import logging
    import time

    from transforms.api import transform, Input, Output

    logger = logging.getLogger(__name__)


    # @lightweight() # Optional - simply doesn't use Spark, as it is not needed here
    # @incremental() # Optional - only to process the new files on each run
    @transform(
        input_files=Input("/PATH/example_incremental_dataset"),
        output=Output("/PATH/example_incremental_lightweight_output"),
    )
    def compute_downstream(input_files, output):
        fs = input_files.filesystem()
        files = list(fs.ls())  # List all the files in the input dataset
        timestamp = int(time.time())

        logger.warning(f"These are the files that will be processed: {files}")

        # Copy each file's raw bytes to the output dataset,
        # suffixing the file name with the current timestamp
        for curr_input_file in files:
            with fs.open(curr_input_file.path, "rb") as f1:
                with output.filesystem().open(curr_input_file.path + f"_{timestamp}.txt", "wb") as f2:
                    f2.write(f1.read())
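
    If you only want to process a subset of the files (the "FOR loop based on the logic I need" from the question), ls() also accepts glob and regex filters. Below is an UNTESTED minimal sketch of the same copy transform; the dataset paths and the *.csv pattern are placeholders to adapt to your own logic:

    from transforms.api import transform, Input, Output


    @transform(
        input_files=Input("/PATH/example_dataset"),
        output=Output("/PATH/example_filtered_output"),
    )
    def compute_filtered(input_files, output):
        fs = input_files.filesystem()
        # ls() takes a glob (or a regex=... argument) to filter the listing
        for curr_input_file in fs.ls(glob="*.csv"):
            with fs.open(curr_input_file.path, "rb") as f1:
                with output.filesystem().open(curr_input_file.path, "wb") as f2:
                    f2.write(f1.read())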
    
    4. Now you want to download the output. This doesn't have a first-class solution, but you have a few alternatives depending on what exactly you want to download. For example, you could zip the files together so there is only one file to fetch:
    # UNTESTED
    # Note: if you want to read the files already written to your output, and then save the zip file to the same output,
    # you will need to add the @incremental() decorator, which acts a bit like an "advanced" mode
    # where you can read your own output - useful in this case
    
    import zipfile
    import os
    
    def compress_files(file_paths, output_zip):
        with zipfile.ZipFile(output_zip, 'w') as zipf:
            for file in file_paths:
                if os.path.isfile(file):  # Check if file exists
                    zipf.write(file, os.path.basename(file))
                else:
                    print(f"File {file} does not exist and will be skipped.")
                    
    # Example usage
    files_to_compress = ['file1.txt', 'file2.txt', 'file3.txt']
    output_zip_file = 'compressed_files.zip'
    
    compress_files(files_to_compress, output_zip_file)
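
    Also UNTESTED - if instead you want the zip to end up inside a Foundry dataset (so there is a single file to download from the UI), you can combine the two snippets above and write the archive straight into the output dataset's filesystem, since zipfile.ZipFile accepts any writable file object. The dataset paths and the archive name below are placeholders:

    import zipfile

    from transforms.api import transform, Input, Output


    @transform(
        input_files=Input("/PATH/example_incremental_lightweight_output"),
        output=Output("/PATH/example_zip_output"),
    )
    def compute_zip(input_files, output):
        fs = input_files.filesystem()
        # Write the archive directly into the output dataset - one file to download
        with output.filesystem().open("all_files.zip", "wb") as out_file:
            with zipfile.ZipFile(out_file, "w") as zipf:
                for file_status in fs.ls():
                    with fs.open(file_status.path, "rb") as f:
                        zipf.writestr(file_status.path, f.read())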
    

    Hope that helps

    EDIT: In case you have a dynamic set of files, see https://www.palantir.com/docs/foundry/transforms-python/unstructured-files/

    in particular:

    file_statuses = list(your_input.filesystem().ls())
    # Result: [FileStatus(path='students.csv', size=688, modified=...)]
    paths = [f.path for f in file_statuses]
    # Result: ['students.csv', ...]
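
    And if the files are tabular (e.g. CSVs), you can parse each one directly from that listing. UNTESTED minimal sketch, assuming CSV files and that pandas is available in your environment; your_input is the same input handle as above:

    import pandas as pd

    def read_all_csvs(your_input):
        # Build a {path: DataFrame} mapping for every CSV in the input dataset
        fs = your_input.filesystem()
        dataframes = {}
        for file_status in fs.ls(glob="*.csv"):
            with fs.open(file_status.path, "r") as f:
                dataframes[file_status.path] = pd.read_csv(f)
        return dataframes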