palantir-foundry, palantir-foundry-api

Palantir Foundry - How to load PDF files from Compass folder into code repository transform


In Palantir Foundry, my goal is:

  1. Find all PDFs in a Compass folder
  2. In a transform, copy each PDF (for example with shutil) from Compass to a dataset file system

I have retrieved a list of PDF files stored in a Compass folder from this endpoint (compass/api/folders/{compass_rid}/children), and also successfully set up a Compass File Lister. I'm stuck on where to go from either option, as I haven't figured out how to use any of the information to actually read a blobster file from a transform.
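
Narrowing that folder listing down to PDFs might look like the sketch below. The response shape is an assumption for illustration only: a `values` list whose entries carry `name` and `rid` keys is hypothetical, as is the helper name `pdf_children`, so adjust both to whatever the endpoint actually returns.

```python
from typing import Any, Dict, List


def pdf_children(listing: Dict[str, Any]) -> Dict[str, str]:
    """Map PDF file name -> resource identifier for one folder listing."""
    # "values", "name" and "rid" are assumed keys, not confirmed API fields.
    children: List[Dict[str, Any]] = listing.get("values", [])
    return {
        child["name"]: child["rid"]
        for child in children
        if child.get("name", "").lower().endswith(".pdf")
    }
```

The result is a plain name-to-RID mapping, which is all that is needed to decide which files to copy.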

Is it possible to read these PDFs in a transform to be able to copy them to an unstructured dataset file system?

Based on other SO questions, I read through "read files in a repository", but that approach seems to rely on the files actually being imported into the repository, so I'm not sure it would help me.

I also read through the Compass endpoints but I don't see a way to move/copy files from Compass to a dataset filesystem, only potentially from one Compass folder to another.


Solution

  • One of my colleagues was able to get this working, posting here in case anyone else can use it:

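    from transforms.api import transform, Output
    import requests


    @transform(
        pdf=Output("OUT_PATH"),
    )
    def compute(pdf):
        # Authenticate by passing a Foundry token as a cookie.
        cookies = {
            'PALANTIR_TOKEN': 'AUTH_TOKEN',
        }

        # Download the PDF from the blobster endpoint by its resource identifier.
        response = requests.get(
            'https://your-foundry-instance/blobster/api/salt/ri.blobster.main.pdf.0466bc73-fe31-40a2-89d0-536ecff719a3',
            cookies=cookies,
        )
        response.raise_for_status()  # fail loudly instead of writing an error page

        # Write the raw bytes into the output dataset's file system.
        # The with block closes the file, so no explicit close() is needed.
        with pdf.filesystem().open('matching.pdf', 'wb') as f:
            f.write(response.content)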
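
The snippet above copies a single PDF. To cover every PDF in the folder, the same download-and-write step can be looped over a mapping of file name to blobster RID. This is only a sketch under stated assumptions: `blobster_url` and `copy_pdfs` are hypothetical helper names, the URL path is taken from the example above, and the download and file-open steps are passed in as callables so the loop can be tried outside Foundry.

```python
from typing import Callable, Dict, IO

# Path copied from the URL used in the answer above.
BLOBSTER_PATH = "/blobster/api/salt/"


def blobster_url(base_url: str, rid: str) -> str:
    """Build the download URL for one blobster resource identifier."""
    return base_url.rstrip("/") + BLOBSTER_PATH + rid


def copy_pdfs(
    pdfs: Dict[str, str],
    base_url: str,
    fetch: Callable[[str], bytes],
    open_file: Callable[[str, str], IO[bytes]],
) -> int:
    """Download each PDF and write it under its file name; return the count."""
    for name, rid in pdfs.items():
        content = fetch(blobster_url(base_url, rid))
        # Binary mode, matching the single-file example.
        with open_file(name, "wb") as f:
            f.write(content)
    return len(pdfs)
```

Inside the transform, `fetch` could be `lambda url: requests.get(url, cookies=cookies).content` and `open_file` could be `pdf.filesystem().open`, both reusing the names from the snippet above.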