apache-spark amazon-s3 palantir-foundry foundry-data-connection

How to integrate Palantir Foundry with Amazon S3 or HDFS


I am working on data integration in the Palantir Foundry platform and, as I am new to Palantir software, I need some help. Are there any documents, white papers, links, or tutorials on this topic?

How do I integrate data from another source, for example Amazon S3 or HDFS?


Solution

  • To integrate data from another platform, you'll need a source and a sync in Data Connection. You'll need platform permissions to create these; not all users have them, since creating sources can involve the organisation's data governance policies.

    Assuming you don't already have a source with a valid configuration for S3, you'll need to create one. In Data Connection, click "Sources" and then click "New Source". You can then do this in two ways:

    For a magritte-rest source:

    type: magritte-rest
    url: 'https://foobar.organization.s3.amazonaws.com'
    

    Next, create the sync with a configuration similar to this:

    type: rest-source-adapter
    method: GET
    path: the/path/in/s3/yourdata
    outputFileType: csv
    

    Other output file types are also supported (json, zip, ...)
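
To make the relationship between the two configurations concrete, here is a minimal Python sketch of the request they describe together. It assumes that the sync's `path` is resolved relative to the source's `url` and fetched with the configured `method`; that is an assumption about how the two pieces combine, not a description of Foundry internals. The helper name `build_sync_request` is hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urljoin
from urllib.request import Request


def build_sync_request(source_url: str, path: str, method: str = "GET") -> Request:
    """Combine a source URL and a sync path into one HTTP request object.

    Assumption: the sync path is appended to the source URL. This helper is
    hypothetical and only illustrates the two configs shown above.
    """
    # Normalise slashes so that urljoin appends the path instead of replacing it.
    base = source_url.rstrip("/") + "/"
    full_url = urljoin(base, path.lstrip("/"))
    # Building the Request does not perform any network call.
    return Request(full_url, method=method)


# Values copied from the example source and sync configurations.
request = build_sync_request(
    "https://foobar.organization.s3.amazonaws.com",
    "the/path/in/s3/yourdata",
)
print(request.get_method(), request.full_url)
```

Checking the combined URL this way, for example in a browser or with an HTTP client that has the right credentials, can help confirm that the path is correct before running the sync.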