palantir-foundry foundry-code-repositories foundry-data-connection

How to apply the schema of one dataframe to another empty dataframe using a REST call


I have two datasets in Foundry, df1 and df2. df1 contains data and has a schema.

df2 is an empty dataframe with no schema applied.

Using the data proxy, I was able to extract the schema from df1:

{
  "foundrySchema": {
    "fieldSchemaList": [
      {...}
    ],
    "primaryKey": null,
    "dataFrameReaderClass": "n/a",
    "customMetadata": {}
  },
  "rows": []
}

How can I apply this schema to the empty dataframe df2 via a REST call?

The Foundry example below shows how to commit an empty transaction, but it does not show how to apply a schema:

curl -X POST \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{}' \
  "${CATALOG_URL}/api/catalog/datasets/${DATASET_RID}/transactions/${TRANSACTION_RID}/commit"

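For reference, the same commit call can be sketched in Python. This only builds the request and mirrors the curl command above; `build_commit_request` is a hypothetical helper name, and the host, RIDs, and token in the usage lines are placeholders, not real values.

```python
import json
import urllib.request


def build_commit_request(catalog_url: str, dataset_rid: str,
                         transaction_rid: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that commits a transaction."""
    url = (f"{catalog_url}/api/catalog/datasets/{dataset_rid}"
           f"/transactions/{transaction_rid}/commit")
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # same empty JSON body as `-d '{}'`
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
req = build_commit_request(
    "https://foundry.example.com",
    "ri.foundry.main.dataset.example",
    "ri.foundry.main.transaction.example",
    "dummy-token",
)
# To actually send it: urllib.request.urlopen(req)
```
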
Solution

  • Here is a Python function to upload a schema for a dataset with a committed transaction:

    from urllib.parse import quote_plus
    import requests
    
    
    def upload_dataset_schema(dataset_rid: str, transaction_rid: str,
                              schema: dict, token: str, branch: str = 'master') -> None:
        """
        Uploads the foundry dataset schema for a dataset, transaction, branch combination
        Args:
            dataset_rid: The rid of the dataset
            transaction_rid: The rid of the transaction
            schema: The foundry schema, sent as the JSON request body
            token: The bearer token used for authentication
            branch: The branch

        Returns: None

        """
        # Replace "foundry-instance" with the hostname of your Foundry installation.
        base_url = "https://foundry-instance/foundry-metadata/api"
        response = requests.post(f"{base_url}/schemas/datasets/"
                                 f"{dataset_rid}/branches/{quote_plus(branch)}",
                                 params={'endTransactionRid': transaction_rid},
                                 json=schema,
                                 headers={
                                     'content-type': "application/json",
                                     'authorization': f"Bearer {token}",
                                 }
                                 )
        response.raise_for_status()
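
In the question, the data proxy response wraps the schema in a `foundrySchema` key next to `rows`. Below is a minimal sketch of pulling that object out before uploading it. It assumes the metadata endpoint expects only the inner `foundrySchema` object as its body, which the answer does not confirm. `extract_foundry_schema` is a hypothetical helper name, and the field list is left empty here because the question elides it.

```python
import copy


def extract_foundry_schema(data_proxy_response: dict) -> dict:
    """Return a deep copy of the 'foundrySchema' object from a data proxy response."""
    try:
        schema = data_proxy_response["foundrySchema"]
    except KeyError as err:
        raise ValueError("response has no 'foundrySchema' key") from err
    return copy.deepcopy(schema)


# Response in the shape shown in the question; field entries are elided there,
# so the list is left empty for illustration.
response = {
    "foundrySchema": {
        "fieldSchemaList": [],
        "primaryKey": None,
        "dataFrameReaderClass": "n/a",
        "customMetadata": {},
    },
    "rows": [],
}

schema = extract_foundry_schema(response)
# Then pass it to the function above (assumption: inner object is the body):
# upload_dataset_schema(df2_rid, transaction_rid, schema, token)
```
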