I have two datasets in Foundry, df1 and df2. df1 contains data and has a schema.
df2 is an empty dataframe with no schema applied.
Using the data proxy, I was able to extract the schema from df1:
{
    "foundrySchema": {
        "fieldSchemaList": [
            {...
            }
        ],
        "primaryKey": null,
        "dataFrameReaderClass": "n/a",
        "customMetadata": {}
    },
    "rows": []
}
How can I apply this schema to the empty dataframe df2 via a REST call?
The Foundry example below shows how to commit an empty transaction, but it does not show how to apply the schema:
curl -X POST \
    -H "Authorization: Bearer ${TOKEN}" \
    -H "Content-Type: application/json" \
    -d '{}' \
    "${CATALOG_URL}/api/catalog/datasets/${DATASET_RID}/transactions/${TRANSACTION_RID}/commit"
Here is a Python function to upload a schema for a dataset with a committed transaction:
from urllib.parse import quote_plus

import requests


def upload_dataset_schema(dataset_rid: str, transaction_rid: str,
                          schema: dict, token: str, branch: str = 'master'):
    """
    Uploads the foundry dataset schema for a dataset, transaction, branch combination

    Args:
        dataset_rid: The rid of the dataset
        transaction_rid: The rid of the transaction
        schema: The foundry schema
        token: The bearer token used for authorization
        branch: The branch

    Returns: None
    """
    # Replace "foundry-instance" with the host name of your Foundry stack.
    base_url = "https://foundry-instance/foundry-metadata/api"
    response = requests.post(
        f"{base_url}/schemas/datasets/{dataset_rid}/branches/{quote_plus(branch)}",
        params={'endTransactionRid': transaction_rid},
        json=schema,
        headers={
            'content-type': "application/json",
            'authorization': f"Bearer {token}",
        },
    )
    # Raise an exception if the server returned an error status.
    response.raise_for_status()
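
To tie this back to the question: the schema argument is presumably the object stored under the "foundrySchema" key of the data proxy response, not the whole response including "rows". That is an assumption, so check it against your instance. Below is a minimal sketch of pulling that object out; extract_foundry_schema is a hypothetical helper name, not part of any Foundry API, and the example response uses an empty field list because the question elides the real one.

```python
def extract_foundry_schema(proxy_response: dict) -> dict:
    """Return the object stored under "foundrySchema" in a data proxy response.

    Hypothetical helper, not part of any Foundry API. It assumes the response
    has the shape shown in the question.
    """
    schema = proxy_response.get("foundrySchema")
    if not isinstance(schema, dict) or "fieldSchemaList" not in schema:
        raise ValueError("response has no foundrySchema with a fieldSchemaList")
    return schema


# Example response in the shape from the question. The field list is left
# empty here only because the question elides its contents.
example_response = {
    "foundrySchema": {
        "fieldSchemaList": [],
        "primaryKey": None,
        "dataFrameReaderClass": "n/a",
        "customMetadata": {},
    },
    "rows": [],
}

# Keep only the schema part; "rows" is dropped.
schema = extract_foundry_schema(example_response)
```

The result could then be passed as the schema argument of upload_dataset_schema, together with the rid of df2, the rid of its committed transaction, and your token.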