I am trying to simplify my code in Palantir. My idea is to use a for loop with f-strings instead of writing the same code X times with different output/source_df numbers.
This is my code:
@transform(
    output1=Output("output1"),
    output2=Output("output2"),
    source_df1=Input("source1"),
    source_df2=Input("source2")
)
def compute(output1, output2, source_df1, source_df2):
    for i in range(1, 1, 3):
        f"output{i}".write_dataframe(f"source_df{i}".dataframe().filter(
            f"source_df{i}".dataframe().SNAPSHOT_DATE >= "2024-09-18").coalesce(1),
            output_format="csv", options={
                'compression': 'gzip', 'header': "True"
            })
This leads to:
ValueError: Parameter 'inputs' of transform Transform(myproject.datasets.list_of_tables:compute)<> is not a transforms.api.Param
What you are looking for are transforms generators.
https://community.palantir.com/t/transforms-generator-providing-additional-parameter-for-every-transformation/581/2
Your code will then look like:
# inputs_and_outputs is a dict mapping parameter names to Input(...)/Output(...) objects
@transform(
    **inputs_and_outputs
)
def my_transform(ctx, **all_datasets):
    # Do what you want on those datasets, iterate over them, etc.
    pass
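As a rough sketch of how the dictionary could be built and then consumed: an f-string such as f"output{i}" only produces a str, not the variable of that name, so the datasets have to be looked up by key in a dict. Note also that range(1, 1, 3) is empty, while range(1, n + 1) yields 1 to n. The helper names build_params and pair_datasets are hypothetical, and the constructors are passed in as arguments only so that the example runs outside Foundry; tuples stand in for the real Input/Output objects.

```python
def build_params(n, make_input, make_output):
    """Build the keyword arguments intended for @transform(**params).

    make_input / make_output are parameters so this sketch runs without
    Foundry; in a repository they would presumably be transforms.api.Input
    and transforms.api.Output.
    """
    params = {}
    for i in range(1, n + 1):  # yields 1..n
        params[f"source_df{i}"] = make_input(f"source{i}")
        params[f"output{i}"] = make_output(f"output{i}")
    return params


def pair_datasets(all_datasets, n):
    """Return [(source, output), ...] by looking each name up in the dict."""
    return [
        (all_datasets[f"source_df{i}"], all_datasets[f"output{i}"])
        for i in range(1, n + 1)
    ]


# Stand-in demo: plain tuples replace the real Input/Output objects.
params = build_params(2, lambda path: ("in", path), lambda path: ("out", path))
pairs = pair_datasets(params, 2)
```

In Foundry this would presumably become inputs_and_outputs = build_params(2, Input, Output), and inside my_transform each (source, output) pair from pair_datasets(all_datasets, 2) would go through the same output.write_dataframe(source.dataframe()...) call as in the question.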