I have one table in a source system that I want to sync into Palantir Foundry, with the constraints below.
Which transaction type should I use, do I need to turn Incremental on, and what query should I write to achieve this?
I have tried the Append transaction type with Incremental on, but with this configuration I'm not able to update the old data already in Foundry using the MODIFIED_DATE column.
I would also like a better solution than Snapshot, which is not a good fit here.
Looking at the JDBC documentation (assuming you connect via JDBC), you will need to set the incremental column to MODIFIED_DATE, as "if your rows are mutable (...), you'll need a column that increases with every mutation of the data (e.g. an update_time column)."
The "good" solution depends on what you want to do with the data downstream - namely, whether you need the full history of changes or only "the latest" version of each row.
As the documentation suggests: "When you ingest an updated version of an existing row, the Foundry dataset will still include previous versions of the row (remember, we're using the APPEND transaction type). If you want only the latest version of each row, you will need to use another tool in Foundry, such as Transforms, to clean the data."
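Keeping only the latest version of each row can be sketched as follows. This is a plain-Python illustration of the logic (the pk and modified_date field names and the sample data are assumptions); in Foundry you would implement the equivalent in a PySpark Transform, e.g. with a window over the primary key ordered by MODIFIED_DATE descending.

```python
from datetime import date

# Appended dataset: every update arrives as a new row, so the same
# primary key can appear several times with different MODIFIED_DATE values.
rows = [
    {"pk": 1, "value": "A", "modified_date": date(2024, 1, 1)},
    {"pk": 1, "value": "A-updated", "modified_date": date(2024, 2, 1)},
    {"pk": 2, "value": "B", "modified_date": date(2024, 1, 15)},
]

def latest_per_key(rows):
    # Keep, for each primary key, the row with the greatest MODIFIED_DATE.
    latest = {}
    for row in rows:
        current = latest.get(row["pk"])
        if current is None or row["modified_date"] > current["modified_date"]:
            latest[row["pk"]] = row
    return sorted(latest.values(), key=lambda r: r["pk"])

print(latest_per_key(rows))
# Only the 2024-02-01 version of pk 1 survives, alongside pk 2.
```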
You can very well ingest with Append: if a row has been updated, it will simply be re-added as a new row. You can then deduplicate downstream, for instance with a View (from a folder: Actions > New > View), where you can configure which column to deduplicate on - in your case MODIFIED_DATE.
It is worth noting that object syncing supports incremental mode, which means that when new row versions arrive with a "conflicting" primary key, only the latest version is available as an object.
e.g. in Transaction 1
PK 1 | Object A | Value 1
PK 2 | Object B | Value 2
and then you later receive in Transaction 2:
PK 1 | Object A | Value 10
will result in your Ontology:
PK 1 | Object A | Value 10
PK 2 | Object B | Value 2
Note: you need to deduplicate primary keys within each new transaction. If the same transaction contains both
PK 1 | Object A | Value 30
and
PK 1 | Object A | Value 20
then Foundry has no way to know which row version to keep.
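A quick way to catch this situation before syncing is to count primary keys per batch. This is a plain-Python sketch (the pk field name and sample data are assumptions); in a PySpark Transform you would do the equivalent with a groupBy/count.

```python
from collections import Counter

def conflicting_keys(batch):
    # Primary keys that appear more than once within a single transaction:
    # Foundry cannot decide which of their row versions to keep.
    counts = Counter(row["pk"] for row in batch)
    return sorted(pk for pk, n in counts.items() if n > 1)

batch = [
    {"pk": 1, "object": "A", "value": 30},
    {"pk": 1, "object": "A", "value": 20},
    {"pk": 2, "object": "B", "value": 2},
]

print(conflicting_keys(batch))  # [1]
```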
EDIT - Deletion of rows
You can add a deleted boolean column and handle it downstream (what that means depends on the exact computation - e.g. if there is an aggregation, you need to subtract the deleted rows or recompute it, etc.). Populating the deleted column in the first place depends on the source system.
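In the simplest case, downstream handling of such a flag is just a filter applied before any further computation. A plain-Python sketch (sample data is an assumption; in Foundry this would be a filter in a PySpark Transform):

```python
def active_rows(rows):
    # Drop rows the source system has flagged as deleted; rows without
    # the flag are treated as still active.
    return [row for row in rows if not row.get("deleted", False)]

rows = [
    {"pk": 1, "value": 10, "deleted": False},
    {"pk": 2, "value": 20, "deleted": True},
    {"pk": 3, "value": 30},
]

print(active_rows(rows))  # rows with pk 1 and pk 3 remain
```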