hive teradata sqoop

How to perform an incremental load using Sqoop


I have data in a Teradata table, and I have imported it into Hive using the sqoop-import command.

However, the Teradata table receives new data daily, so I need to import only the newly added (incremental) data from Teradata into the Hive table.

How can I achieve this?


Solution

  • If your table has a column that behaves like a row ID or a timestamp, you can use:

    --incremental [mode] --last-value [value] --check-column [col]

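    If you run the import as a saved Sqoop job, you can omit --last-value on later runs: the job stores the last imported value and updates it after each run.

    --incremental [mode] accepts two modes, append and lastmodified. Use append when new rows are only ever added and the check column (for example a row ID) keeps increasing; use lastmodified when existing rows can be updated and the check column is a timestamp of the last change.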
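
The options above could be combined roughly as follows. This is a minimal sketch, not a tested setup: the JDBC URL, user, table, column, job name, and starting value are all hypothetical placeholders, and it assumes the Teradata JDBC driver or connector is already available to Sqoop. It uses append mode with a Hive import; whether that combination fits your Sqoop version and table layout is worth checking before relying on it. The script is a dry run by default and only prints the commands.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set SQOOP=sqoop in the environment to run them for real.
SQOOP="${SQOOP:-echo sqoop}"

# Hypothetical placeholders; replace with your own values.
JDBC_URL="jdbc:teradata://td-host/DATABASE=sales"
DB_USER="etl_user"
TABLE="orders"
HIVE_TABLE="orders"
CHECK_COLUMN="order_id"   # row-id style column that only grows
LAST_VALUE="1000"         # highest value already imported
JOB_NAME="orders_incr"

# One-off incremental import: fetch rows where CHECK_COLUMN > LAST_VALUE.
import_once() {
  $SQOOP import \
    --connect "$JDBC_URL" \
    --username "$DB_USER" -P \
    --table "$TABLE" \
    --hive-import --hive-table "$HIVE_TABLE" \
    --incremental append \
    --check-column "$CHECK_COLUMN" \
    --last-value "$LAST_VALUE"
}

# Saved job: --last-value is given once as the starting point;
# the job then keeps track of it between runs.
create_job() {
  $SQOOP job --create "$JOB_NAME" -- import \
    --connect "$JDBC_URL" \
    --username "$DB_USER" -P \
    --table "$TABLE" \
    --hive-import --hive-table "$HIVE_TABLE" \
    --incremental append \
    --check-column "$CHECK_COLUMN" \
    --last-value "$LAST_VALUE"
}

# Run the saved job, for example once a day from a scheduler.
run_job() {
  $SQOOP job --exec "$JOB_NAME"
}

import_once
create_job
run_job
```

With the saved job in place, the daily load reduces to the sqoop job --exec call; the stored value can be inspected with sqoop job --show followed by the job name.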