talend

How to apply a row limit based on percentage filtration in Talend?


I am using Talend's tDBInput component to read data from a SQL table, followed by a tMap component to filter the rows. The number of rows that pass through the tMap varies depending on the data in the SQL table.

Scenario:

Example:

What I Have Tried:

I learned about the global variable NB_LINE, which can be used to get the number of rows written to the component's file or table. However, I am unsure how to implement this to achieve the desired flow for my scenario.

Questions:

  1. How can I use the NB_LINE variable to determine the percentage of filtered rows?
  2. What components or logic should I use to implement the conditional flow based on the percentage of rows?

Any guidance on how to set up this flow in Talend would be greatly appreciated!


Solution

  • global variable is the way to go I think, but this variable is only accessible on "AFTER" mode, meaning that you have to exit subjob to get the number of line processed.

    useful components in this case are the pair tHashInput/tHashOutput (hidden by default but you can activate them through the "palette" menu in project settings). With them you are able to use a cache for your data, and separate in multiple subjobs.

    enter image description here