etl pentaho kettle

How to find unique values from a set of rows using Pentaho Kettle?


I have one denormalized table. I want to select all the values from one specific column of that table and load only the unique values from that column into a separate table.

How can I do this using Pentaho Spoon? Please note that I am a complete newbie to Spoon; the only thing I have tried so far is a hello world transformation.

I have a table named 'Employees' which has lots of columns, including the following (unrelated columns are omitted here):

+----------------------------------------------------------+
|                        Employees                         |
+----------------------------------------------------------+
| employee_number | employee_name | deputed_branch | phone |
+----------------------------------------------------------+

Now I want to move only the unique branch names into a new table named 'branches' using Spoon.

The 'branches' table will look like the following:

+-------------------------+
|        branches         |
+-------------------------+
| branch_id | branch_name |
+-------------------------+

where branch_id will be unique and auto-incremented.

To connect the Employees and branches tables, I will use an Employee_branch table, which will consist of employee_number and branch_id columns.

Can anyone please tell me how to do this?

Thanks in advance!


Solution

  • Can't you just do that in the SQL?

    select distinct deputed_branch from employees

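    If not, then use either the Unique rows step (note that it requires sorted input, for example from a Sort rows step placed before it) or the Group by step (which also requires sorted input).

    Alternatively, use the Memory Group by step if the number of rows is low (the data doesn't need to be sorted).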
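
The SQL route suggested above can be tried end to end before building anything in Spoon. The following is only a minimal sketch: it assumes an in-memory SQLite database as a stand-in for the real one (the question does not say which database is used), the sample rows are made up, and the column types are guesses. The table and column names come from the question.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Table and column names follow the question; the types are assumed.
cur.executescript("""
CREATE TABLE employees (
    employee_number INTEGER PRIMARY KEY,
    employee_name   TEXT,
    deputed_branch  TEXT,
    phone           TEXT
);
CREATE TABLE branches (
    branch_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    branch_name TEXT UNIQUE
);
CREATE TABLE employee_branch (
    employee_number INTEGER,
    branch_id       INTEGER
);
""")

# Hypothetical sample data, for illustration only.
cur.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "Pune", "111"),
        (2, "Ravi", "Delhi", "222"),
        (3, "Meena", "Pune", "333"),
    ],
)

# Step 1: load each distinct branch name once; branch_id is auto-generated.
cur.execute("""
INSERT INTO branches (branch_name)
SELECT DISTINCT deputed_branch FROM employees
""")

# Step 2: fill the link table by joining on the branch name.
cur.execute("""
INSERT INTO employee_branch (employee_number, branch_id)
SELECT e.employee_number, b.branch_id
FROM employees e
JOIN branches b ON b.branch_name = e.deputed_branch
""")
conn.commit()

branches = cur.execute(
    "SELECT branch_id, branch_name FROM branches ORDER BY branch_id"
).fetchall()
links = cur.execute(
    "SELECT employee_number, branch_id FROM employee_branch "
    "ORDER BY employee_number"
).fetchall()
print(branches)
print(links)
```

In Spoon, the first SELECT could serve as the query of a Table input step feeding a Table output step on branches. The auto-increment syntax differs between databases, so the CREATE TABLE statements would need adapting.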