pentahopdispoon

Filter rows based on a field and create csvs on the filetred result set


I have a table tb_rawcsvdata: with columns plant, employeenumber, term_dt.

I need to create a csv file per unique plant using pentaho.

What i did was create a transformation which fetches unique plants and puts in a resultset. Transformation1

Then i create transformation2 like below: enter image description here

In Transformation i try to pass the plant as parameter to fetch one result set for each of the unique plants identified and create a csv file per plant(resultset).

enter image description here

while i was able to create one file per plant, the data in it was only plant column with one data as below: enter image description here

what i want is to fetch all the columns from the sql query and output per file.


Solution

  • Two things:

    1. Use a table input step instead of your Sql script step.

    Set it with the same connection and put the same query inside it. You don’t even need the plant field to be a parameter, can use a variable instead.

    1. Even simpler, you don’t need all this: just one transformation.

    Run your query in a table input step without the plant field.

    Use a javascript step, calculator, or somethig similar to create a filename field based on the value of the field plant.

    Send it all to a single text file output and set it up to accept the filename as a field from a previous step.

    Rows with different filename values will be written to different files.

    Caveat: filename must be an absolute path. If not, it’ll be relative to PDI’s installation folder.