
Unzip a list of files whose paths are read from a table in Pentaho Kettle


I am new to Pentaho Kettle and need to unzip a set of files whose paths are stored in a database table. How should I go about it?


Solution

  • This should be your main job:

    [Screenshot: main job]

    The first transformation connects to your database and extracts the paths. After that, another job (Unzip) is called, which extracts those files. To be more specific, this is the transformation, called "Table input":

    [Screenshot: "Table input" transformation]

    Use the "Table input" step to connect to your database. When you open it, create a new connection and then enter your query in the step's query area. Write a query that selects only the column you are interested in, not every column. The "Copy rows to result" step passes the values from the database on to the next job.
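As a rough illustration of "select only the column you need", here is a minimal sketch outside of Kettle. It uses an in-memory SQLite database; the table name `zip_files` and the column name `file_path` are assumptions made up for this example, not names from the original setup.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zip_files (id INTEGER, file_path TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO zip_files VALUES (?, ?, ?)",
    [(1, "/data/in/a.zip", "first"), (2, "/data/in/b.zip", "second")],
)

# Select only the column that holds the path, not every column.
QUERY = "SELECT file_path FROM zip_files ORDER BY id"
rows = conn.execute(QUERY).fetchall()

# Each row carries a single field: the path that is handed to the next job.
for (file_path,) in rows:
    print(file_path)
```

The same `SELECT` (with your real table and column names) is the kind of query that would go into the "Table input" step. The name of the selected field matters later, because the parameter is named after it.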

    The following is the "Unzip" job:

    [Screenshot: "Unzip" job]

    This job receives the values from the previous transformation and passes them to the "Unzip file" job entry.

    Things to know:

    1) In the main job, double-click the Unzip job entry, go to "Advanced", and enable "Copy previous result to parameters" and "Execute for every input row". In the job specification you also have to give the path to this sub-job.

    2) In the same Unzip job entry dialog, go to "Parameters" and add a parameter with the same name as the field you extract from the database:

    [Screenshot: parameter in the Unzip job entry]

    3) Open the sub-job (Unzip in my case), right-click on the canvas, then go to "Job settings" and then to "Parameters". Add the same parameter name as before:

    [Screenshot: parameter in the sub-job settings]

    4) Remember to set the destination folder for the files and the receiving parameter in the "Unzip file" job entry. Kettle references a parameter with its ${parameter_name} variable syntax:

    [Screenshot: "Unzip file" job entry settings]
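To make the data flow easier to follow, here is a minimal Python sketch of what the two jobs do together. It is not Kettle code. The function names `run_main_job` and `unzip_one` and the parameter name `file_path` are hypothetical, and the archives are throwaway files created just for the demo.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_one(zip_path, dest_dir):
    """Sub-job analogue: extract one archive into dest_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


def run_main_job(rows, param_name, dest_dir):
    """Main-job analogue: run the sub-job once for every result row."""
    extracted = []
    for row in rows:
        # The row's field becomes the value of the same-named parameter.
        params = {param_name: row[param_name]}
        extracted += unzip_one(params[param_name], dest_dir)
    return extracted


# Demo data: two small archives standing in for the paths stored in the table.
work = Path(tempfile.mkdtemp())
rows = []
for name in ("a", "b"):
    zip_path = work / f"{name}.zip"
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(f"{name}.txt", f"contents of {name}")
    rows.append({"file_path": str(zip_path)})

dest = work / "out"
dest.mkdir()
names = run_main_job(rows, "file_path", str(dest))
print(sorted(names))
```

The loop corresponds to "Execute for every input row", and the `params` dictionary corresponds to copying the previous result into a parameter. That is why the parameter name has to match the field name in both the job entry and the sub-job settings.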