xml xpath pentaho kettle spoon

Pentaho Data Integration (Spoon) Import XML with nested elements


I have the following XML data structure, and I'm trying to create a transformation in Pentaho that produces the output shown in the linked image. The data has elements nested inside other elements, and I can only seem to set the Loop XPath option to return either the main_component elements or the sub_component elements, not both.

<?xml version="1.0" encoding="UTF-8"?>
<components>
    <main_component>
        <name>Engine</name>
        <ref_no>336820-182</ref_no>
        <oem>Ford</oem>
    </main_component>
    <main_component>
        <name>Gearbox</name>
        <ref_no>378912-009</ref_no>
        <oem>GM</oem>
    </main_component>
    <main_component>
        <name>Fuel Tank</name>
        <ref_no>378927</ref_no>
        <oem>GM</oem>
        <sub_component>
            <name>Fuel Pump</name>
            <ref_no>27182A</ref_no>
            <oem>Lucus</oem>
        </sub_component>
        <sub_component>
            <name>Contents Unit</name>
            <ref_no>1219290</ref_no>
            <oem>Honeywell</oem>
        </sub_component>
    </main_component>
</components>

Required Transformation Output


Solution

  • Use the "XML Input Stream" step, followed by a small JavaScript step that carries the parent node's information down to its children. When a row is a parent node, the script stores its fields in variables; when a row is a child node, the script reads those variables back as that child's parent information.

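    // Parent fields to attach to the current row; they stay null for parent rows.
    var pName = null;
    var pRef = null;
    var pOem = null;

    if (xml_path_level2 != null) {
      // Child (sub_component) row: read back the values saved by the last parent row.
      pName = getVariable("VName", "");
      pRef = getVariable("VRef", "");
      pOem = getVariable("VOem", "");
    } else {
      // Parent (main_component) row: save its fields for the child rows that follow.
      setVariable("VName", Name, "s");
      setVariable("VRef", Ref_no, "s");
      setVariable("VOem", Oem, "s");
    }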

    You can find a sample here.


    Please let me know if this works for you.
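
To make the carry-forward idea easier to follow outside of Spoon, here is a minimal standalone sketch in plain JavaScript. It is not Kettle code: the `rows` array, the `level2` property (standing in for the `xml_path_level2` field), and the `attachParents` function are all hypothetical names chosen for illustration, and the rows are assumed to arrive in document order, parent first.

```javascript
// Hypothetical flattened rows, one per component, in document order.
// `level2` stands in for xml_path_level2: null for a main_component row,
// non-null for a sub_component row.
const rows = [
  { level2: null, name: "Engine", ref_no: "336820-182", oem: "Ford" },
  { level2: null, name: "Gearbox", ref_no: "378912-009", oem: "GM" },
  { level2: null, name: "Fuel Tank", ref_no: "378927", oem: "GM" },
  { level2: "sub_component", name: "Fuel Pump", ref_no: "27182A", oem: "Lucus" },
  { level2: "sub_component", name: "Contents Unit", ref_no: "1219290", oem: "Honeywell" },
];

// Walk the rows once, remembering the most recent parent row and
// copying its fields onto every child row that follows it.
function attachParents(inputRows) {
  let parent = { name: "", ref_no: "", oem: "" };
  return inputRows.map((row) => {
    if (row.level2 !== null) {
      // Child row: attach the remembered parent fields.
      return { ...row, pName: parent.name, pRef: parent.ref_no, pOem: parent.oem };
    }
    // Parent row: remember its fields; it has no parent of its own.
    parent = { name: row.name, ref_no: row.ref_no, oem: row.oem };
    return { ...row, pName: null, pRef: null, pOem: null };
  });
}

const result = attachParents(rows);
console.log(result);
```

With the sample data, both sub-component rows (Fuel Pump and Contents Unit) end up with `pName` set to "Fuel Tank", while the three main-component rows keep `pName` as `null`. The approach depends on row order, so it only works if nothing between the XML input and the script reorders or parallelises the rows.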