powerquery

Transform repeatable subelements to columns


First of all I'm very new to power query ... I have similar xml as bellow:

<root>
  <entries>
    <entry>
      <otherData>
        <id>1</id>
      </otherData>
      <images>
        <image>
          <url>http://example.com/1.img</url>
        </image>
        <image>
          <url>http://example.com/1.img</url>
        </image>
        <image>
          <url>http://example.com/1.img</url>
        </image>
      </images>
    </entry>
    <entry>
      <otherData>
        <id>2</id>
      </otherData>
      <images>
        <image>
          <url>http://example.com/4.img</url>
        </image>
        <image>
          <url>http://example.com/5.img</url>
        </image>
      </images>
    </entry>
  </entries>
<root>

what I want is to have a table which consists of id and then dynamic number of columns based on /images/image/url. I don't know if it's possible to add columns dynamically and name them somehow (with number suffix img1 ... imgN)


Solution

  • I think you can first load XML into powerquery, and then keep clicking the "Expand Column" icon:

    enter image description here

    Until your table looks like below:

    enter image description here

    Then you will need to open the advance editor and write your own M code to transform the table into your desired format.

    The code I use is this:

    let
      Source = Xml.Tables(File.Contents("/Users/userName/Documents/test.xml")),
      // below code are generated by clicking "expand column" icon
      #"Expanded Table" = Table.ExpandTableColumn(Source, "Table", {"Name", "Table"}, {"Name.1", "Table.1"}),
      #"Expanded Table.1" = Table.ExpandTableColumn(#"Expanded Table", "Table.1", {"otherData", "images"}, {"otherData", "images"}),
      #"Expanded otherData" = Table.ExpandTableColumn(#"Expanded Table.1", "otherData", {"id"}, {"id"}),
      #"Expanded images" = Table.ExpandTableColumn(#"Expanded otherData", "images", {"image"}, {"image"}),
    
      
      // below codes are written by hand
      GetUrls = Table.AddColumn(#"Expanded images", "urls", (row)=> row[image][url]),
      GetHeaders = Table.AddColumn(GetUrls, "headers", (row)=>List.Transform({1..List.Count(row[urls])}, (x)=> "Image" & Number.ToText(x))),
      AllHeaders = List.Distinct(List.Combine(GetHeaders[headers])),
      CreateUrlsByHeader = Table.AddColumn(GetHeaders, "urlsByHeader", (row)=> Record.FromList(row[urls], row[headers])),
      SelectColumns = Table.SelectColumns(CreateUrlsByHeader, {"id", "urlsByHeader"}),
      #"Expanded urlsByHeader" = Table.ExpandRecordColumn(SelectColumns, "urlsByHeader", AllHeaders, AllHeaders)
    in
      #"Expanded urlsByHeader"
    

    Results:

    enter image description here