data-structurespowerbipowerquerydummy-variable

Is there a way to create dummy variables based on two columns in Power Query?


I have a large dataset with customer purchase data. I would like to create dummy variables for each combination of purchase order (1st purchase, 2nd purchase, etc.) and product category.

Sample table:

enter image description here

Note that several categories can be in the same purchase.

Sample output:

enter image description here

Apologies for using images, stack overflow doesn't allow me to paste tables 'cause I'm too new :)

Thank you in advance!

I tried doing all this manually but it is very arduous and purchase order will increase over time.


Solution

  • This is a really bad data structure, but if you want to get that, one way is below

    let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Removed Other Columns" = Table.Distinct(Table.SelectColumns(Source,{"Purchase Order"})),
    #"Added Custom" = Table.AddColumn(#"Removed Other Columns", "Custom", each Table.Distinct(Table.SelectColumns(Source,{"Category"}))),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Category"}, {"Category"}),
    #"Added Custom1" = Source & Table.AddColumn(#"Expanded Custom", "Customer", each "Dummy"),
    #"Sorted Rows1" = Table.Sort(#"Added Custom1",{{"Category", Order.Ascending}, {"Purchase Order", Order.Ascending}}),
    #"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Sorted Rows1", {{"Purchase Order", type text}}, "en-US"),{"Purchase Order", "Category"},Combiner.CombineTextByDelimiter("_", QuoteStyle.None),"Merged"),
    #"Added Custom2" = Table.AddColumn(#"Merged Columns", "Count", each 1),
    #"Pivoted Column" = Table.Pivot(#"Added Custom2", List.Distinct(#"Added Custom2"[Merged]), "Merged", "Count", List.Sum),
    #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([Customer] <> "Dummy"))
    in  #"Filtered Rows"
    

    enter image description here