I have a large dataset with customer purchase data. I would like to create dummy variables for each combination of purchase order (1st purchase, 2nd purchase, etc.) and product category.
Sample table:
Note that several categories can be in the same purchase.
Sample output:
Apologies for using images, stack overflow doesn't allow me to paste tables 'cause I'm too new :)
Thank you in advance!
I tried doing all this manually but it is very arduous and purchase order will increase over time.
This is a really bad data structure, but if you want to get that, one way is below
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Removed Other Columns" = Table.Distinct(Table.SelectColumns(Source,{"Purchase Order"})),
#"Added Custom" = Table.AddColumn(#"Removed Other Columns", "Custom", each Table.Distinct(Table.SelectColumns(Source,{"Category"}))),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Category"}, {"Category"}),
#"Added Custom1" = Source & Table.AddColumn(#"Expanded Custom", "Customer", each "Dummy"),
#"Sorted Rows1" = Table.Sort(#"Added Custom1",{{"Category", Order.Ascending}, {"Purchase Order", Order.Ascending}}),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Sorted Rows1", {{"Purchase Order", type text}}, "en-US"),{"Purchase Order", "Category"},Combiner.CombineTextByDelimiter("_", QuoteStyle.None),"Merged"),
#"Added Custom2" = Table.AddColumn(#"Merged Columns", "Count", each 1),
#"Pivoted Column" = Table.Pivot(#"Added Custom2", List.Distinct(#"Added Custom2"[Merged]), "Merged", "Count", List.Sum),
#"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([Customer] <> "Dummy"))
in #"Filtered Rows"