powerbipowerquerym

How do I properly use table.group in a PowerQuery query to dynamically summarize different rows and columns?


I created a table in Excel 2016 where I put various names of columns, columns I wanted to summarize, names to set new column names to, etc. Essentially what I wanted to be a parameter table.

I then created a function in the Power Query editor and passed the values in the aforementioned table into the function. At first it failed, but I found Power Query column name as parameter which showed me how to pass one of the column names to a parameter.

On first glance the group function seemed to work, I saw the months I wanted to group by, but the value summed up the whole table for each month instead of individually. I started by editing the data in a normal query:

let
Source = #"Final Dataset",
#"Grouped Rows" = Table.Group(Source, {"DateValue"}, {{"xOccurrences", each List.Sum([xOccurrences]), type number}}),
#"Added Custom4" = Table.AddColumn(#"Grouped Rows", "Plant Region", each "Ttl Occ PR"),
#"Added Custom" = Table.AddColumn(#"Added Custom4", "SD Region", each null)
in
#"Added Custom"

The result of the query (as expected):

before the query

I then parameterized it:

(groupByColumn as text, aggregateColumnName, optional plantRegionValue as text, optional sdRegionValue as text) =>
let
Source = #"Final Dataset",
#"Grouped Rows" = Table.Group(Source, {groupByColumn}, {{aggregateColumnName , each List.Sum(Table.Column(Source, aggregateColumnName)), type number}}),
#"Added Custom4" = Table.AddColumn(#"Grouped Rows", "Plant Region", each plantRegionValue),
#"Added Custom" = Table.AddColumn(#"Added Custom4", "SD Region", each sdRegionValue)
in
#"Added Custom"

I executed the function against the parameter table with two dummy rows. Two input rows each multiplied by 12 month rows turned into 24 (original dataset is 12 months) as expected, but the summed values are the total of the dataset (1528):

function results

As you can see the Group function seems to ignore the second parameter I've assigned to the group function. I tried various replacing {groupByColumn} with various concatenations, the Table.Column function as with the column to sum, and statically replacing it with the original column name {"DateValue"}. The column function failed, while all of the other attempts yielded the same results, each month paired with the sum of the whole dataset, 1528.

I thought it must be ignoring the second parameter, so I tried leaving it empty:

#"Grouped Rows" = Table.Group(Source, {}
    , {{aggregateColumnName , each List.Sum(Table.Column(Source, aggregateColumnName)), type number}}),

which as expected yielded 2 rows with values of 1528. So it's only partially ignoring parameter 2:

enter image description here

Does anyone know how to pass variable columns to the group function to dynamically process a parameter table? I'm seeking to dynamically calculate the results from the first table.


Solution

  • I don't understand why you're getting doubled output, but the reason that it's not summing as you expect it to is that when you write each List.Sum(Table.Column(Source, aggregateColumnName)), it's not taking into account the local context but rather summing over the entire Source table.

    To fix this, instead of using each, let's use a function instead like this

    (LocalContext) => List.Sum(Table.Column(LocalContext, aggregateColumnName))
    

    The whole line would be:

    #"Grouped Rows" =
    Table.Group(
        Source,
        {groupByColumn},
        {{
            aggregateColumnName,
            (LocalContext) => List.Sum(Table.Column(LocalContext, aggregateColumnName)),
            type number
        }}
    )
    

    Reading material:

    https://bengribaudo.com/blog/2017/12/08/4270/power-query-m-primer-part3-functions-function-values-passing-returning-defining-inline-recursion

    https://excelguru.ca/each-keyword-power-query/