matlabmaxgroupingsubsetmatlab-table

Return the maximum value of every subgroup in a table


I have a section of table like this:

X Y Value
1 2 6.9
1 3 6.8
1 4 8.1
2 1 7.2
2 3 11.7
2 4 16
3 1 22.6
3 2 20.5
3 3 18.1

For each group of rows having the same X, I need to select only the row having the largest Value. How can I generate a table like this?

X Y Value
1 4 8.1
2 4 16
3 1 22.6

The code I have so far produces just one row:

X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
T = table(X,Y,Value);
[~,maxidx] = max(Value);
T(maxidx,:)
%{
ans =
  1×3 table
    X    Y    Value
    _    _    _____
    3    1    22.6 
%}

Solution

  • If you're using R2015b or newer, you could use splitapply:

    function T2 = q56413455()
    % Define some example inputs:
    X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
    Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
    Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
    T = table(X,Y,Value);
    % Call the "business logic" and assign the output:
    T2 = getMaxRows(T);
    
    function out = getMaxRows(T)
    GROUPING_VAR = 1; % We assume that the 1st column contains the grouping variable
    varnames = T.Properties.VariableNames;
    tmp = splitapply(@maxrow, T, T.(varnames{ GROUPING_VAR }));
    out = array2table(tmp, 'VariableNames', varnames );
    
    function outrow = maxrow(varargin)
    COL_WITH_MAX = 3; % We assume that the 3rd columns is the one that matters for max()
    subArr = cell2mat(varargin);
    [~,I] = max(subArr(:,COL_WITH_MAX));
    outrow = subArr(I,:);
    

    Calling this produces the desired result:

    >> q56413455
    ans =
      3×3 table
        X    Y    Value
        _    _    _____
        1    4     8.1 
        2    4      16 
        3    1    22.6 
    

    Another variation uses the 2nd output of splitapply(@max, ...) which is the index of the maximum within the group. We then need to add the amount of elements in previous groups to this (this is done using diff):

    X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
    Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
    Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
    T = table(X,Y,Value);
    
    % Get the position of the maximum Value in every group
    [~,I] = splitapply(@max, T.Value, T.X); % I == [3; 3; 1]
    
    % Get beginnings of every group
    lastGroupEnd = find([1; diff(X)])-1; % lastGroupEnd == [0; 3; 6]
    
    % Offset the maximum positions by group starts to get row indices in the original table
    T2 = T(I + lastGroupEnd, :);