matlabpivotmatlab-table

Unstacking with 2 conditions


I am using MATLAB 2021b and I have the following data:

 ID = {'a','a','a','a','b','b','b','b'}';
 DATE = [2010,2010,2011,2011,2011,2011,2012,2012]';
 FIELD_ID = {'f1','f2','f1','f2','f1','f2','f1','f2'}';
 VALUE = [1,2,5,6,1,1,7,8]';
 T_before = table(ID,DATE,FIELD_ID,VALUE);

T_before =

8×4 table

 ID      DATE    FIELD_ID    VALUE
_____    ____    ________    _____

{'a'}    2010     {'f1'}       1  
{'a'}    2010     {'f2'}       2  
{'a'}    2011     {'f1'}       5  
{'a'}    2011     {'f2'}       6  
{'b'}    2011     {'f1'}       1  
{'b'}    2011     {'f2'}       1  
{'b'}    2012     {'f1'}       7  
{'b'}    2012     {'f2'}       8 

In reality the table is a lot longer and contains more fields. The latest DATE for a given ID can be different. The column FIELD_ID contains fields and their respective value is in the column VALUE.

What I would like to do is unstack this table in long format to have one row per ID with the fields as colums. I have one condition, I want only to unstack the rows containing the latest value in the field DATE. It should look like the following:

T_after =

2×4 table

 ID      DATE    f1    f2
_____    ____    __    __

{'a'}    2011    5     6 
{'b'}    2012    7     8 

Solution

  • One for loop can do this quickly.

    ID = {'a','a','a','a','b','b','b','b'}';
    DATE = [2010,2010,2011,2011,2011,2011,2012,2012]';
    FIELD_ID = {'f1','f2','f1','f2','f1','f2','f1','f2'}';
    VALUE = [1,2,5,6,1,1,7,8]';
    T_before = table(ID,DATE,FIELD_ID,VALUE);
    
    % get unique IDs
    id_unique = unique(ID);
    
    n = numel(id_unique);
    f1 = cell(1,n);
    f2 = f1;
    dates = f1;
    
    for i = 1:n
        %filter DATE by unique ID, then get the latest date
        filter_id = ismember(ID, id_unique{i});
        date_filtered = DATE(filter_id);
        dates{i} = max(date_filtered);
        
        %filter FIELD_ID and VALUE by unique ID and dates
        filter_date = DATE == dates{i};
        filter = filter_id & filter_date;
        field_id_filtered = FIELD_ID(filter);
        value_filtered = VALUE(filter);
        
        %find F1 and F2
        if strcmp(field_id_filtered{1}, 'f1')
            f1{i} = value_filtered(1);
            f2{i} = value_filtered(2);
        else
            f1{i} = value_filtered(2);
            f2{i} = value_filtered(1);
        end
    end
    
    %make the final table
    T = table(id_unique(:),dates(:),f1(:),f2(:), 'VariableNames',{'ID','DATE','f1','f2'});