matlab matrix vectorization

Find the column indices of repeated values in a matrix


I'm using MATLAB notation. I have a matrix A of size N x 3 where each entry is a positive integer. I want to create a matrix Q of size max(max(A)) x 2. Row i of Q holds the column indices of A at which the value i occurs: if A(k_1,r_1) = i and A(k_2,r_2) = i, then Q(i,:) = [r_1,r_2]. For a value that appears only once in A, both entries of its row in Q are equal.

Is there a way to vectorize this task? I can do it with loops, but that is very slow in MATLAB when A has many rows and max(max(A)) is also large. Here is an example.

A=[  1    10     9
     1     2     4
     3     2    11
     3    12     5
     6    13     4
     6     7    15
     8     7     5
     8    14    16]

In the above example, the expected result would be

Q =
     1     1
     2     2
     1     1
     3     3
     3     3
     1     1
     2     2
     1     1
     3     3
     2     2
     3     3
     2     2
     2     2
     2     2
     3     3
     3     3

This is my slow, loop-based code:

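max_Q = max(A(:));              % largest value in A = number of rows of Q
Q     = zeros(max_Q,2);
for i = 1 : max_Q
    [~,cols] = find(A == i);    % column index of every occurrence of i
    Q(i,:) = cols;              % a single match (scalar) fills both entries
end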

Solution

  • You can use the linear indices returned by unique to get the column number of each distinct integer. unique returns the values in sorted order by default, so its list of indices (converted to column numbers) is already ordered by value and gives one column of the output.

    Given:

    A=[  1    10     9
         1     2     4
         3     2    11
         3    12     5
         6    13     4
         6     7    15
         8     7     5
         8    14    16]
    

    Find the indices of the first occurrence of each integer:

    [Y,I] = unique(A);   % Y is not necessary, only there for verification
    disp([Y,I]);
    
        1    1
        2   10
        3    3
        4   18
        5   20
        6    5
        7   14
        8    7
        9   17
       10    9
       11   19
       12   12
       13   13
       14   16
       15   22
       16   24
    

    (Note that if some integers between 1 and max(A(:)) might not appear in A at all, we could use Y as a row index into the result array.)

    The linear indices can then be converted to (row and) column numbers using ind2sub:

    [~,C] = ind2sub(size(A),I)
    C =
    
       1
       2
       1
       3
       3
       1
       2
       1
       3
       2
       3
       2
       2
       2
       3
       3
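
    As a cross-check on the ind2sub step, the column can also be computed arithmetically: MATLAB stores matrices in column-major order, so linear index I falls in column ceil(I / size(A,1)). A minimal sketch, using a few of the indices from the table above (the names sz and Calt are illustrative, not part of the original answer):

    ```matlab
    sz = [8, 3];                 % size of A in the example
    I  = [1; 10; 18; 24];        % linear indices of the values 1, 2, 4 and 16

    [~, C] = ind2sub(sz, I);     % column numbers via ind2sub
    Calt   = ceil(I / sz(1));    % same columns from column-major arithmetic

    disp([I, C, Calt]);
    ```

    Both approaches should agree; ind2sub is arguably the more readable of the two.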
    

    If both occurrences of an integer are guaranteed to be in the same column, we're done:

    Q = [C, C];
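
    For the case mentioned in the note above, where some integers may be missing from the matrix, a sketch along the following lines could use Y as the row index. The small matrix here is a made-up example (not from the question) in which 2, 4 and 6 do not occur; their rows of Q are simply left as zeros, which is an assumption about how missing values should be reported.

    ```matlab
    A = [1 5
         1 3
         7 3];                       % hypothetical input: 2, 4 and 6 are missing

    [Y, I] = unique(A, 'first');     % distinct values and their first linear index
    [~, C] = ind2sub(size(A), I);    % column of each first occurrence

    Q = zeros(max(A(:)), 2);         % rows for missing values stay [0 0]
    Q(Y, :) = [C, C];                % Y selects the row for each value present
    disp(Q);
    ```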
    

    If occurrences of an integer can be in different columns, we'll just use unique twice, once with "first" and once with "last":

    B = A;
    B(4,[2,3]) = B(4,[3,2])
    B =
    
        1   10    9
        1    2    4
        3    2   11
        3    5   12   <-- 12 and 5 swapped
        6   13    4
        6    7   15
        8    7    5
        8   14   16
    
    [~,I] = unique(B, "first");
    [~,C] = ind2sub(size(B),I);
    Q(:,1) = C;
    [~,I] = unique(B, "last");
    [~,C] = ind2sub(size(B),I);
    Q(:,2) = C;
    disp(Q);
    
       1   1
       2   2
       1   1
       3   3
       2   3   <-- 5's are in different columns
       1   1
       2   2
       1   1
       3   3
       2   2
       3   3
       3   3   <-- 12 is only in 1 column
       2   2
       2   2
       3   3
       3   3
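
    To gain some confidence in the approach, the two-pass version can be compared against the loop from the question on the original A. This is only a sketch under the assumptions used throughout: every integer from 1 to max(A(:)) appears in A, and no integer appears more than twice. The names iFirst, iLast, cFirst, cLast and Qloop are illustrative.

    ```matlab
    A = [1 10  9;  1  2  4;  3  2 11;  3 12  5;
         6 13  4;  6  7 15;  8  7  5;  8 14 16];

    % Vectorized: column of the first and of the last occurrence of each value.
    [~, iFirst] = unique(A, 'first');
    [~, iLast]  = unique(A, 'last');
    [~, cFirst] = ind2sub(size(A), iFirst);
    [~, cLast]  = ind2sub(size(A), iLast);
    Q = [cFirst, cLast];

    % Reference: the loop from the question.
    Qloop = zeros(max(A(:)), 2);
    for i = 1:max(A(:))
        [~, cols] = find(A == i);
        Qloop(i, :) = cols;
    end

    disp(isequal(Q, Qloop));
    ```

    Because find and unique both work in linear (column-major) index order, the first occurrence always has the smaller column number, so the two results should line up entry by entry.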