matlab long-format-data

Fast way to convert a matrix to long format in MATLAB


How can I convert a matrix to long format in MATLAB?

In R, this can be done with functions such as melt from the reshape2 package, but in MATLAB the only way I could find was to build a new matrix with for loops.

function dt_long = wide_to_long(wide_array, save, save_path, precision_fun)
    % Fill in defaults so the function also works with the matrix alone.
    if nargin < 2, save = false; end
    if nargin < 3, save_path = "~/output_wide_to_long.csv"; end
    if nargin < 4, precision_fun = @(x) single(x); end
    disp("Wide to long is started ... for " + save_path)
    tic
    n_dims = ndims(wide_array);
    % One row per element: n_dims index columns plus one value column.
    dt_long = precision_fun(zeros(numel(wide_array), n_dims+1));
    if n_dims == 2
        n_rows = size(wide_array,1);
        n_cols = size(wide_array,2);
        all_combinations = combvec(1:n_rows, 1:n_cols);
        % Slow part: one loop iteration per matrix element.
        parfor (i_comd = 1:size(all_combinations, 2), 3)
            comb = all_combinations(:,i_comd);
            dt_long(i_comd, :) = [comb', wide_array(comb(1), comb(2))];
        end
    end
    toc
    if save == true
        disp("Saving to " + save_path)
        writematrix(dt_long, save_path, 'Delimiter',',');
    end
end

However, looping over every element in MATLAB is too slow. Saving a ~7 GB matrix this way takes at least 40 minutes.

Is there a faster way to convert a matrix to long format in MATLAB?


An example: If we have the wide_array = [10,20;30,40], then the long format of this array would be

long_array = [1,1,10; 1,2,20; 2,1,30; 2,2,40]

Here the first two columns give the row and column index of each value in wide_array, and the third column contains the value itself.
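To make the target layout concrete, here is a minimal sketch for the 2-by-2 example, using the built-in ndgrid and sortrows. The variable names row_idx, col_idx and long_colmajor are only illustrative. Because MATLAB stores arrays in column-major order, stacking the columns directly lists the rows in a different order than the one shown above, so the sketch sorts them afterwards.

```matlab
wide_array = [10, 20; 30, 40];

% ndgrid returns the row index and the column index of every element.
[row_idx, col_idx] = ndgrid(1:size(wide_array, 1), 1:size(wide_array, 2));

% Stack indices and values column by column (column-major order).
long_colmajor = [row_idx(:), col_idx(:), wide_array(:)];

% Sort by row index, then by column index, to get the order listed above.
long_array = sortrows(long_colmajor, [1 2]);
```

If the row order does not matter, the sortrows step can be dropped.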


Solution

  • I have come up with the following function:

    function dt_long = wide_to_long(varargin)
        % WIDE_TO_LONG(wide_array, save, save_path, precision_fun)
        % Returns one row per element: the element's subscripts, then its value.
        if nargin == 0
            disp("No input matrix is provided, dude!")
            dt_long = [];  % assign the output so the early return is valid
            return
        end
        % Optional arguments override these defaults in order.
        defaults = {false, "~/output_wide_to_long.csv", @(x) single(x)};
        defaults(1:(nargin-1)) = varargin(2:end);
        wide_array = varargin{1};
        save = defaults{1};
        save_path = defaults{2};
        precision_fun = defaults{3};

        disp("Wide to long is started.")
        if save
            disp("Save path " + save_path)
        end

        tic
        dimensions = size(wide_array);
        n_dims = numel(dimensions);
        % One index vector per dimension, e.g. {1:4, 1:3, 1:2}.
        indices = cell(1, n_dims);
        for i = 1:n_dims
            indices{i} = 1:dimensions(i);
        end
        % combvec varies the first index fastest, which matches the
        % column-major order of reshape(wide_array, [], 1).
        all_combinations = combvec(indices{:});
        dt_long = precision_fun([all_combinations', reshape(wide_array, [], 1)]);
        toc
        if save
            disp("Saving to " + save_path)
            writematrix(dt_long, save_path, 'Delimiter', ',');
        end
    end
    

    An example use case would be

    wide_matrix = rand(4,3,2);
    wide_to_long(wide_matrix, true, "~/test.csv", @(x) single(x));
    

    In terms of its performance, it takes ~7 seconds to convert and save a matrix with 100 million elements to a CSV file.
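combvec ships with the Deep Learning Toolbox, so the function above may not run on every installation. As a hedged alternative, the same index columns can be built with the base MATLAB function ind2sub. This is a minimal sketch rather than a drop-in replacement: the variable names dims and subs are illustrative, it skips the saving and precision options, and I have not benchmarked it against the combvec version.

```matlab
wide_array = rand(4, 3, 2);

dims = size(wide_array);
subs = cell(1, numel(dims));

% With one output per dimension, ind2sub converts every linear index
% into its full set of subscripts (returned as column vectors here).
[subs{:}] = ind2sub(dims, (1:numel(wide_array))');

% One row per element: subscripts first, value last.
% wide_array(:) is in column-major order, the same order as the linear indices.
dt_long = [subs{:}, wide_array(:)];
```

For the 4-by-3-by-2 input this gives a 24-by-4 matrix whose first row starts with the subscripts 1, 1, 1 and whose last row starts with 4, 3, 2.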