matlabmissing-datacell-arrayread-csv

<Missing> Values in Cell Arrays Created from Matlab readcell()


I am uploading data from an Excel file which was provided in a specific format, for which a minimum reproducible example is shown below:

Excel file with four columns of different lengths

I am trying to save each column into an array, using the following Matlab R2020b code:

params_filename = uigetfile({'*.xlsx','*.csv'});
param_config = readcell(params_filename);
greek_title = param_config(:,1);
japanese_title = param_config(:,3);
greek_value = param_config(:,2);
japanese_value = param_config(:,4);

However, because the Greek letter columns are longer than the Japanese letter columns, the output Japanese letter cell arrays have <missing> values at the end, as shown below:

cell array with three {1x1 missing} values

I can't use Range to read the correct cells, since the spreadsheet is provided by the user, and the column lengths are unpredictable. I would like to remove the <missing> values, or better yet, quit reading a column as soon as a missing value is detected. How do I eliminate the <missing> elements?


Solution

  • As you can see in the documentation to readcell, empty cells are filled with missing. This is a special value, the generic equivalent to NaN in numeric arrays.

    You can test for missing values using the function ismissing or anymissing. The first one returns a boolean for each element in the input array, the second one returns true if any element in the array is missing.

    A quirk here with cell arrays is that the missing elements are 1x1 arrays of missing values, and those are not recognized as such by ismissing for some reason. So you need to use cellfun to apply the function to each element in the cell array:

    index = cellfun(@anymissing, japanese_title);
    japanese_title(index) = [];  % Removes the missing elements
    

    For older versions of MATLAB without anymissing, use @(x)any(ismissing(x)).