load() - ignoring a given string


I am trying to use the load() function in MATLAB to read in data from a text file. However, every line of the text file ends with '...'. The data file is not produced by MATLAB, so I have no control over the source of the ellipses.

The data file I'm loading in looks something like this:

11191425        NaN     NaN     0.0 ...
11191426        NaN     NaN     0.0 ...
11191427        NaN     NaN     0.0 ...
11191428        NaN     NaN     0.0 ...
11191429     2280.5  1910.1   455.0 ...
11191430     2280.5  1910.1   455.0 ...
11191431     2298.0  1891.1   454.0 ...
11191432     2317.3  1853.7   453.0 ...
11191433     2335.6  1811.1   458.0 ...
11191434     2350.6  1769.8   466.0 ...
11191435     2365.3  1729.7   475.0 ...
11191436     2379.5  1691.2   485.0 ...
11191437     2378.3  1647.6   492.0 ...
11191438     2375.4  1621.3   499.0 ...
11191439     2372.7  1598.5   499.0 ...
11191440     2372.7  1598.5   499.0 ...
11191441        NaN     NaN     0.0 ...
11191442      294.9  1283.5  1163.0 ...
11191443      294.9  1283.5  1163.0 ...

Its actual length is in excess of 100,000 rows, but you get the idea. Using the load() command throws an error because of the '...'s at the end of each line. All I'm looking for is to read in those first four columns.

What would be the most efficient way of loading the data in, whilst completely omitting the rogue column of ellipses at the end? A method that doesn't involve making the system parse the whole text file twice would be preferable, though not necessary.


Solution

  • This is easy if you use textscan instead of load. Read the last column as a string column and then discard it.

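    fid = fopen('data.txt');
    % Read the first column as %f (double), not %d (int32): concatenating
    % int32 with double casts everything to int32 (NaN -> 0, 2280.5 -> 2281).
    data = textscan(fid,'%f %f %f %f %s');
    fclose(fid);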
    

    textscan returns a cell array with one cell per column. You can then combine the columns you want to keep into a single matrix by concatenating them.

    data = [data{1:4}];
    

    The fifth column returned by textscan contains only the '...' strings, so it is safe to drop.
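
As a variation on the approach above, textscan can also discard the trailing field while reading, so the '...' strings are never stored. The sketch below is a minimal, self-contained example: it writes three rows copied from the question to a temporary file (a stand-in for the real data file), then reads them back. The variable names tmpFile and cols are illustrative only. It relies on two documented textscan features: the %*s specifier, which reads a field and skips it, and the 'CollectOutput' option, which merges consecutive columns of the same class into one array.

```matlab
% Write a few sample rows to a temporary file (stand-in for the real data file).
tmpFile = [tempname '.txt'];
fid = fopen(tmpFile, 'w');
fprintf(fid, '11191428        NaN     NaN     0.0 ...\n');
fprintf(fid, '11191429     2280.5  1910.1   455.0 ...\n');
fprintf(fid, '11191430     2280.5  1910.1   455.0 ...\n');
fclose(fid);

% Read four numeric columns as double; %*s reads the trailing '...' field
% and discards it, so it never appears in the output.
fid = fopen(tmpFile, 'r');
cols = textscan(fid, '%f %f %f %f %*s', 'CollectOutput', true);
fclose(fid);
delete(tmpFile);

% With CollectOutput, the four double columns arrive as one N-by-4 matrix.
data = cols{1};
```

This reads the file in a single pass, and because every kept column is read as %f, the result is a double matrix in which the NaN entries and fractional values are preserved.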