matlab file-io line-count

Is there a way in MATLAB to determine the number of lines in a file without looping through each line?


Obviously, one could loop through a file using fgetl or a similar function and increment a counter, but is there a way to determine the number of lines in a file without such a loop?


Solution

  • I like to use the following code for exactly this task:

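    fid = fopen('someTextFile.txt', 'rb');
    if fid == -1, error('Could not open file.'); end
    %# Get file size.
    fseek(fid, 0, 'eof');
    fileSize = ftell(fid);
    frewind(fid);
    %# Read the whole file as raw bytes.
    data = fread(fid, fileSize, 'uint8');
    fclose(fid);
    %# Count line feeds (byte value 10).
    numLines = sum(data == 10);
    %# Count a final line that lacks a terminating line feed.
    if ~isempty(data) && data(end) ~= 10
        numLines = numLines + 1;
    end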
    

    It is pretty fast, provided you have enough memory to read the whole file at once. It should work for both Windows-style (CR LF) and Linux-style (LF) line endings, since both contain a line feed (byte value 10).
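
    If the file is too large to read in one go, the same idea can be applied chunk by chunk. The following is only a sketch: the function name countLinesChunked and the default chunk size of 1 MiB are assumptions for illustration, not part of the measured code. It counts a final line without a terminating line feed as a line, which is one possible definition of "number of lines".

    ```matlab
    %# Hypothetical helper: count lines by reading fixed-size chunks of bytes.
    function numLines = countLinesChunked(fileName, chunkSize)
        if nargin < 2
            chunkSize = 2^20;   %# 1 MiB per read; an arbitrary choice
        end
        fid = fopen(fileName, 'rb');
        if fid == -1
            error('Could not open %s.', fileName);
        end
        numLines = 0;
        lastByte = 10;          %# so that an empty file yields zero lines
        while true
            chunk = fread(fid, chunkSize, 'uint8=>uint8');
            if isempty(chunk)
                break;          %# end of file reached
            end
            %# Count line feeds (byte value 10) in this chunk.
            numLines = numLines + sum(chunk == 10);
            lastByte = chunk(end);
        end
        fclose(fid);
        %# Count a final line that has no terminating line feed.
        if lastByte ~= 10
            numLines = numLines + 1;
        end
    end
    ```

    Memory use is then bounded by the chunk size rather than the file size; whether it is as fast as a single read would have to be measured.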

    Edit: I measured the performance of the answers provided so far. Here are the results for determining the number of lines of a text file containing 1 million double values (one value per line). Each time is the average of 10 runs.

     Author           Mean time +- standard deviation (s)
    ------------------------------------------------------
     Rody Oldenhuis      0.3189 +- 0.0314
     Edric (2)           0.3282 +- 0.0248
     Mehrwolf            0.4075 +- 0.0178
     Jonas               1.0813 +- 0.0665
     Edric (1)          26.8825 +- 0.6790
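
    A measurement of this kind could be set up roughly as follows. This is a sketch under assumptions, not the script that produced the numbers above: the name timeLineCounter is made up, and it assumes each approach is wrapped in a function that takes a file name.

    ```matlab
    %# Hypothetical helper: time a line-counting function over several runs.
    function [meanTime, stdTime] = timeLineCounter(counterFcn, fileName, numRuns)
        if nargin < 3
            numRuns = 10;       %# same number of runs as in the table
        end
        times = zeros(numRuns, 1);
        for k = 1:numRuns
            tStart = tic;
            counterFcn(fileName);       %# result is discarded; only time matters
            times(k) = toc(tStart);
        end
        meanTime = mean(times);
        stdTime = std(times);
    end
    ```

    A call might look like [m, s] = timeLineCounter(@myLineCounter, 'someTextFile.txt'), where myLineCounter stands for whichever approach is being timed. Note that the first run can be slower than later ones if the operating system caches the file.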
    

    So the fastest approaches are the one using Perl and those that read the whole file as binary data. I would not be surprised if Perl internally also reads large blocks of the file at once instead of looping through it line by line (just a guess; I do not know anything about Perl).

    A simple fgetl() loop is roughly 25 to 85 times slower than the other approaches.
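
    For reference, such a loop might look like the sketch below. The function name countLinesFgetl is an assumption for illustration, and this is not necessarily the exact code that was timed.

    ```matlab
    %# Hypothetical helper: count lines with a plain fgetl() loop.
    function numLines = countLinesFgetl(fileName)
        fid = fopen(fileName, 'r');
        if fid == -1
            error('Could not open %s.', fileName);
        end
        numLines = 0;
        %# fgetl returns a char array for each line (possibly empty)
        %# and the numeric value -1 once the end of the file is reached.
        while ischar(fgetl(fid))
            numLines = numLines + 1;
        end
        fclose(fid);
    end
    ```

    Each iteration makes a separate call into the file I/O layer and allocates a new char array, which is a plausible reason for the large gap.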

    Edit 2: Included Edric's second approach, which is much faster than the first and on par with the Perl solution, I'd say.