matlab textscan

MATLAB textscan returns the wrong number of lines


I have a file named inputR_revised.tsv at https://www.dropbox.com/s/vtby4027rvprhga/inputR_revised.tsv?dl=0
In MATLAB, I typed:

fid=fopen('BMC3C/example/inputR_revised.tsv','r')
covTable = textscan(fid,['%s',repmat('%.8n',[1,20])],'HeaderLines',1);

I get covTable{1,1} of size 41699 by 1. However, when I type the following at the terminal:

wc -l inputR_revised.tsv

I get 41677. Why do the two counts differ? I used sed and cut to modify the original file to get inputR_revised.tsv. Is this the reason?

Is there a way to fix this?


Solution

  • %.8n reads at most 8 digits after the decimal point, which is not enough if the file contains values printed with more than 8 decimals. In those cases the digits after the 8th decimal can be treated as a separate entry, so textscan returns more values than the file has fields and the columns come out longer than the number of lines. Use a larger number of decimals in the scan format. For example,

    fid = fopen('BMC3C/example/inputR_revised.tsv','r');
    covTable = textscan(fid,['%s',repmat('%.18n',[1,20])],'HeaderLines',1);
    fclose(fid);
    

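    This should give you the correct number of rows. Note that wc -l also counts the header line, which 'HeaderLines',1 skips, so compare the row count against the line count minus one.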
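
To check whether over-long decimals are really the cause before changing the format, you could inspect the file from the terminal. The following is only a sketch: it assumes a tab-separated file with one header line and values written in plain decimal notation (not scientific notation), and it builds a small hypothetical demo file in place of inputR_revised.tsv so that it runs on its own.

```shell
#!/bin/sh
# Hypothetical demo file standing in for inputR_revised.tsv
# (assumed layout: tab-separated, one header line, first column is text).
demo=$(mktemp)
printf 'id\tc1\tc2\n'          >  "$demo"
printf 'a\t0.12345678\t1.5\n'  >> "$demo"
printf 'b\t0.1234567891\t2\n'  >> "$demo"

# Number of rows textscan should return: all lines except the header.
data_rows=$(awk 'NR > 1' "$demo" | wc -l | tr -d ' ')

# Largest number of characters after the decimal point in any numeric column.
# This equals the digit count only for plain decimal notation; a value such
# as 1.5e-05 would be over-counted.
max_decimals=$(awk -F '\t' 'NR > 1 {
    for (i = 2; i <= NF; i++) {
        n = index($i, ".")
        if (n > 0) { d = length($i) - n; if (d > max) max = d }
    }
} END { print max + 0 }' "$demo")

echo "data rows: $data_rows, max decimals: $max_decimals"
rm -f "$demo"
```

If the reported maximum is larger than the precision in the format (8 in the question), the extra digits are a likely explanation for the extra rows. To run this against the real file, point `demo` at inputR_revised.tsv and drop the `printf` and `rm` lines.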