I am trying to read a text file with data according to a specific format. I am using textscan together with a string containing the format to read the whole data set in one line of code. I've found how to read a whole line with fgetl, but I would like to use as few lines of code as possible, so I want to avoid writing my own for loops. textscan seems great for that.
As an example I'll include a part of my code which reads five strings representing a modified dataset, its heritage (name of old dataset), the date and time of the modification and lastly any comment.
fileID = fopen(filePath,'r+');
readContentFormat = '%s = %s | %s %s | %s';
content = textscan(fileID, readContentFormat, 'CollectOutput', 1);
This works for the time being if the comment doesn't have any delimiters (like a white space) in it. However, I would like to be able to write comments at the end of the line.
Is there a way to use textscan and let it know that I want to read the rest of a line as one string/character array (including any white spaces)? I am hoping for something to put in my variable readContentFormat, instead of that last %s. Or is there another method which does not involve looping through each row in the file?
Also, even though my data is very limited I am keen to know any pros or cons with different methods regarding computational efficiency or stability. If you know something you think is worth sharing, please do so.
One way that is satisfactory to me (but please share any other methods anyway!) is to set the delimiters to characters other than white space, and trim away any leading or trailing white spaces with strtrim. This seemed to work well, but I have no idea how demanding the computations are.
The text file 'testFile.txt' in the current folder has the following lines
File |Heritage |Date and time |Comment
file1.mat | oldFile1.mat | 2018-03-01 14:26:00 | -
file2.mat | oldFile2.mat | 2018-03-01 13:26:00 | -
file3.mat | oldFile3.mat | 2018-03-01 12:26:00 | Time for lunch!
The following code will read the data and put it into a cell array without leading or trailing white spaces, with few lines of code. Neat!
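function contentArray = myfun()
    % Open the file for reading; the file name must be a character vector.
    fileID = fopen('testFile.txt','r');
    % Split each line on '|' only, so white space inside a field is kept.
    content = textscan(fileID, '%s%s%s%s','Delimiter', {'|'},'CollectOutput', 1);
    fclose(fileID);
    % Skip the header row (row 1) and trim leading/trailing white space.
    contentArray = strtrim(content{1}(2:4,:));
end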
The output:
tmpArr =
3×4 cell array
'file1.mat' 'oldFile1.mat' '2018-03-01 14:26:00' '-'
'file2.mat' 'oldFile2.mat' '2018-03-01 13:26:00' '-'
'file3.mat' 'oldFile3.mat' '2018-03-01 12:26:00' 'Time for lunch!'
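For comparison, here is a minimal sketch of the format-string route I originally asked about. As far as I can tell, textscan accepts the conversion %[^\n], which reads every character up to the next newline, so it could stand in for the last %s. The sample lines below are made up to match my original 'name = heritage | date time | comment' layout, and I pass them to textscan as a character vector instead of a file so the snippet runs on its own. For files with Windows line endings, %[^\n\r] is probably the safer choice.

```matlab
% Hypothetical sample data in the layout 'name = heritage | date time | comment'.
% sprintf repeats the '%s\n' format for each argument, giving one line per entry.
sampleText = sprintf('%s\n', ...
    'file1.mat = oldFile1.mat | 2018-03-01 14:26:00 | -', ...
    'file3.mat = oldFile3.mat | 2018-03-01 12:26:00 | Time for lunch!');

% Same format as before, but the last field uses %[^\n] to take the
% rest of the line (white space included) instead of a single word.
readContentFormat = '%s = %s | %s %s | %[^\n]';
content = textscan(sampleText, readContentFormat);

% Without 'CollectOutput', each field ends up in its own cell column.
names    = content{1};
comments = strtrim(content{5});   % trim in case any padding is kept
```

Compared with the delimiter approach, this keeps the date and the time as separate fields, but it depends on the literal '=' and '|' characters appearing exactly where the format expects them.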