matlab variables mat-file

Fast way to check if a variable is in a .mat file without loading the file? 'who'/'whos' is not faster than loading. Better options than 'who'?


I have a .mat file named "myfile.mat" that contains a huge variable data and, in some cases, another variable data_info. What is the fastest way to check whether that .mat file contains the `data_info` variable?

The who and whos commands are not faster than simply loading the file and testing for the existence of the variable.

nRuns=10;
%simply loading the complete file
tic
for p=1:nRuns
    load('myfile.mat');
    % do something with variable
    if exist('data_info','var')
        %do something
    end
end
toc

% check with who
tic
for p=1:nRuns
   variables=who('-file','myfile.mat');
   if ismember('data_info', variables)
       % do something
   end
end
toc

% check with whos
tic
for p=1:nRuns
   info=whos('-file','myfile.mat');
   if ismember('data_info', {info.name})
       %do something
   end
end
toc

All methods take roughly the same time, which is way too slow since data is huge.

However, this is very fast:

tic
for p=1:nRuns
    load('myfile.mat','data_info');
    if exist('data_info', 'var')
        %do something
    end
end
toc

But it issues a warning if data_info does not exist. I could suppress the warning, but that doesn't seem like the best way to do this. What other options are there?
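For reference, suppressing only that one warning around the partial load could look like the sketch below. The helper name hasVariableQuiet is hypothetical, and the warning identifier is an assumption: it is the one load normally reports for a missing variable, but it is worth confirming on your release by calling [msg, id] = lastwarn right after the warning appears.

```matlab
function tf = hasVariableQuiet(matFile, varName)
% Hypothetical helper (not a MATLAB built-in): partial load with the
% "variable not found" warning silenced only for the duration of the call.
    warnId = 'MATLAB:load:variableNotFound';    % assumed identifier; verify with lastwarn
    oldState = warning('off', warnId);          % returns the previous warning state
    restore = onCleanup(@() warning(oldState)); % restore the state even if load errors
    s = load(matFile, varName);                 % load into a struct, not the workspace
    tf = isfield(s, varName);                   % true only if the variable was found
end
```

Loading into a struct instead of the workspace avoids picking up a stale data_info left over from an earlier iteration, which exist('data_info', 'var') would not notice.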

Edit: Using who('-file', 'myfile.mat', 'data_info') is also not faster:

tic
for p=1:nRuns
    if ~isempty(who('-file', 'myfile.mat', 'data_info'))
      % do something
    end
end
toc    % this takes 7 seconds, roughly the same as simply loading the complete .mat file

Solution

  • Try using who, restricted to only the specific variable:

    ...
    if ~isempty(who('-file', 'myfile.mat', 'data_info'))
      %do something
    end
    

    Timing the solutions:

    Using timeit on the different solutions (code included below, running on Windows 7 and MATLAB version R2016b) shows that the who-based ones appear fastest, with the one I suggested above having a slight edge in speed. Here's the timing, from slowest to fastest:

    Load whole file:        0.368235871921381 sec
    Using matfile:          0.001973860748417 sec
    Load only `data_info`:  0.000316989486384 sec
    Using whos + ismember:  0.000174207817967 sec
    Using who + ismember:   0.000151289605527 sec
    Using who + isempty:    0.000137261391331 sec
    

    I used a sample MAT file containing the following variables:

    data = ones(10000);
    data_info = 'hello';
    

    Here's the test code:

    function T = infotest
    
      T = zeros(6, 1);
      T(1) = timeit(@use_load_exist_1);
      T(2) = timeit(@use_load_exist_2);
      T(3) = timeit(@use_matfile);
      T(4) = timeit(@use_whos_ismember);
      T(5) = timeit(@use_who_ismember);
      T(6) = timeit(@use_who_isempty);
    
    end
    
    function isThere = use_load_exist_1
      load('infotest.mat');
      isThere = exist('data_info', 'var');
    end
    
    function isThere = use_load_exist_2
      load('infotest.mat', 'data_info');
      isThere = exist('data_info', 'var');
    end
    
    function isThere = use_matfile
      isThere = isprop(matfile('infotest.mat'), 'data_info');
    end
    
    function isThere = use_whos_ismember
      info = whos('-file', 'infotest.mat');
      isThere = ismember('data_info', {info.name});
    end
    
    function isThere = use_who_ismember
      variables = who('-file', 'infotest.mat');
      isThere = ismember('data_info', variables);
    end
    
    function isThere = use_who_isempty
      isThere = ~isempty(who('-file', 'infotest.mat', 'data_info'));
    end
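
The who-based check could also be wrapped in a small reusable function. This is only a sketch: hasMatVariable is a hypothetical name, not something MATLAB provides. Because who accepts wildcard patterns in the variable name, the sketch compares the returned names against the requested one so that only an exact match counts.

```matlab
function tf = hasMatVariable(matFile, varName)
% Hypothetical helper (name is illustrative, not a MATLAB built-in).
% Returns true if matFile contains a variable named exactly varName.
    names = who('-file', matFile, varName);  % cell array of matching variable names
    tf = any(strcmp(names, varName));        % false when nothing matched
end
```

With the file from the question, the call would be hasMatVariable('myfile.mat', 'data_info').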