
How much overhead does the OOP paradigm create in MATLAB?


In MATLAB, every class method looks like an ordinary function whose first argument is the object itself.

Such a paradigm is fine in other OOP languages such as Python, because objects there are passed by reference. MATLAB, on the other hand, passes objects by value by default (except for handle classes).

From this I infer that calling even the simplest setter (or any other class method) will cause the entire object to be copied.

For instance, here is a minimal class with a property setter and an ordinary method:

classdef foo
  properties
    myParam
  end
  methods
    function obj = set.myParam(obj, value)
      obj.myParam = value;
    end
    function myfun(obj, value)  % body omitted
    end
  end
end

In this case, will MATLAB copy the entire object fooObj = foo() when I call fooObj.myfun(5) (or simply myfun(fooObj, 5))?

Isn't this an incredibly big overhead? Copying the entire object for every method call (and setter) seems incredibly inefficient to me.

Am I missing something? Is there any way to avoid this in MATLAB while still using OOP techniques?

Do I have to use handle classes to avoid this performance overhead?


Solution

  • If you want your class to have reference semantics then yes, you need to use a handle class rather than a value class.

    Note, however, that although MATLAB passes arguments by value by default, it also uses lazy copying (copy-on-write), so an input argument is only copied if the function modifies it. In addition, if the input argument is a structure or an object, only the parts (fields, properties) that are modified get copied.

    MATLAB also has an in-place optimization: if the output argument is the same variable as an input argument, and the operations on it can be done in-place, then no copy needs to be made.

    So, for example, consider this function:

    function y = timestwo(x)
    y = 2*x;
    

    If you start with a variable a in the base workspace (let's say it's a very large array of doubles) and call b = timestwo(a), a copy is not made of a, as x is not modified during the function. Memory usage only increases when assigning the output argument y.

    But consider this function:

    function y = timestwoconj(x)
    x = x';
    y = 2*x;
    

    Now memory usage increases during the function's execution, because a copy of x is made when it is modified. A further block of the same size is allocated when y is calculated, and when the function exits the temporary copy of x is cleared.

    This illustrates the copy on write behaviour.
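    If watching Task Manager is inconvenient, the same effect can be observed through timing. The following is only a rough sketch: readFirst and overwriteFirst are hypothetical helper names, the array size is arbitrary, and local functions in a script file assume R2016b or later. On most machines the modifying call should take noticeably longer, because it has to copy the whole array before writing to it.

    ```matlab
    % Sketch only: readFirst and overwriteFirst are hypothetical helpers.
    a = rand(4000);                          % about 128 MB of doubles

    tRead  = timeit(@() readFirst(a));       % x is only read, so no copy is needed
    tWrite = timeit(@() overwriteFirst(a));  % x is written, so it is copied first

    fprintf('read-only: %.2g s, modifying: %.2g s\n', tRead, tWrite);

    function v = readFirst(x)
    v = x(1);      % reads x without changing it
    end

    function v = overwriteFirst(x)
    x(1) = 0;      % the first write to x triggers the copy
    v = x(1);
    end
    ```

    The exact numbers depend on the machine and MATLAB release, so treat them as an indication rather than a benchmark.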

    Consider also the following function:

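    function x = timestwoinplace(x)
    x = 2*x;
    

    Here the output argument is the same as the input argument, and all the operations can be done in-place. If you call a = timestwoinplace(a), no copy is made at all and memory usage does not increase. This illustrates the optimized in-place behaviour. (Note that MATLAB only applies this optimization when the call is itself made from inside a function, not from the command line or a script.)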

    Try implementing some functions similar to the above, applying them to a big array, then stepping through them in the debugger line by line while watching the memory usage in Task Manager - you'll get the idea.

    When implementing value classes in MATLAB, you would usually declare methods as function obj = myfun(obj, value) rather than function myfun(obj, value), so that the modified object can be returned to the caller. Methods follow the same rules as described above, so your object will only be copied if the method modifies it.
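    As a minimal sketch of that method syntax, here is a hypothetical value class (the name Point2D, its properties and the shift method are made up for illustration and would live in a file called Point2D.m):

    ```matlab
    % File: Point2D.m (hypothetical value class, for illustration only)
    classdef Point2D
        properties
            x = 0
            y = 0
        end
        methods
            function obj = shift(obj, dx)
                % Modifies the local copy and returns it;
                % the caller must reassign the result to keep the change.
                obj.x = obj.x + dx;
            end
        end
    end
    ```

    With this class, p = Point2D; p = p.shift(3); leaves p.x equal to 3, whereas calling p.shift(3); without reassigning leaves p unchanged.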

    When you're working with value classes, that's what you want to happen - if you want reference semantics, use a handle class.
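    For comparison, a handle class version might look like the sketch below. Counter and increment are hypothetical names chosen for illustration, and the class would live in a file called Counter.m. Because the class derives from handle, the method needs no output argument and changes the caller's object directly.

    ```matlab
    % File: Counter.m (hypothetical handle class, for illustration only)
    classdef Counter < handle
        properties
            count = 0
        end
        methods
            function increment(obj)
                % obj refers to the caller's object, so no output is needed.
                obj.count = obj.count + 1;
            end
        end
    end
    ```

    Here c = Counter; d = c; d.increment(); leaves c.count equal to 1 as well, since c and d refer to the same underlying object.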

    Hope that helps!