C++ does not have native support for lazy evaluation (as Haskell does).
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
I'm wondering if it's possible to implement it in a more generic fashion, for example by using a parametrized class lazy<T> that essentially works for T the way matrix_add works for matrix.
Any operation on T would return lazy<T> instead. The only problem is how to store the arguments and the operation code inside lazy<T> itself. Can anyone see how to improve this?
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
Yes, this is possible and quite often done, e.g. for matrix calculations. The main mechanism to facilitate this is operator overloading. Consider the case of matrix addition. The signature of the function would usually look something like this:
matrix operator +(matrix const& a, matrix const& b);
Now, to make this function lazy, it's enough to return a proxy instead of the actual result:
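// Forward declaration; matrix_add must be fully defined (see below)
// before the body of this operator is compiled.
struct matrix_add;
matrix_add operator +(matrix const& a, matrix const& b) {
    return matrix_add(a, b);
}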
Now all that needs to be done is to write this proxy:
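struct matrix_add {
    matrix_add(matrix const& a, matrix const& b) : a(a), b(b) { }

    // Implicit conversion: the addition is performed only here.
    operator matrix() const {
        matrix result;
        // Do the addition.
        return result;
    }
private:
    // Declared separately: `matrix const& a, b;` would make b a copy,
    // not a reference.
    matrix const& a;
    matrix const& b;
};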
The magic lies in the method operator matrix(), which is an implicit conversion operator from matrix_add to plain matrix. This way, you can chain multiple operations (by providing appropriate overloads, of course). The evaluation takes place only when the final result is assigned to a matrix instance.
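To make the deferral visible, here is a minimal, self-contained sketch of the proxy above. The matrix type (a flat std::vector<int> with rows and cols) and the full_additions counter are assumptions made purely for illustration; they are not part of the code in this answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix type, assumed only for this sketch.
struct matrix {
    std::size_t rows, cols;
    std::vector<int> data;

    matrix(std::size_t r, std::size_t c, int fill = 0)
        : rows(r), cols(c), data(r * c, fill) {}

    int operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Illustration only: counts how many whole-matrix additions were performed.
static int full_additions = 0;

struct matrix_add {
    matrix_add(matrix const& a, matrix const& b) : a(a), b(b) {}

    // Implicit conversion: the whole sum is computed only here.
    operator matrix() const {
        assert(a.rows == b.rows && a.cols == b.cols);
        ++full_additions;
        matrix result(a.rows, a.cols);
        for (std::size_t i = 0; i < result.data.size(); ++i)
            result.data[i] = a.data[i] + b.data[i];
        return result;
    }

private:
    // References only: the operands must outlive the proxy.
    matrix const& a;
    matrix const& b;
};

// Returns the proxy; no element is added at this point.
matrix_add operator+(matrix const& a, matrix const& b) {
    return matrix_add(a, b);
}
```

With this in place, `matrix_add pending = A + B;` does no arithmetic at all, and the loop runs only once `pending` is converted, for example by `matrix C = pending;`. Because the proxy stores references, it should not be kept around longer than its operands.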
EDIT: I should have been more explicit. As it stands, the code makes little sense because, although evaluation happens lazily, it still happens within the same expression. In particular, another addition will evaluate this code unless the matrix_add structure is changed to allow chained addition. C++0x (since published as C++11) greatly facilitates this by allowing variadic templates (i.e. template parameter lists of variable length).
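One possible way to allow chained addition is sketched below. Note that it uses an ordinary two-parameter class template (the technique usually called expression templates) rather than variadic templates. The names sum_expr and full_evaluations, as well as the minimal matrix type, are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix type, assumed only for this sketch.
struct matrix {
    std::size_t rows, cols;
    std::vector<int> data;

    matrix(std::size_t r, std::size_t c, int fill = 0)
        : rows(r), cols(c), data(r * c, fill) {}

    int operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
    int& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
};

// Illustration only: counts conversions of a whole expression to a matrix.
static int full_evaluations = 0;

// Lazy sum of two operands, each either a matrix or another sum_expr.
template <typename L, typename R>
struct sum_expr {
    L const& lhs;
    R const& rhs;
    std::size_t rows, cols;

    sum_expr(L const& l, R const& r)
        : lhs(l), rhs(r), rows(l.rows), cols(l.cols) {}

    // One element on demand, recursing through nested sums.
    int operator()(std::size_t i, std::size_t j) const { return lhs(i, j) + rhs(i, j); }

    // Evaluates the whole chain in one pass, without intermediate matrices.
    operator matrix() const {
        ++full_evaluations;
        matrix result(rows, cols);
        for (std::size_t i = 0; i < rows; ++i)
            for (std::size_t j = 0; j < cols; ++j)
                result(i, j) = (*this)(i, j);
        return result;
    }
};

// matrix + matrix
sum_expr<matrix, matrix> operator+(matrix const& a, matrix const& b) {
    return sum_expr<matrix, matrix>(a, b);
}

// sum + matrix
template <typename L, typename R>
sum_expr<sum_expr<L, R>, matrix> operator+(sum_expr<L, R> const& a, matrix const& b) {
    return sum_expr<sum_expr<L, R>, matrix>(a, b);
}

// matrix + sum
template <typename L, typename R>
sum_expr<matrix, sum_expr<L, R> > operator+(matrix const& a, sum_expr<L, R> const& b) {
    return sum_expr<matrix, sum_expr<L, R> >(a, b);
}

// sum + sum
template <typename L1, typename R1, typename L2, typename R2>
sum_expr<sum_expr<L1, R1>, sum_expr<L2, R2> >
operator+(sum_expr<L1, R1> const& a, sum_expr<L2, R2> const& b) {
    return sum_expr<sum_expr<L1, R1>, sum_expr<L2, R2> >(a, b);
}
```

Here `matrix D = A + B + C;` builds a nested sum_expr and runs a single loop over the elements when the conversion happens. The nested proxies hold references to temporaries, so the expression should be converted within the same full expression rather than stored (for instance in an `auto` variable) for later use.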
However, one very simple case where this code would actually have a real, direct benefit is the following:
int value = (A + B)(2, 3);
Here, it is assumed that A and B are two-dimensional matrices and that dereferencing is done in Fortran notation, i.e. the above calculates one element out of a matrix sum. It's of course wasteful to add the whole matrices. matrix_add to the rescue:
struct matrix_add {
    // … yadda, yadda, yadda …
    int operator ()(unsigned int x, unsigned int y) const {
        // Calculate *just one* element:
        return a(x, y) + b(x, y);
    }
};
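For completeness, the fragment above can be fleshed out into something compilable along the following lines. The matrix type and the element_additions counter are assumptions made for this sketch; the counter is there only to show how little work the single-element access does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix type, assumed only for this sketch.
struct matrix {
    std::size_t rows, cols;
    std::vector<int> data;

    matrix(std::size_t r, std::size_t c, int fill = 0)
        : rows(r), cols(c), data(r * c, fill) {}

    int operator()(std::size_t x, std::size_t y) const { return data[x * cols + y]; }
};

// Illustration only: counts scalar additions actually performed.
static int element_additions = 0;

struct matrix_add {
    matrix_add(matrix const& a, matrix const& b) : a(a), b(b) {}

    // Calculate just one element of the sum.
    int operator()(std::size_t x, std::size_t y) const {
        ++element_additions;
        return a(x, y) + b(x, y);
    }

    // Full evaluation, for when the whole result is needed after all.
    operator matrix() const {
        matrix result(a.rows, a.cols);
        for (std::size_t x = 0; x < a.rows; ++x)
            for (std::size_t y = 0; y < a.cols; ++y)
                result.data[x * a.cols + y] = (*this)(x, y);
        return result;
    }

private:
    matrix const& a;
    matrix const& b;
};

matrix_add operator+(matrix const& a, matrix const& b) {
    return matrix_add(a, b);
}
```

For two 100 x 100 matrices, `(A + B)(2, 3)` performs a single scalar addition, whereas converting `A + B` to a matrix performs all 10,000 of them.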
Other examples abound. I've just remembered that I implemented something related not long ago. Basically, I had to implement a string class that had to adhere to a fixed, pre-defined interface. However, my particular string class dealt with huge strings that weren't actually stored in memory. Usually, the user would just access small substrings of the original string using a function infix. I overloaded this function for my string type to return a proxy that held a reference to my string, along with the desired start and end position. Only when this substring was actually used did it query a C API to retrieve this portion of the string.
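The string case could look roughly like the sketch below. It is not the original implementation: big_string, infix_proxy, fetch and the signature of infix are all hypothetical, and an in-memory std::string stands in for the data that would really sit behind the C API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the huge string. `backing_` plays the role of
// storage that would really be reachable only through a C API.
class big_string {
public:
    explicit big_string(std::string const& backing)
        : backing_(backing), fetches_(0) {}

    // Stand-in for the expensive call that retrieves the range [begin, end).
    std::string fetch(std::size_t begin, std::size_t end) const {
        ++fetches_;
        return backing_.substr(begin, end - begin);
    }

    // Illustration only: how often the expensive call was made.
    int fetches() const { return fetches_; }

private:
    std::string backing_;
    mutable int fetches_;
};

// Proxy returned by infix(): remembers what was requested, fetches nothing.
class infix_proxy {
public:
    infix_proxy(big_string const& s, std::size_t begin, std::size_t end)
        : source_(s), begin_(begin), end_(end) {}

    // Answerable from the stored positions alone, so no fetch is needed.
    std::size_t size() const { return end_ - begin_; }

    // The characters are retrieved only when they are actually required.
    operator std::string() const { return source_.fetch(begin_, end_); }

private:
    big_string const& source_;
    std::size_t begin_;
    std::size_t end_;
};

infix_proxy infix(big_string const& s, std::size_t begin, std::size_t end) {
    return infix_proxy(s, begin, end);
}
```

Calling `infix(s, 5, 15)` costs nothing beyond storing two positions and a reference; the fetch happens only once the proxy is converted, for example by `std::string text = infix(s, 5, 15);`.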