I am learning Scala from the Scala for Data Science book and its companion GitHub repo. My question is about this function, copied below for reference:
def fromList[T: ClassTag](index: Int, converter: String => T): DenseVector[T] =
  DenseVector.tabulate(lines.size) { row => converter(splitLines(row)(index)) }
What does DenseVector.tabulate(lines.size) mean between the = sign and the function body? I am new to Scala (with a background in Python and C++), so I cannot figure out whether DenseVector.tabulate(lines.size) is a local variable of the function being defined (in which case it should be declared inside the definition) or something else. From what I understand of Scala syntax, it cannot be the return type.
Also, is ClassTag equivalent to a template in C++?
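To help you answer the question:
splitLines has type scala.collection.immutable.Vector[Array[String]]
lines.size is a non-negative Int (obvious, but still making it clear; Scala has no unsigned integer type)

Everything after the = sign is the function body. It is neither a local variable nor a return type: when a Scala function body is a single expression, it can be written without surrounding braces, and the value of that expression is what the function returns.
DenseVector.tabulate is a factory method (defined on the companion object of DenseVector) that has two parameter lists with one parameter each, so altogether it takes two explicit parameters: size: Int and a function f: Int => V. In fromList, lines.size is the argument to the first parameter list, and the braces { row => ... } form the second parameter list, holding a function literal.
You can find the definition of tabulate here (as part of the Breeze library).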
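To see the shape of this call without pulling in Breeze, here is a minimal sketch that uses the standard library's Array.tabulate as a stand-in for DenseVector.tabulate, since it also takes a size and a function in two separate parameter lists. The sample data and the object name TabulateSketch are made up for illustration; only the signature of fromList follows the book.

```scala
import scala.reflect.ClassTag

object TabulateSketch {
  // Hypothetical sample data standing in for the book's `lines` and `splitLines`.
  val lines: Vector[String] = Vector("alice,30", "bob,25", "carol,41")
  val splitLines: Vector[Array[String]] = lines.map(_.split(","))

  // Same shape as the book's fromList, with Array.tabulate (standard library)
  // used in place of breeze's DenseVector.tabulate.
  // Everything after `=` is the body: one expression whose value is returned.
  def fromList[T: ClassTag](index: Int, converter: String => T): Array[T] =
    Array.tabulate(lines.size) { row => converter(splitLines(row)(index)) }

  // The same body spelled out: a named function value passed in parentheses.
  def fromListVerbose[T: ClassTag](index: Int, converter: String => T): Array[T] = {
    val f: Int => T = row => converter(splitLines(row)(index))
    Array.tabulate(lines.size)(f)
  }
}
```

Calling TabulateSketch.fromList[Int](1, _.toInt) applies the converter to column 1 of every row. Both variants build the same array, which shows that the braces in the original are just the second argument list.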
In (pseudo-)C++ (ignoring the ClassTag), the corresponding declaration would probably look something like this:
#include <functional>

template<typename V>
class DenseVector {
public:
    // ... other class members
    // Static factory: builds a vector of `size` elements, where element i is f(i).
    static DenseVector<V> tabulate(int size, std::function<V(int)> f);
};
and then fromList would probably look something like this:
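#include <string>

// lines and splitLines are assumed to be visible here, as they are in the Scala original.
template<typename T>
DenseVector<T> fromList(int index, std::function<T(std::string)> converter) {
    // index is captured by value so the lambda can use it.
    return DenseVector<T>::tabulate(static_cast<int>(lines.size()), [&converter, index](int row) {
        return converter(splitLines[row][index]);
    });
}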
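As for the ClassTag part of the question, which the C++ translation above sets aside: the closest analogue of a C++ template parameter is the type parameter T itself. The notation [T: ClassTag] is a context bound, which the compiler expands into an extra implicit parameter of type ClassTag[T]. That value carries runtime class information about T, which is needed to create an Array[T] because generic types are erased on the JVM. The sketch below uses only the standard library; the object and method names are made up for illustration.

```scala
import scala.reflect.ClassTag

object ClassTagSketch {
  // Context-bound form, as in the book's signature.
  def fill[T: ClassTag](n: Int, value: T): Array[T] =
    Array.fill(n)(value)

  // Roughly what the compiler expands it to: an extra implicit parameter list.
  def fillExplicit[T](n: Int, value: T)(implicit tag: ClassTag[T]): Array[T] =
    Array.fill(n)(value)

  // The tag is an ordinary runtime value describing T.
  def runtimeClassName[T](implicit tag: ClassTag[T]): String =
    tag.runtimeClass.getName
}
```

Unlike a C++ template, which is instantiated separately for each type at compile time, a generic Scala method is compiled once, and the ClassTag is how it learns which concrete array type to allocate.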