java apache-flink flinkml

Flink ML DenseVector API missing functionality


I’m new to Flink (and to Java) and come from an ML/DS background, so I decided to implement something I know: a linear regression learner. For that I figured I’d use the DenseVector primitives available in flink.ml.*.

This is where I’m seriously confused, so I would appreciate any help. I started googling and found this: https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/ml/common/linalg/DenseVector.html

This implementation has all the methods one would need to implement anything from linear algebra: dot product, summation, norm, etc. However, with this dependency

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-ml-lib_2.12</artifactId>
    <version>2.0.0</version>
</dependency>

the only DenseVector I get installed is this one https://nightlies.apache.org/flink/flink-docs-release-1.3/api/java/org/apache/flink/ml/math/DenseVector.html

This API is a very slim version of the first one, and I’m confused as to why. If I want to use the latest version of the Flink ML lib, how do I get an API that is comparable to the first link in terms of feature set? For example, this API has the dot product but not summation, which is confusing.

I also found this API, which has practically no linear algebra methods: https://nightlies.apache.org/flink/flink-ml-docs-master/api/java/org/apache/flink/ml/linalg/DenseVector.html

How does it fit into the picture?

I also noticed that all these APIs implement a different set of interfaces.

Basically, the final question is: how do I get the API from the first link?


Solution

  • The first link you found, https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/ml/common/linalg/DenseVector.html, is based on Flink 1.12, which is outdated: it was released in December 2020.

    The Flink ML code has since been moved out of the apache/flink repo into the separate apache/flink-ml repo. The latest DenseVector source code can be found at https://github.com/apache/flink-ml/blob/master/flink-ml-core/src/main/java/org/apache/flink/ml/linalg/DenseVector.java, and the up-to-date Javadoc can be found at https://nightlies.apache.org/flink/flink-ml-docs-master/api/java/.

    Also, the latest Flink ML website can be found at https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/java/quick-start/. The website provides Python and Java example code for all the algorithms.

    The linear algebra operations are not methods on DenseVector itself; they live in a separate BLAS class: https://github.com/apache/flink-ml/blob/master/flink-ml-core/src/main/java/org/apache/flink/ml/linalg/BLAS.java.

    Hope it helps!
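
For anyone who wants to see what the operations from the question (dot product, summation, norm) compute, here is a minimal plain-Java sketch on raw double[] arrays. It deliberately does not call Flink ML: VectorOps and all of its method names are hypothetical helpers made up for illustration, and the assumption is only that a dense vector is backed by an array of doubles. For real code, prefer the routines in the BLAS class linked above and check its source for the exact method names and signatures in your version.

```java
// Hypothetical helper class for illustration only; NOT part of Flink ML.
final class VectorOps {
    private VectorOps() {}

    // Dot product: sum of element-wise products.
    static double dot(double[] x, double[] y) {
        checkSameLength(x, y);
        double s = 0.0;
        for (int i = 0; i < x.length; i++) {
            s += x[i] * y[i];
        }
        return s;
    }

    // Summation of all elements.
    static double sum(double[] x) {
        double s = 0.0;
        for (double v : x) {
            s += v;
        }
        return s;
    }

    // Euclidean (L2) norm.
    static double norm2(double[] x) {
        return Math.sqrt(dot(x, x));
    }

    // In-place update y += a * x, the building block of a gradient step.
    static void axpy(double a, double[] x, double[] y) {
        checkSameLength(x, y);
        for (int i = 0; i < x.length; i++) {
            y[i] += a * x[i];
        }
    }

    private static void checkSameLength(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Vector sizes differ");
        }
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 2.0, 3.0};
        double[] features = {4.0, 5.0, 6.0};
        System.out.println(dot(weights, features));            // prints 32.0
        System.out.println(sum(features));                     // prints 15.0
        System.out.println(norm2(new double[] {3.0, 4.0}));    // prints 5.0
    }
}
```

For linear regression, a prediction is dot(weights, features), and one gradient step for a single example can be written as axpy(-learningRate * error, features, weights).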