[SOLVED] Neo4j vector similarity function

I'm trying to understand the difference between the vector.similarity.cosine Cypher function and the gds.similarity.cosine function in Neo4j. According to the Neo4j documentation, both are used to compute cosine similarity, but I’m getting different results from them.

For example, given the following vectors:

Vector A: [1.0, 5.0, 3.0, 6.7]
Vector B: [5.0, 2.5, 3.1, 9.0]

vector.similarity.cosine(A, B) returns 0.941, whereas gds.similarity.cosine(A, B) gives 0.882. Computing cosine similarity directly from its formula with numpy also gives 0.882.

Why are these values different? Is there a difference in normalization, implementation details, or expected input formats between the two functions?

Any insights would be appreciated.

Solution

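The Cypher manual documents (in the "Learn more about the cosine similarity function" dropdown) that Neo4j's vector index uses a normalized cosine similarity function that maps values to the range [0, 1] rather than the traditional [-1, 1]. The 0.941 returned by vector.similarity.cosine is that normalized value, while the 0.882 from gds.similarity.cosine and numpy is the traditional cosine similarity of the same two vectors.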

While the normalized and traditional calculations are equally valid for comparing similarity, you must avoid mixing their results in the same context or comparison.
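
To make the relationship concrete, here is a minimal Python sketch using only the standard library. It assumes the normalization is the linear rescaling (1 + cos) / 2, which is consistent with the two numbers in the question. The function names are made up for illustration and are not part of any Neo4j API.

```python
import math


def cosine_similarity(a, b):
    """Traditional cosine similarity, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def normalized_cosine_similarity(a, b):
    """Cosine similarity rescaled to [0, 1].

    Assumption: the rescaling is (1 + cos) / 2.
    """
    return (1.0 + cosine_similarity(a, b)) / 2.0


def to_traditional(normalized):
    """Invert the assumed rescaling to recover a value in [-1, 1]."""
    return 2.0 * normalized - 1.0


A = [1.0, 5.0, 3.0, 6.7]
B = [5.0, 2.5, 3.1, 9.0]

traditional = cosine_similarity(A, B)
normalized = normalized_cosine_similarity(A, B)

print(f"traditional: {traditional:.4f}")
print(f"normalized:  {normalized:.4f}")
```

For the vectors in the question this prints roughly 0.88 and 0.94, in line with the two results reported. If you need to compare a score from vector.similarity.cosine with one from gds.similarity.cosine, convert one of them first (for example with to_traditional) so that both are on the same scale.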