node.js word-embedding google-cloud-vertex-ai

Convert Vertex AI text embedding response to vector representation


I am trying to generate a vector representation of a text so that I can store it in my search database and run operations such as semantic search or recommendations.

To do this, I first call Vertex AI (textembedding-gecko-multilingual@latest) to get the embedding. Unfortunately, I am unable to convert the response into a readable array of floats.

...
  // Predict request
  const [response] = await predictionServiceClient.predict(request);
  const predictions = response.predictions;
  console.log("\tPredictions:");
  for (const prediction of predictions) {
    console.log(`\t\tPrediction : ${JSON.stringify(prediction)}`);
  }

This produces a format that is hard to work with and that I cannot put into my search database. Note that the vector is in there, but each element is wrapped as {"numberValue":-0.08402...,"kind":"numberValue"}:

{"predictions":[{"structValue":{"fields":{"embeddings":{"structValue":{"fields":{"statistics":{"structValue":{"fields":{"token_count":{"numberValue":1,"kind":"numberValue"},"truncated":{"boolValue":false,"kind":"boolValue"}}},"kind":"structValue"},"values":{"listValue":{"values":[{"numberValue":0.016404684633016586,"kind":"numberValue"},{"numberValue":-0.08402395248413086,"kind":"numberValue"} ....

Sure, I could write my own JS code to decode the response, but I believe this is not best practice (the format might change). So how can I retrieve the vector and make it readable, ideally with a function provided by the API?


Solution

  • I found the answer. The Node.js sample code uses a helper function, toValue(), to convert a JavaScript object to an IValue. Its counterpart, fromValue(), does the reverse, so you can use it to decode the response of the prediction:

    const aiplatform = require('@google-cloud/aiplatform');
    const {PredictionServiceClient} = aiplatform.v1;
    const {helpers} = aiplatform;
    
    
    helpers.fromValue(prediction); // Note: this cannot be a list of predictions
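
To illustrate what this decoding amounts to, here is a minimal sketch that unwraps the Value-shaped structure from the question by hand. The function decodeValue and the samplePrediction object are hypothetical and exist only for illustration; they are not part of @google-cloud/aiplatform. The sample is abbreviated from the output shown in the question, and the sketch assumes every node carries a "kind" field as it does there. In real code, helpers.fromValue is probably the safer choice, since it follows the library's own format.

```javascript
// Hypothetical stand-in for illustration only: recursively unwrap a
// protobuf Value-shaped object into plain JavaScript data.
const KINDS = ['nullValue', 'numberValue', 'stringValue', 'boolValue', 'structValue', 'listValue'];

function decodeValue(value) {
  // Prefer the explicit "kind" tag; otherwise fall back to the first known key present.
  const kind = value.kind ||
    KINDS.find((k) => Object.prototype.hasOwnProperty.call(value, k));
  switch (kind) {
    case 'numberValue':
    case 'stringValue':
    case 'boolValue':
      return value[kind];
    case 'structValue': {
      const out = {};
      for (const [key, field] of Object.entries(value.structValue.fields || {})) {
        out[key] = decodeValue(field);
      }
      return out;
    }
    case 'listValue':
      return (value.listValue.values || []).map(decodeValue);
    default:
      return null; // nullValue or nothing set
  }
}

// Abbreviated sample modeled on the response printed in the question.
const samplePrediction = {
  kind: 'structValue',
  structValue: {fields: {embeddings: {kind: 'structValue', structValue: {fields: {
    statistics: {kind: 'structValue', structValue: {fields: {
      token_count: {kind: 'numberValue', numberValue: 1},
      truncated: {kind: 'boolValue', boolValue: false},
    }}},
    values: {kind: 'listValue', listValue: {values: [
      {kind: 'numberValue', numberValue: 0.016404684633016586},
      {kind: 'numberValue', numberValue: -0.08402395248413086},
    ]}},
  }}}}},
};

const decodedSample = decodeValue(samplePrediction);
const sampleVector = decodedSample.embeddings.values; // plain array of floats
console.log(sampleVector);
```

Because each prediction has to be decoded on its own, a list of predictions can be handled with something like predictions.map((p) => helpers.fromValue(p).embeddings.values).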