google-cloud-firestore firebase-genkit genkit

Genkit Retriever: How to Filter a Firestore Collection Before Running the Retriever?


I’m working with Genkit’s retriever in Firebase and want to filter a collection before applying an embedding match. Specifically, I’d like to keep only documents where uid equals a specific value (e.g., 123) before performing the nearest-neighbour search on embeddings.

The typical way to filter a Firestore collection is like this:

where('uid', '==', 'my uid')

Here's my current setup for the retriever based on Genkit documentation. I may be missing something, or perhaps pre-filtering isn’t supported? It seems essential to be able to filter down to user-specific documents before executing the similarity search.

const firestoreRetriever = defineFirestoreRetriever({
  name: 'firestore-articles',
  firestore: getFirestore(),
  collection: 'articles',
  contentField: 'fullText',
  vectorField: 'embeddings',
  embedder: textEmbeddingGecko,
  distanceMeasure: 'COSINE',
  metadataFields: ['path']
});

This retriever is being called as follows:

const docs = await retrieve({
  retriever: firestoreRetriever,
  query,
  options: { 
    k: 1,
    limit: 10,
  },
});

Does Genkit provide a method for applying a pre-filter on a specific field (e.g., uid) within the defineFirestoreRetriever function, or would this require a workaround? Any guidance or alternative approaches would be appreciated.


Solution

  • Have you tried:

    const docs = await retrieve({
      retriever: firestoreRetriever,
      query,
      options: { 
        k: 1,
        limit: 10,
        where: {
          // Field name must match the document field from the question (uid, not uuid)
          uid: '123',
        },
      },
    });
    

    The where option is listed among the available retriever options for the Firestore vector store; it takes field/value pairs that are matched in addition to the vector search: https://firebase.google.com/docs/genkit/plugins/firebase#retrievers
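
In a real app the uid usually comes from the signed-in user rather than a hard-coded value. Here is a minimal sketch of a helper that builds the options per request. The name userScopedOptions and its interface are hypothetical, not part of Genkit; it only assumes the limit and where options shown above.

```typescript
// Hypothetical helper (not a Genkit API): builds retriever options
// scoped to a single user.
interface UserScopedOptions {
  limit: number;
  where: Record<string, string>;
}

function userScopedOptions(uid: string, limit = 10): UserScopedOptions {
  // Refuse to run an unscoped search if the uid is missing.
  if (!uid) {
    throw new Error('uid is required to scope the retrieval');
  }
  // `where` holds field/value equality pairs passed along with the vector search.
  return { limit, where: { uid } };
}

// Usage with the retriever from the question (currentUser is assumed to exist):
// const docs = await retrieve({
//   retriever: firestoreRetriever,
//   query,
//   options: userScopedOptions(currentUser.uid),
// });
```

If the filtered query fails with an index error, it is worth checking whether Firestore needs a composite vector index that includes the uid field; the error message normally says which index is missing.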