hi how can I add a custom filter to a sql function in a supabase database that uses pgvector to store embeddings.
create table documents (
id bigserial primary key,
content text,
embedding vector(1536),
custom_id text // Custom id for each company
);
Something like this:
create
or replace function match_documents (
query_embedding vector (1536),
match_threshold float,
match_count int,
custom_id text
) returns table (id bigint, content text, similarity float) language sql stable as $$
select
documents.id,
documents.content,
1 - (documents.embedding <=> query_embedding) as similarity
from documents
where documents.custom_id = match_documents.custom_id AND 1 - (documents.embedding <=> query_embedding) > match_threshold
order by similarity desc
limit match_count;
$$;
Call from project:
const { data: documents, error } = await supabaseClient.rpc('match_documents', {
query_embedding: embedding,
match_threshold: .73, // Choose an appropriate threshold for your data
match_count: 10, // Choose the number of matches
custom_id: "2",
})
Within the database I have different embeddings, and if I ask a question by passing the custom_id, it does not respect the where filter, and it responds with information about embeddings that do not belong to that cutom_id.
Is there any way of filtering the documents and then... doing all the embeddings query logic?
Thanks!!!
I would add wrap the threshold filter with parenthesis just in case, but otherwise everything looks good. I would double check that your function definition is up to date. Try dropping the function first and then recreating the function to make sure the function definition is up to date.
create or replace function match_documents (
query_embedding vector (1536),
match_threshold float,
match_count int,
custom_id text
) returns table (id bigint, content text, similarity float, custom_id text) language sql stable as $$
select
documents.id,
documents.content,
1 - (documents.embedding <=> query_embedding) as similarity,
documents.custom_id
from documents
where
documents.custom_id = match_documents.custom_id AND
(1 - (documents.embedding <=> query_embedding) > match_threshold)
order by similarity desc
limit match_count;
$$;