google-cloud-vertex-ai google-cloud-ai google-gemini

Does the Gemini embedding model support languages other than English?


I am planning to use the Gemini embedding model (models/embedding-001) for document/query retrieval, but I can't find anywhere in the documentation whether it supports languages other than English, specifically Greek. Can I get accurate embeddings if my document and query are both in Greek, or if my document is in Greek and my query is in English?


Solution

  • If you want to get embeddings for Greek, you can use Vertex AI's multilingual text embedding model, textembedding-gecko-multilingual@001, which supports it. Code example:

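    from vertexai.language_models import TextEmbeddingModel
    
    def text_embedding() -> list:
        """Text embedding with a Large Language Model."""
        # The multilingual variant is the one intended for non-English text.
        model = TextEmbeddingModel.from_pretrained("textembedding-gecko-multilingual@001")
        # get_embeddings takes a list of texts and returns one embedding per text.
        embeddings = model.get_embeddings(["What is life?"])
        vector = []  # stays empty if no embeddings come back
        for embedding in embeddings:
            vector = embedding.values
            print(f"Length of Embedding Vector: {len(vector)}")
        return vector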
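
For the retrieval part of the question (Greek document with a Greek or English query), one common approach is to embed both texts with the same model and compare the vectors by cosine similarity. The following is only a minimal sketch of that comparison step. It works on plain lists of floats such as `embedding.values`, and the helper names `cosine_similarity` and `rank_documents` are hypothetical, not part of the Vertex AI SDK. Whether the scores are good enough for Greek/English retrieval is something to check on your own data.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as having no similarity to anything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, best match first (hypothetical helper)."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

To use it, pass the `values` of the query embedding as `query_vec` and the `values` of each document embedding as `doc_vecs`, all produced by the same model, then take the top-ranked indices as the retrieved documents.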