Tags: python, langchain, large-language-model, huggingface

Why is the LangChain HuggingFaceEmbeddings model dimension not the same as stated on HuggingFace?


I am using the LangChain HuggingFaceEmbeddings class with the model dunzhang/stella_en_1.5B_v5. On the leaderboard at https://huggingface.co/spaces/mteb/leaderboard, the model's embedding dimension is listed as 8192. However, when I run

len(embed_model.embed_query("hey you"))

I get 1024. Why the difference?


Solution

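  • According to the model card for dunzhang/stella_en_1.5B_v5, 1024 is only the default output dimension:

    "The default dimension is 1024, if you need other dimensions, please clone the model and modify modules.json to replace 2_Dense_1024 with another dimension, e.g. 2_Dense_256 or 2_Dense_8192"

    The 8192 shown on the leaderboard is therefore one of the other available dimensions, not the default. To get it, download (clone) the model, make the change to modules.json, and point HuggingFaceEmbeddings at the local copy instead of the Hub name.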
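
As a rough illustration of the edit described in the model card, the sketch below rewrites modules.json in a local clone so that it references a different 2_Dense_<dim> folder. The helper name set_dense_dim and the directory ./stella_en_1.5B_v5 are hypothetical, and the code assumes the clone contains one 2_Dense_<dim> folder per supported dimension, as the quoted text suggests.

```python
from pathlib import Path


def set_dense_dim(model_dir: str, new_dim: int, old_dim: int = 1024) -> bool:
    """Point modules.json in a local model clone at another 2_Dense_<dim> folder.

    Hypothetical helper: it does a plain text replacement, mirroring the
    model card's instruction to replace 2_Dense_1024 with e.g. 2_Dense_8192.
    Returns True if the file was changed, False if old_dim was not referenced.
    """
    root = Path(model_dir)
    modules_file = root / "modules.json"
    old_name = f"2_Dense_{old_dim}"
    new_name = f"2_Dense_{new_dim}"

    # Assumption: the clone ships a folder for each available dimension.
    if not (root / new_name).is_dir():
        raise FileNotFoundError(f"{new_name} not found in {root}")

    text = modules_file.read_text(encoding="utf-8")
    if old_name not in text:
        return False

    modules_file.write_text(text.replace(old_name, new_name), encoding="utf-8")
    return True


# Example (assumes the model was already cloned to this hypothetical path):
# set_dense_dim("./stella_en_1.5B_v5", 8192)
```

After the change, passing the local directory as model_name to HuggingFaceEmbeddings (for this model, probably together with model_kwargs={"trust_remote_code": True}, which is an assumption here) should make len(embed_model.embed_query("hey you")) match the dimension you selected.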