I'm using Gemini for my RAG implementation — in particular, the most recent model, GEMINI_25_FLASH_PREVIEW_04_17. I'm also using Gemini's REST API to embed the text before upserting it into my vector DB.
I'm trying to find the number of tokens just before I pass the text to the endpoint for embedding creation. Google's Vertex Python API provides this functionality, as explained in this accepted answer and the connected article: https://www.googlecloudcommunity.com/gc/AI-ML/Please-share-Gemini-tokenize-information/m-p/709495
However, when I use this, I get the following exception:
{ "detail": "Model text-embedding-004 is not supported. Supported models: gemini-1.0-pro-001, gemini-1.0-pro-002, gemini-1.5-pro-001, gemini-1.5-flash-001, gemini-1.5-flash-002, gemini-1.5-pro-002.\n" }
I have a suspicion this API hasn't been updated. If so, am I correct to assume that tokenisation remains the same across the Gemini family? If not, what other method can I use to count the tokens? I know there are some other Gemini-compatible tokenisers, but I'd rather use Gemini's own solution.
You are using an older SDK surface (vertexai.preview) which doesn't support the newer Gemini models (e.g. 2.0, 2.5). I would suggest you use the latest unified Google GenAI SDK instead:
Install the Python SDK:
pip install --upgrade google-genai
Execute the following code:
from google import genai
from google.genai.types import HttpOptions

# Picks up the API key from the GOOGLE_API_KEY (or GEMINI_API_KEY) environment variable.
client = genai.Client(http_options=HttpOptions(api_version="v1"))

response = client.models.count_tokens(
    model="gemini-2.5-flash-preview-04-17",
    contents="Hello World",
)
print(response)
# Example output:
# total_tokens=10
# cached_content_token_count=None
For details, refer to this doc.
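Since your goal is to check token counts before embedding, you could wrap count_tokens in a small helper that filters out chunks exceeding the embedding model's input limit before upserting. This is a hedged sketch: the 2048-token limit is text-embedding-004's documented input cap (verify against the current docs for your model), and filter_embeddable / the whitespace stand-in counter are hypothetical names for illustration.

```python
from typing import Callable, List

# Assumed input limit for text-embedding-004 — check the model's docs.
EMBED_TOKEN_LIMIT = 2048

def filter_embeddable(texts: List[str],
                      count_tokens: Callable[[str], int],
                      limit: int = EMBED_TOKEN_LIMIT) -> List[str]:
    """Keep only texts whose token count fits within the embedding input limit."""
    return [t for t in texts if count_tokens(t) <= limit]

# In practice, count_tokens would wrap the SDK call, e.g.:
# def count_with_sdk(text: str) -> int:
#     return client.models.count_tokens(
#         model="gemini-2.5-flash-preview-04-17", contents=text
#     ).total_tokens

# Demo with a crude whitespace-split counter as a stand-in for the API call:
chunks = ["short chunk", "word " * 3000]
kept = filter_embeddable(chunks, lambda t: len(t.split()))
print(kept)  # only the short chunk survives the 2048-token limit
```

Note the caveat from your question still applies: the counts come from a Gemini chat model's tokenizer, which may not match the embedding model's tokenizer exactly, so treat the limit check as approximate.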