I need help with getting batch predictions for Gemini using Google Cloud API
I'm trying to run a batch prediction using the Gemini model via the Google Cloud API (following this official doc - https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/batch-prediction-api#generative-ai-batch-text-python_genai_sdk), but I’m stuck on a 400 INVALID_ARGUMENT error when calling client.batches.create(). Looks like the issue might be with the src argument — I used the example from the doc:
src="
gs://cloud-samples-data/batch/prompt_for_batch_gemini_predict.jsonl"
or
src="
bq://storage-samples.generative_ai.batch_requests_for_multimodal_input"
But neither works — I get this error:
ClientError: 400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': 'Request contains an invalid argument.', 'status': 'INVALID_ARGUMENT'}}
Has anyone tried this before and knows what the correct src value should be for a text prompt batch? Or what format/path should I use here?
The whole code I tried to run:
import time
from google import genai
from google.genai.types import CreateBatchJobConfig, JobState, HttpOptions
import os
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "True"
client = genai.Client(
    api_key=credentials.token,
    http_options=HttpOptions(api_version="v1")
)

output_uri = "gs://llm_marking_up/movie_categorization_results"

job = client.batches.create(
    model="gemini-2.0-flash-001",
    src="gs://cloud-samples-data/batch/prompt_for_batch_gemini_predict.jsonl",
    config=CreateBatchJobConfig(dest=output_uri),
)
print(f"Job name: {job.name}")
print(f"Job state: {job.state}")

completed_states = {
    JobState.JOB_STATE_SUCCEEDED,
    JobState.JOB_STATE_FAILED,
    JobState.JOB_STATE_CANCELLED,
    JobState.JOB_STATE_PAUSED,
}

while job.state not in completed_states:
    time.sleep(30)
    job = client.batches.get(name=job.name)
    print(f"Job state: {job.state}")
The error log:
File ~/Library/Caches/pypoetry/virtualenvs/ksb-research-b3BAYwXa-py3.11/lib/python3.11/site-packages/google/genai/batches.py:752, in Batches.create(self, model, src, config)
729 """Creates a batch job.
730
731 Args:
(...)
749 print(batch_job.state)
750 """
751 config = _extra_utils.format_destination(src, config)
--> 752 return self._create(model=model, src=src, config=config)
File ~/Library/Caches/pypoetry/virtualenvs/ksb-research-b3BAYwXa-py3.11/lib/python3.11/site-packages/google/genai/batches.py:458, in Batches._create(self, model, src, config)
...
--> 101 raise ClientError(status_code, response_json, response)
102 elif 500 <= status_code < 600:
103 raise ServerError(status_code, response_json, response)
ClientError: 400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': 'Request contains an invalid argument.', 'status': 'INVALID_ARGUMENT'}}
I was able to run your code without any problem (with the GCS bucket approach). A couple of things to verify:
Upgrade the Google GenAI SDK:
pip install --upgrade google-genai
This is recommended since the SDK is updated regularly.
Export the GCP project, location, and Vertex AI configuration:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=your_project_id
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
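If you prefer to keep everything in the script, the same configuration can be set from Python before creating the client. A minimal sketch, assuming placeholder values that you would replace with your own project ID and region:

import os

# Placeholders: replace with your own project ID and region.
os.environ["GOOGLE_CLOUD_PROJECT"] = "your_project_id"
os.environ["GOOGLE_CLOUD_LOCATION"] = "us-central1"
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "True"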
If you are running with GCP Application Default Credentials, you don't need to specify an API key when initializing the Client:
client = genai.Client(
    # api_key=credentials.token,
    http_options=HttpOptions(api_version="v1")
)
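Alternatively, instead of relying on environment variables, you can pass the Vertex AI settings directly to the client. A sketch; the project ID and location below are placeholders:

from google import genai
from google.genai.types import HttpOptions

# Placeholders: replace with your own project ID and region.
client = genai.Client(
    vertexai=True,
    project="your_project_id",
    location="us-central1",
    http_options=HttpOptions(api_version="v1"),
)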
After running the program, you should see:
Job name: projects/xxx/locations/us-central1/batchPredictionJobs/xxx
Job state: JOB_STATE_PENDING
Job state: JOB_STATE_RUNNING
Job state: JOB_STATE_RUNNING
Job state: JOB_STATE_RUNNING
Job state: JOB_STATE_RUNNING
Job state: JOB_STATE_RUNNING
Job state: JOB_STATE_RUNNING
Job state: JOB_STATE_RUNNING
Job state: JOB_STATE_SUCCEEDED
and the output file batch_prompt_for_batch_gemini_predict.jsonl should be in your GCS bucket.
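Once the job succeeds, you can inspect the objects written under your dest prefix, for example with the google-cloud-storage library. A sketch only; the bucket and prefix below are taken from the output_uri in your code and may differ in your setup:

from google.cloud import storage

# Assumes output_uri = "gs://llm_marking_up/movie_categorization_results" from the question.
storage_client = storage.Client()
for blob in storage_client.list_blobs("llm_marking_up", prefix="movie_categorization_results"):
    # The batch prediction results are JSONL files under this prefix.
    print(blob.name)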