I'm currently working with the Gemini AI API 1.5 Pro (latest version) and need to send large video files for inference. These videos are several hundred megabytes each (~700MB) but are within the API's constraints (e.g., less than 1 hour in length). I want to upload them once and perform inference without re-uploading.
In GPT-4o, there was an option to use image_url values to reference images. Is there a similar method or best practice for handling large video files with the Gemini AI API 1.5 Pro?
The videos are too large to send repeatedly, so an efficient method for uploading and referencing them is crucial.
Any guidance on API endpoints, required parameters, or example code snippets would be greatly appreciated.
In your situation, how about the following sample script?
Before you test the following script, please update google-generativeai to the latest version.
import google.generativeai as genai
import time

apiKey = "###"  # Please set your API key.
video_file_name = "sample.mp4"  # Please set your video file with the path.
display_name = "sampleDisplayName"  # Please set the display name of the uploaded file on Gemini. The file is searched from the file list using this value.

genai.configure(api_key=apiKey)

# Get file list in Gemini
fileList = genai.list_files(page_size=100)

# Check uploaded file.
video_file = next((f for f in fileList if f.display_name == display_name), None)
if video_file is None:
    print("Uploading file...")
    video_file = genai.upload_file(path=video_file_name, display_name=display_name, resumable=True)
    print(f"Completed upload: {video_file.uri}")
else:
    print(f"File URI: {video_file.uri}")

# Check the state of the uploaded file.
while video_file.state.name == "PROCESSING":
    print(".", end="")
    time.sleep(10)
    video_file = genai.get_file(video_file.name)

if video_file.state.name == "FAILED":
    raise ValueError(video_file.state.name)

# Generate content using the uploaded file.
prompt = "Describe this video."
model = genai.GenerativeModel(model_name="models/gemini-1.5-pro-latest")
print("Making LLM inference request...")
response = model.generate_content([video_file, prompt], request_options={"timeout": 600})
print(response.text)
In this sample script, if the file has already been uploaded, the existing file is reused. If it is not found, the file is uploaded and then used. The file is looked up by its display_name. Note that the File API keeps uploaded files for 48 hours, so after that the script will upload the file again.
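One thing to watch with the display_name lookup: as far as I know, display names are not required to be unique, so the list can contain several matches, and some of them may have failed processing. Here is a minimal sketch of a stricter lookup that picks the newest match that has not failed. It assumes each file object exposes display_name, state.name and create_time (create_time is my assumption, please check it in your installed version). pick_reusable_file and the fake objects are hypothetical names used only so that the snippet runs without an API key.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def pick_reusable_file(files, display_name):
    """Return the newest non-failed file with the given display name, or None."""
    candidates = [
        f for f in files
        if f.display_name == display_name and f.state.name != "FAILED"
    ]
    # max() with default=None avoids an exception when nothing matches.
    return max(candidates, key=lambda f: f.create_time, default=None)


def _fake(display_name, state, day):
    # Hypothetical stand-in for an uploaded-file object (not a real API type).
    return SimpleNamespace(
        display_name=display_name,
        state=SimpleNamespace(name=state),
        create_time=datetime(2024, 1, day, tzinfo=timezone.utc),
    )


files = [
    _fake("sampleDisplayName", "ACTIVE", 1),
    _fake("sampleDisplayName", "FAILED", 3),
    _fake("sampleDisplayName", "ACTIVE", 2),
    _fake("other", "ACTIVE", 4),
]

chosen = pick_reusable_file(files, "sampleDisplayName")
print(chosen.create_time.day)
```

With the real library you would pass the result of genai.list_files() as files and upload only when the function returns None.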
As another approach, if you can specify the value of name directly, the following sample script can also be used. In this case, the value of name must be unique among your uploaded files.
import google.generativeai as genai
import time

apiKey = "###"  # Please set your API key.
video_file_name = "sample.mp4"  # Please set your video file with the path.
name = "sample-name-1"  # Please set the name of the uploaded file on Gemini. The file is fetched directly using this value.

genai.configure(api_key=apiKey)

# Check uploaded file.
try:
    video_file = genai.get_file(f"files/{name}")
    print(f"File URI: {video_file.uri}")
except Exception:
    # get_file raises when the file cannot be retrieved, so upload it.
    # "except Exception" is used instead of a bare "except" so that Ctrl+C still works.
    print("Uploading file...")
    video_file = genai.upload_file(path=video_file_name, name=name, resumable=True)
    print(f"Completed upload: {video_file.uri}")

# Check the state of the uploaded file.
while video_file.state.name == "PROCESSING":
    print(".", end="")
    time.sleep(10)
    video_file = genai.get_file(video_file.name)

if video_file.state.name == "FAILED":
    raise ValueError(video_file.state.name)

# Generate content using the uploaded file.
prompt = "Describe this video."
model = genai.GenerativeModel(model_name="models/gemini-1.5-pro-latest")
print("Making LLM inference request...")
response = model.generate_content([video_file, prompt], request_options={"timeout": 600})
print(response.text)
This script produces the same result as the script above.
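Since the name has to be unique, one option is to derive it from the file content, so that the same video always maps to the same name and a different video never reuses it. The following is only a rough sketch: file_id_for is a hypothetical helper, and the 40-character limit with lowercase letters, digits and dashes is my understanding of the naming rule for file IDs, so please treat it as an assumption and confirm it against the current documentation.

```python
import hashlib
import re
import tempfile
from pathlib import Path

# Assumption: file IDs may use up to 40 lowercase alphanumeric or dash characters.
MAX_ID_LENGTH = 40


def file_id_for(path, prefix="video-"):
    """Derive a stable ID from the file's SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large videos are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    # hexdigest() is lowercase hex, so the result only contains [a-z0-9-].
    return (prefix + digest.hexdigest())[:MAX_ID_LENGTH]


# Demo with a small temporary file standing in for a real video.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.mp4"
    sample.write_bytes(b"not a real video, just demo bytes")
    name = file_id_for(sample)
    same = file_id_for(sample)

print(name)
```

The returned value could then be used as name in the script above, for both the get_file lookup and the upload_file call.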