Gemini 2.5 Flash
Approved Data Classifications
Description
Gemini 2.5 Flash is a highly efficient multimodal AI model developed by Google, designed for automated tasks and low-latency scenarios. Launched in June 2025, it offers a context window of over 1 million tokens and is the first Google Flash model with thinking capabilities. This allows it to process very large datasets and handle complex problems that draw on different information sources, including text, audio, images, video, and even entire code repositories.
Capabilities
Model | Training Data | Input | Output | Context Length | Cost (per 1 million tokens) |
---|---|---|---|---|---|
gemini-2.5-flash | January 2025 | Text, image, audio, video | Text | >1,000,000 | $2.50/1M input, $1.25/1M output |
1M represents 1 million tokens. All prices listed are per 1 million tokens.
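As a rough illustration of how the per-token pricing translates into a per-request cost, the sketch below applies the two rates from the table above. The function name `estimate_cost` and the token counts are hypothetical, and the rates are copied from this page as listed, so treat the result as an estimate rather than a billing figure.

```python
# Rates copied from the Capabilities table on this page (USD per 1 million tokens).
INPUT_PRICE_PER_1M = 2.50
OUTPUT_PRICE_PER_1M = 1.25
TOKENS_PER_UNIT = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request (hypothetical helper)."""
    input_cost = input_tokens / TOKENS_PER_UNIT * INPUT_PRICE_PER_1M
    output_cost = output_tokens / TOKENS_PER_UNIT * OUTPUT_PRICE_PER_1M
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: 200,000 input tokens and 4,000 output tokens.
    print(f"${estimate_cost(200_000, 4_000):.4f}")
```

Actual token counts for a completed request are normally reported in the `usage` field of the response, which can be passed to a helper like this one.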
Availability
Cloud Provider
Usage
- curl
- python
- javascript
# curl
curl -X POST https://api.ai.it.ufl.edu/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_TOKEN>" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Write a haiku about an Alligator."
      }
    ]
  }'
# Python (openai package)
from openai import OpenAI

# Point the client at the proxy endpoint instead of the default OpenAI URL
client = OpenAI(
    api_key="your_api_key",
    base_url="https://api.ai.it.ufl.edu/v1"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",  # model to send to the proxy
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Write a haiku about an Alligator."
        }
    ]
)

# Print the assistant message returned by the model
print(response.choices[0].message)
// JavaScript (openai package)
import OpenAI from 'openai';

// Point the client at the proxy endpoint instead of the default OpenAI URL
const openai = new OpenAI({
  apiKey: 'your_api_key',
  baseURL: 'https://api.ai.it.ufl.edu/v1'
});

const completion = await openai.chat.completions.create({
  model: "gemini-2.5-flash",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    {
      role: "user",
      content: "Write a haiku about an Alligator.",
    },
  ],
});

// Print the assistant message returned by the model
console.log(completion.choices[0].message);
When to Use
- Automated, low-latency tasks
- Reasoning and coding tasks that benefit from thinking capabilities
- Handling extremely long contexts (over 1 million tokens)
- Autonomous, multi-step workflows
- Multimodal input (text, image, audio, video)
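To make the multimodal input point concrete, here is a minimal sketch of how an image could be attached to a chat message using the OpenAI-style `image_url` content part with a base64 data URL. This page does not confirm that the proxy forwards image parts to the model, so that is an assumption, and the helper name `build_image_message` is hypothetical. The sketch only builds the message and makes no network call.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build one OpenAI-style user message with a text part and an inline image part.

    Hypothetical helper: assumes the endpoint accepts `image_url` content parts.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of a real image file.
    message = build_image_message("Describe this image.", b"placeholder image bytes")
    print(message["content"][0]["text"])
    print(message["content"][1]["image_url"]["url"][:30])
```

If the endpoint does accept this format, the returned dictionary can be passed as an element of `messages` in the Python example from the Usage section.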
References
- Google Gemini 2.5 Flash Report Card
https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash
- Google Gemini 2.5 Report
https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf