Gemini 2.5 Flash

Approved Data Classifications

Description

Gemini 2.5 Flash is a highly efficient multimodal AI model developed by Google, designed for automated tasks and low-latency scenarios. Released on March 20, 2025, it has a context window of over 1 million tokens and is the first Google Flash model with thinking capabilities. This lets the model process very large datasets and handle complex problems that draw on different kinds of input, including text, audio, images, video, and entire code repositories.
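To get a feel for what a context window of this size means in practice, the sketch below roughly checks whether a piece of text is likely to fit. It uses the 1,048,576-token context length listed under Capabilities; the 4-characters-per-token ratio, the function names, and the default output reservation are illustrative assumptions, and real token counts depend on the content.

```python
# Context length taken from the Capabilities table in this document.
CONTEXT_WINDOW_TOKENS = 1_048_576

# Rough assumption: about 4 characters per token. Actual counts vary by content.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of `text` (ceiling of chars / 4)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 8_192) -> bool:
    """Return True if the estimated prompt size plus a reserved output
    budget stays within the context window.

    `reserved_for_output` is a hypothetical safety margin, not a documented limit.
    """
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    sample = "Write a haiku about an Alligator. " * 1000
    print(estimate_tokens(sample), fits_in_context(sample))
```

This is only a pre-flight estimate; an exact count would have to come from the model's own tokenizer.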

Capabilities

Model	Release Date	Input	Output	Context Length	Cost (per 1 million tokens)
gemini-2.5-flash	Mar 20 2025	`Text`, `Image`, `Audio`, `Video`, `PDF`	`Text`	1,048,576	$2.50/1M input $1.25/1M output

info

1M represents 1 Million Tokens
All prices listed are based on 1 Million Tokens
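Because prices are quoted per 1 million tokens, the cost of a single request scales linearly with its token counts. The following is a minimal sketch of that arithmetic using the rates from the table above; the function name and the example token counts are hypothetical. With an OpenAI-style response, the token counts would typically come from the `usage` field.

```python
# Rates copied from the table above (USD per 1 million tokens).
INPUT_RATE_PER_MILLION = 2.50
OUTPUT_RATE_PER_MILLION = 1.25
TOKENS_PER_MILLION = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    input_cost = input_tokens / TOKENS_PER_MILLION * INPUT_RATE_PER_MILLION
    output_cost = output_tokens / TOKENS_PER_MILLION * OUTPUT_RATE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 4,000 output tokens.
    print(f"${estimate_cost(200_000, 4_000):.4f}")
```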

Availability

Cloud Provider

Usage

curl
python
javascript

curl -X POST https://api.ai.it.ufl.edu/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API_TOKEN>" \
-d '{
    "model": "gemini-2.5-flash",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Write a haiku about an Alligator."
        }
    ]
}'

from openai import OpenAI

# Point the OpenAI client at the proxy endpoint.
client = OpenAI(
    api_key="your_api_key",
    base_url="https://api.ai.it.ufl.edu/v1"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",  # model to send to the proxy
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Write a haiku about an Alligator."
        }
    ]
)

print(response.choices[0].message)

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your_api_key',
  baseURL: 'https://api.ai.it.ufl.edu/v1'
});

const completion = await openai.chat.completions.create({
    model: "gemini-2.5-flash",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        {
            role: "user",
            content: "Write a haiku about an Alligator.",
        },
    ],
});

console.log(completion.choices[0].message);

When to Use

Premium reasoning and coding capabilities
Handling extremely long contexts
Best-in-class coding model
Autonomous, multi-step workflows
Multimodal input support
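For the multimodal input listed above, a request that includes an image might be built as in the sketch below. It assumes the endpoint accepts OpenAI-style `image_url` content parts carrying a base64 data URL, which this page does not confirm; `build_image_messages` and the file name are hypothetical. The code only constructs the `messages` list and makes no network call.

```python
import base64
import mimetypes


def build_image_messages(prompt: str, image_bytes: bytes, filename: str) -> list[dict]:
    """Build an OpenAI-style `messages` list with one text part and one image part.

    Assumption: the proxy accepts `image_url` parts containing a data URL.
    """
    # Guess the MIME type from the file name; fall back to a generic type.
    mime_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ]


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of a real image file.
    messages = build_image_messages("Describe this image.", b"placeholder", "gator.png")
    print(messages[0]["content"][0])
```

The resulting list could then be passed as the `messages` argument of `client.chat.completions.create` with `model="gemini-2.5-flash"`, in the same way as the text-only examples under Usage.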

References

Google Gemini 2.5 Flash Report Card
https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash

Google Gemini 2.5 Report
https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf
