Claude 4.5 Haiku

Approved Data Classifications

Description

Claude Haiku 4.5 is Anthropic’s fastest generally available model, with near-frontier intelligence for low-latency and cost-efficient work. It supports coding, computer use, and agent tasks. Anthropic’s list price is $1 per million input tokens and $5 per million output tokens.

Capabilities

Model	Knowledge Cutoff	Input	Output	Context Length	Cost (per 1 million tokens)
claude-4.5-haiku	Feb 28 2025	`Text`, `Image`, `Pdf`	`Text`	200,000	$1.10/1M input $5.50/1M output

Note: 1M stands for 1 million tokens. All prices listed are per 1 million tokens.
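
As a rough illustration of how per-million pricing translates into a per-request cost, the sketch below multiplies token counts by the rates in the Capabilities table ($1.10 input, $5.50 output). The function name `estimate_cost` and the example token counts are hypothetical, and the Description quotes $1 and $5, so adjust the constants if your billing differs.

```python
# Rates copied from the Capabilities table (USD per 1 million tokens).
# These constants are assumptions for illustration; confirm against your bill.
INPUT_PRICE_PER_MILLION = 1.10
OUTPUT_PRICE_PER_MILLION = 5.50
TOKENS_PER_MILLION = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_PRICE_PER_MILLION / TOKENS_PER_MILLION
    output_cost = output_tokens * OUTPUT_PRICE_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 12,000 prompt tokens and 800 completion tokens.
    print(f"${estimate_cost(12_000, 800):.4f}")
```

Output tokens cost five times as much as input tokens at these rates, so long completions tend to dominate the total even when prompts are large.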

Availability

Cloud Provider

Usage

curl

curl -X POST https://api.ai.it.ufl.edu/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API_TOKEN>" \
-d '{
    "model": "claude-4.5-haiku",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Write a haiku about an Alligator."
        }
    ]
}'

python

from openai import OpenAI

# Point the OpenAI client at the proxy endpoint.
client = OpenAI(
    api_key="your_api_key",
    base_url="https://api.ai.it.ufl.edu/v1",
)

response = client.chat.completions.create(
    model="claude-4.5-haiku",  # model to send to the proxy
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Write a haiku about an Alligator.",
        },
    ],
)

# Print the assistant's reply message.
print(response.choices[0].message)

javascript

import OpenAI from 'openai';

// Point the OpenAI client at the proxy endpoint.
const openai = new OpenAI({
  apiKey: 'your_api_key',
  baseURL: 'https://api.ai.it.ufl.edu/v1'
});

// Top-level await requires an ES module (for example, a .mjs file).
const completion = await openai.chat.completions.create({
    model: "claude-4.5-haiku",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        {
            role: "user",
            content: "Write a haiku about an Alligator.",
        },
    ],
});

// Print the assistant's reply message.
console.log(completion.choices[0].message);
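
The Capabilities table lists `Image` as an accepted input, while the examples above send only text. The following is a minimal sketch of how an image message could be built in the OpenAI chat-completions content-part format (a `text` part plus an `image_url` part carrying a base64 data URI). It assumes the proxy forwards this format to the model, which this page does not confirm; the helper name `build_image_message` is hypothetical.

```python
import base64


def build_image_message(image_bytes: bytes, prompt: str,
                        media_type: str = "image/png") -> dict:
    """Build a user message with a text part and an inline base64 image part.

    Assumption: the endpoint accepts OpenAI-style `image_url` content parts.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{media_type};base64,{encoded}"},
            },
        ],
    }
```

The returned dictionary can be placed in the `messages` list of the Python example in place of the plain-text user message, for instance `messages=[build_image_message(open("photo.png", "rb").read(), "Describe this image.")]`, where `photo.png` is a placeholder file name.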

When to use

Low-latency chat and customer support
High-throughput agentic sub-tasks
Cost-efficient coding assistance
Real-time computer-use or workflow automation
Large-scale summarization and extraction
General-purpose assistants where speed matters

References

Anthropic
https://www.anthropic.com

LLM Stats
https://llm-stats.com

Artificial Analysis
https://artificialanalysis.ai
