
Whisper Large v3

Approved Data Classifications

Description

Whisper Large v3 is OpenAI's model for speech recognition (transcription) and speech translation. Compared with Whisper Large v2, it uses 128 Mel frequency bins instead of 80, adds a language token for Cantonese, and improves accuracy across multiple languages.

Capabilities

Model             Knowledge Cutoff  Input  Output  Context Length  Cost (per minute of audio)
whisper-large-v3  Oct 2023          Audio  Text    n/a             $0.006/minute
  • All prices listed are per minute of audio.
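As a rough illustration of the per-minute pricing, the sketch below estimates the cost of transcribing a recording from its duration. It assumes the $0.006/minute rate from the Capabilities table and assumes linear proration of partial minutes; the page does not say how partial minutes are rounded, so treat the result as an estimate only. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Rate taken from the Capabilities table (USD per minute of audio).
PRICE_PER_MINUTE = Decimal("0.006")


def estimate_cost(duration_seconds: float) -> Decimal:
    """Estimate transcription cost in USD for a recording.

    Assumption: partial minutes are prorated linearly. Actual billing
    may round differently.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    # Round to a hundredth of a cent for display purposes.
    return (minutes * PRICE_PER_MINUTE).quantize(Decimal("0.0001"))
```

Under these assumptions, a one-hour recording (3600 seconds) comes to 60 minutes × $0.006, or about $0.36.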

Availability

Cloud Provider

Usage

curl https://api.ai.it.ufl.edu/v1/audio/transcriptions \
-H "Authorization: Bearer <API_TOKEN>" \
-H "Content-Type: multipart/form-data" \
-F file="@/path/to/file/audio.mp3" \
-F model="whisper-large-v3"
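For callers who prefer Python over curl, the following is a minimal sketch of the same request using only the standard library. It mirrors the endpoint, headers, and form fields of the curl command above; the function names `build_request` and `transcribe` are hypothetical, and the response is returned as raw text because this page does not document the response format.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Endpoint and model name are taken from the curl example above.
API_URL = "https://api.ai.it.ufl.edu/v1/audio/transcriptions"
MODEL = "whisper-large-v3"


def build_request(audio_path: str, api_token: str) -> urllib.request.Request:
    """Build the multipart/form-data POST request without sending it."""
    path = Path(audio_path)
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"

    # Form field "model", equivalent to: -F model="whisper-large-v3"
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{MODEL}\r\n"
    ).encode("utf-8")

    # Form field "file", equivalent to: -F file="@/path/to/file/audio.mp3"
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")

    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = model_part + file_header + path.read_bytes() + closing

    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(audio_path: str, api_token: str) -> str:
    """Send the request and return the raw response body as text."""
    request = build_request(audio_path, api_token)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example call (requires a valid token and network access):
# print(transcribe("/path/to/file/audio.mp3", "<API_TOKEN>"))
```

Splitting request construction from sending keeps the multipart encoding easy to inspect without network access. With a third-party HTTP client the encoding step would usually be handled by the library.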

References

  1. OpenAI
    https://openai.com/
  2. LLM Stats
    https://llm-stats.com
  3. Artificial Analysis
    https://artificialanalysis.ai