Whisper Large v3

Approved Data Classifications

Description

Whisper Large v3 is an automatic speech recognition (ASR) model developed by OpenAI. It keeps the same architecture as its predecessors with two changes: the input uses 128 Mel frequency bins (up from 80), and a new language token was added for Cantonese. The model was trained on 1 million hours of weakly labeled audio and 4 million hours of pseudo-labeled audio. Across a wide range of languages it makes 10% to 20% fewer errors than Whisper Large v2. It handles both speech recognition and speech translation, supports multiple languages, and generalizes across datasets and domains without fine-tuning.

Capabilities

Model	Training Data	Input	Output	Context Length	Cost (per minute of audio)
whisper-large-v3	Oct 2023	`Audio`	`Text`	n/a	$0.006/minute

info

Pricing is based on one minute of audio
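
The per-minute rate lends itself to a quick estimate. The sketch below assumes the cost scales linearly with audio length; this page does not say how partial minutes are rounded, so treat the result as approximate. `estimate_cost` and `PRICE_PER_MINUTE` are illustrative names, not part of any API.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate taken from the capabilities table: $0.006 per minute of audio
PRICE_PER_MINUTE = Decimal("0.006")


def estimate_cost(duration_seconds: float) -> Decimal:
    """Estimate the transcription cost in USD.

    Assumption: cost is prorated linearly by the second. The actual
    billing granularity is not documented here.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    # Round to a hundredth of a cent to keep small amounts visible
    return (minutes * PRICE_PER_MINUTE).quantize(
        Decimal("0.0001"), rounding=ROUND_HALF_UP
    )


print(estimate_cost(90))    # a 90-second clip
print(estimate_cost(3600))  # a one-hour recording
```

`Decimal` is used instead of floats so that small per-minute amounts do not pick up binary rounding noise.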

Availability

Cloud Provider

Usage

curl

curl https://api.ai.it.ufl.edu/v1/audio/transcriptions \
-H "Authorization: Bearer <API_TOKEN>" \
-H "Content-Type: multipart/form-data" \
-F file="@/path/to/file/audio.mp3" \
-F model="whisper-large-v3"

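python

from openai import OpenAI

# Point the client at this service's endpoint instead of the OpenAI default
client = OpenAI(
    api_key="your_api_key",
    base_url="https://api.ai.it.ufl.edu/v1"
)

# Open the audio in binary mode; the with-block closes the file after upload
with open("/path/to/file/audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-large-v3",
        file=audio_file
    )

print(transcription.text)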
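
When several files need transcribing, the call can be wrapped in a small helper. This is a minimal sketch, not an official utility: `transcribe_file` is a hypothetical name, and it expects an already configured `OpenAI` client to be passed in. Extra keyword arguments such as `language` are parameters of the OpenAI SDK's `transcriptions.create`; whether this endpoint honors them is an assumption.

```python
def transcribe_file(client, path, model="whisper-large-v3", **options):
    """Transcribe one audio file and return the recognized text.

    client  -- a configured openai.OpenAI instance (or a compatible object)
    path    -- location of the audio file on disk
    options -- forwarded unchanged to transcriptions.create, for example
               language="en" (assumed, not confirmed, to be supported here)
    """
    # Binary mode is required; the with-block closes the file even on errors
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            **options,
        )
    return result.text
```

Given a `client` built with the `api_key` and `base_url` shown in the Python example, `transcribe_file(client, "/path/to/file/audio.mp3")` returns the same text that example prints.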

javascript

import fs from 'fs';
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your_api_key',
  baseURL: 'https://api.ai.it.ufl.edu/v1'
});

async function main() {
  // Stream the audio file from disk into the transcription request
  const transcription = await openai.audio.transcriptions.create({
    file: fs.createReadStream("/path/to/file/audio.mp3"),
    model: "whisper-large-v3",
  });

  console.log(transcription.text);
}
main();
