Whisper Large v3
Approved Data Classifications
Description
Whisper Large v3 is OpenAI's speech recognition and translation model. Compared with Whisper Large v2, it uses 128 Mel frequency bins for its input spectrogram instead of 80, adds a language token for Cantonese, and improves accuracy across multiple languages.
Capabilities
| Model | Knowledge Cutoff | Input | Output | Context Length | Cost (per minute of audio) |
|---|---|---|---|---|---|
| whisper-large-v3 | Oct 2023 | Audio | Text | n/a | $0.006/minute |
info
- Pricing is per minute of audio.
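As a rough illustration of the per-minute pricing, the sketch below estimates the cost of a transcription job from the audio duration. It assumes the $0.006/minute rate from the table and assumes that partial minutes are prorated linearly; the page does not say how partial minutes are billed, so treat the result as an estimate only. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate taken from the Capabilities table (USD per minute of audio).
PRICE_PER_MINUTE = Decimal("0.006")


def estimate_cost(duration_seconds: float) -> Decimal:
    """Estimate the transcription cost in USD.

    Assumption: partial minutes are prorated linearly. The actual
    billing granularity is not documented on this page.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    cost = minutes * PRICE_PER_MINUTE
    # Round to a hundredth of a cent for display.
    return cost.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


print(estimate_cost(90))    # 1.5 minutes -> 0.0090
print(estimate_cost(3600))  # 60 minutes  -> 0.3600
```

`Decimal` is used instead of `float` so that small per-minute amounts do not pick up binary rounding noise when summed over many files.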
Availability
Cloud Provider
Usage
curl
curl https://api.ai.it.ufl.edu/v1/audio/transcriptions \
  -H "Authorization: Bearer <API_TOKEN>" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3" \
  -F model="whisper-large-v3"
python
from openai import OpenAI

client = OpenAI(
    api_key="your_api_key",
    base_url="https://api.ai.it.ufl.edu/v1",
)

# Open the audio in binary mode; the with-block closes the file after the request.
with open("/path/to/file/audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-large-v3",
        file=audio_file,
    )

print(transcription.text)
javascript
import fs from 'fs';
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your_api_key',
  baseURL: 'https://api.ai.it.ufl.edu/v1'
});

async function main() {
  // Stream the audio file from disk into the transcription request.
  const transcription = await openai.audio.transcriptions.create({
    file: fs.createReadStream("/path/to/file/audio.mp3"),
    model: "whisper-large-v3",
  });
  console.log(transcription.text);
}

main();
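When transcribing more than one file, the request can be wrapped in a small helper. The following is a minimal sketch, not part of the service's documented API: `transcribe_file` and `ASSUMED_AUDIO_SUFFIXES` are hypothetical names, and the list of suffixes is an assumption, since this page does not state which audio formats the endpoint accepts. The `client` argument is expected to be an `OpenAI` instance configured with the `base_url` used in the Python example.

```python
from pathlib import Path

# Hypothetical allow-list: the formats the service accepts are not stated on this page.
ASSUMED_AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a", ".flac"}


def transcribe_file(client, path, model="whisper-large-v3"):
    """Transcribe one audio file and return the recognized text.

    `client` is expected to be an openai.OpenAI instance pointed at the
    service's base URL; any object exposing the same
    audio.transcriptions.create(model=..., file=...) call will work.
    """
    audio_path = Path(path)
    if audio_path.suffix.lower() not in ASSUMED_AUDIO_SUFFIXES:
        raise ValueError(f"unexpected audio format: {audio_path.suffix!r}")
    # The context manager closes the file handle even if the request fails.
    with audio_path.open("rb") as audio_file:
        transcription = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
        )
    return transcription.text
```

With the `client` from the Python example, `transcribe_file(client, "/path/to/file/audio.mp3")` would return the same text that example prints. Passing the client in as an argument keeps the helper free of credentials and makes it easy to exercise with a stub.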
References
- OpenAI: https://openai.com/
- LLM Stats: https://llm-stats.com
- Artificial Analysis: https://artificialanalysis.ai