Speech to Text

The Speech to Text (STT) Generation API toolkit provides endpoints for models that specialize in generating text from speech. These models have been trained on extensive datasets, allowing them to interpret a variety of audio inputs and generate accurate transcriptions.

The model currently available for this is whisper-large-v3.

Here are some simple examples of how to call this API using Python and curl.

curl -s "https://api.ai.it.ufl.edu/v1/audio/transcriptions" \
-H 'Content-Type: multipart/form-data' \
-H "Authorization: Bearer $NAVIGATOR_TOOLKIT_API_KEY" \
-F file="@/path/to/file" \
-F model="whisper-large-v3" \
-F "language=en"
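
The same request can be sketched in Python. The version below is a minimal sketch that uses only the standard library and mirrors the curl call above: same URL, same `file`, `model`, and `language` form fields, and the same `NAVIGATOR_TOOLKIT_API_KEY` environment variable. The helper names `build_multipart` and `transcribe` are hypothetical, chosen for illustration. The code also assumes the endpoint replies with a JSON body, which is not stated above, so adjust the parsing if the response differs.

```python
import json
import os
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://api.ai.it.ufl.edu/v1/audio/transcriptions"


def build_multipart(fields, file_path, boundary=None):
    """Encode text fields plus one file as a multipart/form-data body.

    Hypothetical helper. Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    path = Path(file_path)
    parts = []

    # One part per plain text field (model, language, ...).
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            f"\r\n"
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))

    # The audio file goes in the "file" field, as in the curl example.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: application/octet-stream\r\n"
        f"\r\n"
    )
    parts.append(file_header.encode("utf-8") + path.read_bytes() + b"\r\n")

    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(file_path, model="whisper-large-v3", language="en"):
    """POST an audio file to the endpoint and return the decoded response.

    Assumption: the response body is JSON.
    """
    body, content_type = build_multipart(
        {"model": model, "language": language}, file_path
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": content_type,
            "Authorization": f"Bearer {os.environ['NAVIGATOR_TOOLKIT_API_KEY']}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API key exported in the environment, calling `transcribe("/path/to/file")` sends the request and returns the decoded response. A third-party HTTP client would shorten the multipart handling considerably; the standard library is used here only to keep the sketch free of extra dependencies.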