Speech to Text
The Speech to Text (STT) Generation API toolkit provides endpoints for models that specialize in generating text from speech. These models have been trained on extensive datasets, allowing them to interpret a variety of audio inputs and generate accurate transcriptions.
The model currently available for this is whisper-large-v3.
Here are some simple examples of how to call this API using curl and Python.
- curl
- python
curl -s "https://api.ai.it.ufl.edu/v1/audio/transcriptions" \
-H 'Content-Type: multipart/form-data' \
-H "Authorization: Bearer $NAVIGATOR_TOOLKIT_API_KEY" \
-F file="@/path/to/file" \
-F model="whisper-large-v3" \
-F "language=en"
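The curl call above prints the raw response body. As a minimal sketch, assuming the endpoint returns an OpenAI-style JSON object with the transcript under a "text" key (the Python example reads the same field as response.text), the body could be parsed as follows. The helper name extract_transcript and the sample body are hypothetical and used only for illustration.

```python
import json


def extract_transcript(raw_body: str) -> str:
    """Return the transcript from a JSON response body (hypothetical helper)."""
    payload = json.loads(raw_body)
    # Assumption: the transcript is stored under the "text" key.
    if "text" not in payload:
        raise ValueError(f"No 'text' field in response: {payload}")
    return payload["text"]


# Illustrative body only, not real output from the service.
sample_body = '{"text": "Hello from the transcription endpoint."}'
print(extract_transcript(sample_body))
```

Raising an error when the field is missing keeps an error payload from being mistaken for a transcript.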
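import os
import sys

from openai import OpenAI

# Read the API key from the environment; a literal "$NAVIGATOR_TOOLKIT_API_KEY"
# string would not be expanded by Python.
client = OpenAI(
    base_url="https://api.ai.it.ufl.edu/v1",
    api_key=os.environ["NAVIGATOR_TOOLKIT_API_KEY"],
)
model = "whisper-large-v3"

try:
    # Open the audio in binary mode; the with block closes it afterwards.
    with open("/path/to/file", "rb") as audio_file:
        response = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
        )
except Exception as e:
    print(f"Error: Some exception occurred: {e}")
    sys.exit(1)

try:
    print(response.text)
except AttributeError:
    print(f"Could not find text response in response object:\n {response}")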
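The curl example passes a language field while the Python example does not. The sketch below wraps the transcription call in a small function that forwards an optional language hint. The name transcribe_file and its parameters are hypothetical; it assumes only that the client passed in exposes audio.transcriptions.create, as the OpenAI client does, and that the response has a text attribute.

```python
from pathlib import Path


def transcribe_file(client, audio_path, model="whisper-large-v3", language=None):
    """Transcribe one audio file and return its text (hypothetical helper).

    client is any object exposing audio.transcriptions.create(...),
    such as an OpenAI client pointed at the base URL shown above.
    """
    # Send the optional language hint only when the caller provides one.
    extra = {}
    if language is not None:
        extra["language"] = language

    # Open in binary mode; the with block closes the file after the upload.
    with Path(audio_path).open("rb") as audio_file:
        response = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            **extra,
        )
    return response.text
```

With a configured client, transcribe_file(client, "/path/to/file", language="en") would mirror the curl request and return the transcript as a string. Taking the client as an argument keeps the function easy to exercise with a stub instead of a live request.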