Text to Speech

The Text to Speech (TTS) Generation API provides an endpoint that converts input text into audio.

The model currently available for this is kokoro.

Here are simple examples of how to call this API, first with curl and then with Python (using the openai package).

curl -s "https://api.ai.it.ufl.edu/v1/audio/speech" \
     -H 'Content-Type: application/json' \
     -H "Authorization: Bearer $NAVIGATOR_TOOLKIT_API_KEY" \
     -d '{
           "model": "kokoro",
           "input": "Hello. I am Navigator Toolkit. Nice to meet you.",
           "voice": "af_heart",
           "response_format": "mp3",
           "speed": 1.0
     }' \
     --output out.mp3

import os

from openai import OpenAI

# Read the API key from the environment; a literal "$NAVIGATOR_TOOLKIT_API_KEY"
# string is not expanded by Python.
client = OpenAI(
    base_url="https://api.ai.it.ufl.edu/v1",
    api_key=os.environ["NAVIGATOR_TOOLKIT_API_KEY"],
)

model = "kokoro"
voice = "af_heart"
prompt = "Hello. I am Navigator Toolkit. Nice to meet you."
speed = 1.0
output = "output.mp3"

# Request the synthesized audio.
try:
    response = client.audio.speech.create(
        model=model,
        voice=voice,
        input=prompt,
        response_format="mp3",
        speed=speed,
    )
except Exception as e:
    print(f"Error: Some exception occurred during creation request: {e}")
    raise SystemExit(1)

# Save the returned audio bytes to disk.
try:
    response.write_to_file(output)
except Exception as e:
    print(f"Failed to write output to file: {e}")
    raise SystemExit(1)
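
Where installing the openai package is not an option, the same request as the curl example can be sketched with only the Python standard library. This is a minimal sketch, not an official client: the function names build_speech_request and synthesize_to_file and the 60-second timeout are hypothetical choices made for illustration, while the URL, headers, and JSON fields are taken from the curl example above.

```python
import json
import urllib.request

API_URL = "https://api.ai.it.ufl.edu/v1/audio/speech"


def build_speech_request(text, api_key, model="kokoro", voice="af_heart",
                         response_format="mp3", speed=1.0):
    """Build (but do not send) a POST request mirroring the curl example."""
    # Same JSON fields as the curl request body.
    body = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def synthesize_to_file(text, path, api_key):
    """Send the request and write the returned audio bytes to path."""
    request = build_speech_request(text, api_key)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    # The timeout value is an assumption, not a documented limit.
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
```

A call such as synthesize_to_file("Hello. I am Navigator Toolkit. Nice to meet you.", "out.mp3", os.environ["NAVIGATOR_TOOLKIT_API_KEY"]) (after import os) would then write the audio to out.mp3. Keeping request construction separate from sending makes it possible to inspect the payload without touching the network.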