Kokoro
Approved Data Classifications
Description
Kokoro is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects.
For a list of voices and languages available for this model, please see: VOICES
The description above is sourced from the model card on Hugging Face: https://huggingface.co/hexgrad/Kokoro-82M
Capabilities
Model | Training Data | Input | Output | Context Length | Cost (per minute of audio) |
---|---|---|---|---|---|
kokoro | Jan 2025 | Text | Audio | n/a | $0.006/minute |
info
- Pricing is based on one minute of audio
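As a rough illustration of the per-minute pricing, the sketch below estimates the cost of a clip from its duration. It assumes billing scales linearly with audio length (no rounding up to whole minutes), which this page does not confirm. The function name `estimate_cost_usd` is hypothetical.

```python
# Price taken from the Capabilities table: $0.006 per minute of audio.
PRICE_PER_MINUTE_USD = 0.006


def estimate_cost_usd(audio_seconds: float) -> float:
    """Estimate the cost of generated audio.

    Assumption: cost is pro-rated linearly by duration. The actual
    billing granularity is not stated in this document.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds / 60 * PRICE_PER_MINUTE_USD


# Example: a 10-minute narration.
print(f"Estimated cost: ${estimate_cost_usd(10 * 60):.3f}")
```

Under this assumption, an hour of audio would come to roughly 60 times the per-minute price.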
Availability
Cloud Provider
Usage
- curl
- python
curl https://api.ai.it.ufl.edu/v1/audio/speech \
  -H "Authorization: Bearer <API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "input": "I am an AI assistant here to help",
    "voice": "af_heart",
    "response_format": "mp3",
    "speed": 1.0
  }' \
  --output output.mp3
from openai import OpenAI

output_file = "output.mp3"

client = OpenAI(
    api_key="your_api_key",
    base_url="https://api.ai.it.ufl.edu/v1",
)

# Stream the synthesized audio directly to a file on disk.
with client.audio.speech.with_streaming_response.create(
    model="kokoro",
    voice="af_heart",  # a single voice or a combination of voice packs
    input="I am an AI assistant here to help.",
    speed=1.0,  # numeric value, matching the curl example
) as response:
    response.stream_to_file(output_file)
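If the `openai` package is not available, the same request can be made with the Python standard library alone. The following is a minimal sketch that mirrors the fields of the curl example. The helper names and the `UFL_AI_API_KEY` environment variable are hypothetical choices for illustration, not part of the service.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.ai.it.ufl.edu/v1/audio/speech"


def build_speech_request(text, api_key, voice="af_heart", speed=1.0):
    """Build (but do not send) a POST request for the speech endpoint."""
    payload = {
        "model": "kokoro",
        "input": text,
        "voice": voice,
        "response_format": "mp3",
        "speed": float(speed),  # sent as a JSON number, as in the curl example
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text, path="output.mp3"):
    """Send the request and write the returned audio bytes to `path`."""
    # UFL_AI_API_KEY is a hypothetical variable name chosen for this sketch.
    api_key = os.environ["UFL_AI_API_KEY"]
    request = build_speech_request(text, api_key)
    # urlopen raises urllib.error.HTTPError on non-2xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
    return path
```

Calling `synthesize_to_file("I am an AI assistant here to help.")` with the environment variable set would write the audio to `output.mp3`. Keeping request construction separate from sending makes the payload easy to inspect without touching the network.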