Image to Text
NaviGator Toolkit lets an LLM accept an image as input to a chat conversation. The LLM can then analyze the image and answer questions about it.
Quickstart
The following example shows a Python script that passes the URL of an image to the LLM and asks what the image contains. The image is the seal of the University of Florida.
- python
from openai import OpenAI

# Set your NaviGator Toolkit API key and base URL here
api_key = "sk-XXXXXXXX"  # Replace with your NaviGator Toolkit API key
base_url = "https://api.ai.it.ufl.edu/v1/"  # Base URL of the OpenAI-compatible NaviGator Toolkit API

# Initialize the OpenAI client, pointed at the NaviGator Toolkit endpoint
client = OpenAI(api_key=api_key, base_url=base_url)
response = client.chat.completions.create(
    model="mistral-small-3.1",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/en/thumb/6/6d/University_of_Florida_seal.svg/1280px-University_of_Florida_seal.svg.png"
                    },
                },
            ],
        }
    ],
)
print(response.choices[0])
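The Quickstart uses a publicly reachable image URL. For an image stored on disk, OpenAI-compatible chat endpoints commonly accept a base64 data URL in the same `image_url` field. The sketch below assumes the NaviGator Toolkit endpoint and the chosen model accept data URLs in this way, which this page does not confirm. The helper names (`image_to_data_url`, `build_image_message`, `describe_image`) and the file name `uf_seal.png` are hypothetical and introduced only for illustration. The last helper also returns only the reply text, rather than the whole choice object that the Quickstart prints.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path):
    """Encode a local image file as a base64 data URL."""
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_image_message(prompt, image_url):
    """Build a user message pairing a text prompt with one image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def describe_image(client, path, prompt="What's in this image?",
                   model="mistral-small-3.1"):
    """Send a local image to the model and return only the reply text.

    Assumption: the endpoint accepts base64 data URLs for image input.
    """
    message = build_image_message(prompt, image_to_data_url(path))
    response = client.chat.completions.create(model=model, messages=[message])
    return response.choices[0].message.content


# Usage, with `client` created as in the Quickstart and a hypothetical file:
# print(describe_image(client, "uf_seal.png"))
```

Passing the client in as an argument keeps the helpers free of credentials, so the same functions work with any client configured for the NaviGator Toolkit base URL.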