
Image to Text

NaviGator Toolkit allows an LLM to accept an image as input. The model can then analyze the image and answer questions about it.

The image can be provided to the model either as a base64-encoded string embedded in the request, or as a URL, in which case the model fetches the image first.
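To make the two forms concrete, here is a minimal sketch of the two Chat Completions content parts; the URL, file bytes, and image type below are placeholder assumptions, not values from the service:

```python
import base64

# Form 1: a plain URL -- the model fetches the image itself
url_part = {
    "type": "image_url",
    "image_url": {"url": "https://example.com/seal.png"},  # placeholder URL
}

# Form 2: the image bytes, base64-encoded into a data URL
image_bytes = b"\x89PNG..."  # in practice: open("photo.png", "rb").read()
b64 = base64.b64encode(image_bytes).decode("utf-8")
inline_part = {
    "type": "image_url",
    "image_url": {"url": f"data:image/png;base64,{b64}"},
}

print(inline_part["image_url"]["url"][:22])  # -> data:image/png;base64,
```

Either part can be placed alongside a text part in the same user message, as the examples below show.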

Image to text can be achieved by either using the OpenAI Chat Completions API or the OpenAI Responses API.

Chat Completions API - URL

The following example shows how to write a Python script that passes the URL of an image and asks the LLM what the image contains. The example image is the seal of the University of Florida.

  from openai import OpenAI

  # Set your OpenAI API key and base URL here
  api_key = "sk-XXXXXXXX" # Replace with your OpenAI API key
  base_url = "https://api.ai.it.ufl.edu/v1/" # Base URL for OpenAI API

  # Initialize the OpenAI API client
  client = OpenAI(api_key=api_key, base_url=base_url)

  response = client.chat.completions.create(
      model="mistral-small-3.1",
      messages=[
          {
              "role": "user",
              "content": [
                  {"type": "text", "text": "What's in this image?"},
                  {
                      "type": "image_url",
                      "image_url": {
                          "url": "https://upload.wikimedia.org/wikipedia/en/thumb/6/6d/University_of_Florida_seal.svg/1280px-University_of_Florida_seal.svg.png"
                      }
                  },
              ],
          }
      ],
  )

  # Print just the model's reply text
  print(response.choices[0].message.content)

Chat Completions API - Inline

The following example shows how to write a Python script that base64-encodes a local image file and sends it inline with the message.

This call requires the following information to be filled in:

  • PATH_TO_IMAGE - the path to the image file you wish to upload
  • IMAGE_TYPE - the type of the image; valid options are: jpeg, png, gif (non-animated), webp
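If you would rather not hard-code IMAGE_TYPE, one option (a sketch, not part of the example below) is to derive it from the file name with Python's standard mimetypes module:

```python
import mimetypes

def image_subtype(path: str) -> str:
    """Guess the data-URL subtype (e.g. 'png') from a file name."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Cannot determine image type for {path}")
    # "image/png" -> "png"
    return mime.split("/", 1)[1]

print(image_subtype("photo.png"))  # -> png
print(image_subtype("scan.jpg"))   # -> jpeg
```

The returned subtype can then be interpolated into the data URL in place of IMAGE_TYPE.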
  import base64

  from openai import OpenAI

  # Set your OpenAI API key and base URL here
  api_key = "sk-XXXXXXXX" # Replace with your OpenAI API key
  base_url = "https://api.ai.it.ufl.edu/v1/" # Base URL for OpenAI API

  # Initialize the OpenAI API client
  client = OpenAI(api_key=api_key, base_url=base_url)

  # Read the image file and base64 encode its contents
  with open(PATH_TO_IMAGE, "rb") as image_file:
      b64Image = base64.b64encode(image_file.read()).decode("utf-8")

  try:
      response = client.chat.completions.create(
          model="mistral-small-3.1",
          messages=[
              {
                  "role": "user",
                  "content": [
                      {
                          "type": "text",
                          "text": "What's in this image?"
                      },
                      {
                          "type": "image_url",
                          "image_url": {
                              "url": f"data:image/IMAGE_TYPE;base64,{b64Image}"
                          }
                      },
                  ],
              }
          ],
      )
      print(response.choices[0].message.content)
  except Exception as e:
      print(f"Exception caught: {e}")
      exit(1)

Responses API - URL

In the following example you provide a URL, and the LLM fetches and analyzes the image based on your prompt.

This call requires the following information to be filled in:

  • URL - the URL of the image that you would like the LLM to retrieve and analyze
  from openai import OpenAI

  # Set your OpenAI API key and base URL here
  api_key = "sk-XXXXXXXX" # Replace with your OpenAI API key
  base_url = "https://api.ai.it.ufl.edu/v1/" # Base URL for OpenAI API

  prompt = "What is in this image?"

  image_url = "URL"

  # Initialize the OpenAI API client
  client = OpenAI(api_key=api_key, base_url=base_url)

  response = client.responses.create(
      model="gpt-5-mini",
      input=[
          {
              "role": "user",
              "content": [
                  {
                      "type": "input_text",
                      "text": prompt
                  },
                  {
                      "type": "input_image",
                      "image_url": image_url
                  }
              ]
          }
      ]
  )

  response_id = response.id

  # Retrieve the stored response by ID, print it, then clean it up
  retrieved_response = client.responses.retrieve(response_id)
  print(f"Response text is: {retrieved_response.output_text}")
  delete_response = client.responses.delete(response_id)

Local models do not support retrieving the response via the response ID; for those, use response.output_text directly and skip deleting the response.
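One way to handle both kinds of backend in a single script (a sketch under the assumptions above, not an official helper) is to attempt the retrieval and fall back to the response object you already have:

```python
def get_output_text(client, response):
    """Return the response text, falling back for backends that do not
    support retrieval by response ID (e.g. local models)."""
    try:
        retrieved = client.responses.retrieve(response.id)
        text = retrieved.output_text
        client.responses.delete(response.id)  # clean up when supported
        return text
    except Exception:
        # Fallback: the text is already on the response object itself
        return response.output_text
```

You would call this as get_output_text(client, response) in place of the retrieve/print/delete lines above.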

Responses API - Inline

In the following example you provide an image inline, and the LLM analyzes it based on your prompt.

This call requires the following information to be filled in:

  • PATH_TO_IMAGE - the path to the image file you wish to upload
  • IMAGE_TYPE - the type of the image; valid options are: jpeg, png, gif (non-animated), webp
  import base64

  from openai import OpenAI

  # Set your OpenAI API key and base URL here
  api_key = "sk-XXXXXXXX" # Replace with your OpenAI API key
  base_url = "https://api.ai.it.ufl.edu/v1/" # Base URL for OpenAI API

  prompt = "What is in this image?"

  # Read the image file and base64 encode its contents
  image = "PATH_TO_IMAGE"
  with open(image, "rb") as image_file:
      image_contents = base64.b64encode(image_file.read()).decode("utf-8")

  # Initialize the OpenAI API client
  client = OpenAI(api_key=api_key, base_url=base_url)

  response = client.responses.create(
      model="gpt-5-mini",
      input=[
          {
              "role": "user",
              "content": [
                  {
                      "type": "input_text",
                      "text": prompt
                  },
                  {
                      "type": "input_image",
                      "image_url": f"data:image/IMAGE_TYPE;base64,{image_contents}"
                  }
              ]
          }
      ]
  )

  response_id = response.id

  # Retrieve the stored response by ID, print it, then clean it up
  retrieved_response = client.responses.retrieve(response_id)
  print(f"Response text is: {retrieved_response.output_text}")
  delete_response = client.responses.delete(response_id)

Local models do not support retrieving the response via the response ID; for those, use response.output_text directly and skip deleting the response.