Seeing
Add vision to your app. Send an image to the API and get back a description of what it contains. You can customize the prompt and system prompt to get responses tailored to your use case.
The API only accepts images that are valid Base64-encoded data URIs.
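As a minimal sketch of preparing an image for this endpoint, the Python below turns raw image bytes into a Base64 data URI. The helper names (`bytes_to_data_uri`, `file_to_data_uri`) and the extension-to-MIME table are illustrative assumptions, not part of the API. The size check also assumes the 25MB limit applies to the raw file rather than the encoded string, which the reference does not specify.

```python
import base64
from pathlib import Path

# Assumed mapping from the accepted file extensions to common MIME types.
MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".bmp": "image/bmp",
    ".svg": "image/svg+xml",
    ".heic": "image/heic",
}

# 25MB limit from the reference; assumed here to mean the raw file size.
MAX_IMAGE_BYTES = 25 * 1024 * 1024


def bytes_to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data URI of the form data:<mime>;base64,<payload>."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def file_to_data_uri(path: str) -> str:
    """Read an image file and return it as a Base64 data URI."""
    file_path = Path(path)
    mime_type = MIME_TYPES.get(file_path.suffix.lower())
    if mime_type is None:
        raise ValueError(f"Unsupported image extension: {file_path.suffix!r}")
    data = file_path.read_bytes()
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("Image exceeds the 25MB limit")
    return bytes_to_data_uri(data, mime_type)
```

The resulting string is what goes in the `image` field of the request body.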
API Reference
POST https://api.geppetto.app/see
Request Body
image
fileURI
Required
The image to see. Must be a valid Base64 Data URI. Max size 25MB.
Accepts: jpg, png, gif, webp, bmp, svg, and heic
prompt
string
Optional
Default: Describe this image in detail
The question or instruction to ask about the image
system_prompt
string
Optional
System prompt to give the model context or instructions
stream
boolean
Optional
Default: false
Whether the response should be streamed
temperature
number
Optional
Default: 0.2
The sampling temperature. Lower values make the output more deterministic, and vision models tend to perform better with lower temperatures.
max_tokens
number
Optional
Default: 200
The maximum number of tokens to generate
presence_penalty
number
Optional
Default: 0
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
frequency_penalty
number
Optional
Default: 0
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
top_p
number
Optional
Default: 1
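The nucleus sampling threshold. The model samples only from the smallest set of tokens whose cumulative probability reaches top_p, so 1 considers all tokens.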
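The parameters above can be assembled into a request body along these lines. This is a sketch only: `build_see_payload` is a hypothetical helper, the client-side range check simply mirrors the documented -2.0 to 2.0 bounds, and sending the body as a JSON object is an assumption, since this section does not state the content type.

```python
from typing import Any, Dict, Optional


def build_see_payload(
    image: str,
    prompt: Optional[str] = None,
    system_prompt: Optional[str] = None,
    stream: bool = False,
    temperature: float = 0.2,
    max_tokens: int = 200,
    presence_penalty: float = 0.0,
    frequency_penalty: float = 0.0,
    top_p: float = 1.0,
) -> Dict[str, Any]:
    """Assemble a request body using the documented field names and defaults."""
    # Cheap shape check only; it does not verify that the Base64 payload decodes.
    if not image.startswith("data:") or ";base64," not in image:
        raise ValueError("image must be a Base64 data URI")

    # The reference documents a -2.0 to 2.0 range for both penalties.
    for name, value in (
        ("presence_penalty", presence_penalty),
        ("frequency_penalty", frequency_penalty),
    ):
        if not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")

    payload: Dict[str, Any] = {
        "image": image,
        "stream": stream,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
        "top_p": top_p,
    }
    # Optional prompts are left out when unset so the server-side defaults apply.
    if prompt is not None:
        payload["prompt"] = prompt
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return payload
```

Leaving `prompt` unset keeps the documented default, "Describe this image in detail".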
Returns
When stream is false, the response is a JSON object of the following shape.
{
"content": string
}
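A non-streaming call might look like the sketch below, which uses only the Python standard library. It assumes the request body is sent as JSON with a `Content-Type: application/json` header, which this section does not confirm. Authentication is not described here either, so none is shown. The function names are illustrative.

```python
import json
import urllib.request
from typing import Any, Dict

SEE_URL = "https://api.geppetto.app/see"


def build_see_request(payload: Dict[str, Any]) -> urllib.request.Request:
    """Build a POST request for the endpoint (JSON body is an assumption)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        SEE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_see_response(raw: bytes) -> str:
    """Extract the description from a non-streaming response body."""
    return json.loads(raw.decode("utf-8"))["content"]


def see(payload: Dict[str, Any], timeout: float = 60.0) -> str:
    """Send the request and return the "content" string from the response."""
    request = build_see_request(payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_see_response(response.read())
```

Calling `see({"image": data_uri})` with a valid data URI would return the description text, provided the assumptions above hold for your account.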
When stream is true, the response is streamed with Transfer-Encoding: chunked. Each chunk is a JSON object of the following shape.
{
"content": string,
"stop": boolean
}
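One way to consume the streamed form is sketched below. It makes two assumptions the reference does not spell out: that the JSON objects arrive back to back and may be split across network reads, and that `"stop": true` marks the final chunk. The `iter_stream_content` name is illustrative.

```python
import codecs
import json
from typing import Iterable, Iterator


def iter_stream_content(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield each "content" piece from a stream of raw byte chunks."""
    # Incremental decoding copes with multi-byte characters split across reads.
    utf8 = codecs.getincrementaldecoder("utf-8")()
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        buffer += utf8.decode(chunk)
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                obj, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                # Incomplete object so far; wait for more bytes.
                break
            buffer = buffer[end:]
            if obj.get("content"):
                yield obj["content"]
            if obj.get("stop"):
                # Assumed to signal the end of the stream.
                return
```

The iterable can be any source of raw bytes read from the response body. Joining the yielded pieces reconstructs the full description.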