llava-1.5-7b-hf

Beta

Model ID: @cf/llava-hf/llava-1.5-7b-hf

LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture.

Properties

Task Type: Image-to-Text

Code Examples

Workers - TypeScript

export interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Fetch a sample image and read it as raw bytes.
    const res = await fetch("https://cataas.com/cat");
    const blob = await res.arrayBuffer();

    // The model expects the image as an array of byte values.
    const input = {
      image: [...new Uint8Array(blob)],
      prompt: "Generate a caption for this image",
      max_tokens: 512,
    };

    const response = await env.AI.run(
      "@cf/llava-hf/llava-1.5-7b-hf",
      input
    );

    return new Response(JSON.stringify(response));
  },
} satisfies ExportedHandler<Env>;

API Schema

The following schemas are based on JSON Schema.

Input JSON Schema

{
  "oneOf": [
    {
      "type": "string",
      "format": "binary"
    },
    {
      "type": "object",
      "properties": {
        "image": {
          "type": "array",
          "items": {
            "type": "number"
          }
        },
        "prompt": {
          "type": "string"
        },
        "max_tokens": {
          "type": "integer",
          "default": 512
        }
      }
    }
  ]
}
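
As a rough illustration of the object form of the input schema, the sketch below converts raw image bytes into the number array the schema describes and applies the documented `max_tokens` default of 512. The names `LlavaInput` and `buildLlavaInput` are hypothetical and are not part of the Workers AI API; the integer check simply mirrors the schema's `"type": "integer"`.

```typescript
// Hypothetical type mirroring the object branch of the input schema.
interface LlavaInput {
  image: number[];
  prompt?: string;
  max_tokens?: number;
}

// Hypothetical helper (not part of Workers AI): builds an input object
// from raw image bytes, a prompt, and an optional token limit.
function buildLlavaInput(
  bytes: ArrayBuffer | Uint8Array,
  prompt: string,
  maxTokens: number = 512 // default taken from the schema
): LlavaInput {
  // The schema declares max_tokens as an integer.
  if (!Number.isInteger(maxTokens)) {
    throw new RangeError("max_tokens must be an integer");
  }
  const view = bytes instanceof Uint8Array ? bytes : new Uint8Array(bytes);
  // Each byte becomes one number in the image array.
  return { image: Array.from(view), prompt, max_tokens: maxTokens };
}
```

The returned object could then be passed as the second argument to `env.AI.run`, as in the Workers example.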

Output JSON Schema

{
  "type": "object",
  "contentType": "application/json",
  "properties": {
    "description": {
      "type": "string"
    }
  }
}
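
A minimal sketch of reading the output, assuming a successful response always carries a string `description` (the schema lists the property but does not mark it as required). The names `LlavaOutput`, `isLlavaOutput`, and `extractDescription` are hypothetical helpers, not part of the Workers AI API.

```typescript
// Hypothetical type mirroring the output schema.
interface LlavaOutput {
  description: string;
}

// Type guard: true only for a non-null object with a string `description`.
function isLlavaOutput(value: unknown): value is LlavaOutput {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Record<string, unknown>).description === "string"
  );
}

// Returns the generated text, or throws if the shape is unexpected.
function extractDescription(value: unknown): string {
  if (!isLlavaOutput(value)) {
    throw new TypeError("response does not contain a string `description`");
  }
  return value.description;
}
```

In a Worker, this might be applied to the value returned by `env.AI.run` before building the HTTP response, so that an unexpected shape fails loudly instead of being serialized as-is.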
