llava-1.5-7b-hf
Model ID: @cf/llava-hf/llava-1.5-7b-hf
LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture.
 Properties
Task Type: Image-to-Text
 Code Examples
Workers - TypeScript
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Fetch a sample image and read it as raw bytes.
    const res = await fetch("https://cataas.com/cat");
    const blob = await res.arrayBuffer();

    // The image is passed to the model as an array of byte values.
    const input = {
      image: [...new Uint8Array(blob)],
      prompt: "Generate a caption for this image",
      max_tokens: 512,
    };

    const response = await env.AI.run("@cf/llava-hf/llava-1.5-7b-hf", input);
    return new Response(JSON.stringify(response));
  },
} satisfies ExportedHandler<Env>;
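The input object in the Worker example above can be built by a small helper, which keeps the byte conversion in one place and rejects an empty image before the model is called. This is a minimal sketch, not part of the Workers AI API: the names ImageToTextInput and buildImageToTextInput are hypothetical, and the field names and default values are taken only from the example above.

```typescript
// Hypothetical input shape, inferred from the Worker example above.
interface ImageToTextInput {
  image: number[];
  prompt: string;
  max_tokens: number;
}

// Hypothetical helper: converts raw image bytes into the input object.
// The defaults mirror the values used in the Worker example.
function buildImageToTextInput(
  bytes: ArrayBuffer,
  prompt: string = "Generate a caption for this image",
  maxTokens: number = 512,
): ImageToTextInput {
  if (bytes.byteLength === 0) {
    throw new Error("Image data is empty");
  }
  return {
    // Array.from yields one number (0-255) per byte.
    image: Array.from(new Uint8Array(bytes)),
    prompt,
    max_tokens: maxTokens,
  };
}
```

Inside the fetch handler, the result could then be passed straight to env.AI.run, for example env.AI.run("@cf/llava-hf/llava-1.5-7b-hf", buildImageToTextInput(blob)).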
 API Schema
The following schema is based on JSON Schema.
Input JSON Schema
Output JSON Schema