Available now on ModelsLab · Language Model

Meta Llama 3.2 90B Vision Instruct Turbo

Vision Meets Reasoning

Process Text and Images

Multimodal Input

Handle Images Plus Text

Accepts text and images as input and generates text output, powered by 90B parameters.

128K Context

Extended Token Window

Support 128,000 tokens for complex visual reasoning and long conversations.

Visual Reasoning

Analyze Charts and Graphs

Extract insights from charts, graphs, and documents with image reasoning.

Examples

See what Meta Llama 3.2 90B Vision Instruct Turbo can create

Copy any prompt below and try it yourself in the playground.

Chart Analysis

Analyze this sales chart image. Identify the month with highest revenue and explain the trend.

Document QA

Examine this invoice image. Extract total amount, date, and vendor details in structured JSON.

Image Caption

Provide a detailed caption for this architectural blueprint image, noting key structures and measurements.

Graph Reasoning

Review this line graph image. Summarize growth patterns and predict next quarter based on data.

For Developers

A few lines of code.
Vision inference. One call.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
import requests

response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "",           # your text prompt
        "model_id": ""          # ID of the model to run
    },
)
print(response.json())

FAQ

Common questions about Meta Llama 3.2 90B Vision Instruct Turbo

Read the docs

What is Meta Llama 3.2 90B Vision Instruct Turbo?

A 90B-parameter multimodal LLM that processes text and images and generates text output. It is optimized for visual recognition, reasoning, and captioning, and supports a 128K context length.

How do I send images to the model?

Send base64-encoded images with text prompts via the LLM endpoint; the model returns text responses. PNG and JPG images up to 5MB are supported.
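The base64 workflow above can be sketched in Python. The `key`, `prompt`, and `model_id` fields match the snippet in the developer section; the `image` field name and the helper `build_vision_payload` are assumptions for illustration, so check the ModelsLab API docs for the exact request schema.

```python
import base64
import json


def build_vision_payload(api_key: str, prompt: str, image_bytes: bytes,
                         model_id: str) -> dict:
    """Build a request payload with a base64-encoded image.

    The "image" field name is an assumption for illustration; consult
    the ModelsLab docs for the field your endpoint actually expects.
    """
    if len(image_bytes) > 5 * 1024 * 1024:  # endpoint accepts PNG/JPG up to 5MB
        raise ValueError("image exceeds 5MB limit")
    return {
        "key": api_key,
        "prompt": prompt,
        "model_id": model_id,
        # base64-encode the raw image bytes into an ASCII string
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }


# Illustrative bytes only; in practice read a real PNG/JPG from disk
payload = build_vision_payload(
    "YOUR_API_KEY",
    "Analyze this sales chart image. Identify the month with highest revenue.",
    b"\x89PNG\r\n",  # placeholder bytes, not a real image
    "YOUR_MODEL_ID",
)
print(json.dumps(payload)[:60])
```

From there, the payload is sent with `requests.post(..., json=payload)` exactly as in the developer snippet.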

What are common use cases?

Image captioning, visual QA, chart analysis, and document understanding. It competes with GPT-4o-mini on benchmarks and includes multilingual text support.

How does it compare to other vision models?

It is an open-source option with strong vision performance on MMMU and MathVista, matching leading models in image reasoning, and is ready for commercial use.

What input and output formats are supported?

Text plus base64-encoded images for API calls, or text only for multilingual tasks. Maximum response length is 4K tokens on-demand.

How is it deployed on ModelsLab?

It is available via LLM endpoints for inference, with support for streaming, function calling, and JSON mode, for both edge and cloud deployments.
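Since the endpoints support streaming, responses can be consumed incrementally. A minimal sketch of a streaming consumer is below; the server-sent-event framing (`data: {...}` lines terminated by `data: [DONE]`) and the `output` field in each chunk are assumptions about the wire format, so adapt them to the actual streaming schema in the ModelsLab docs.

```python
import json


def iter_stream_chunks(lines):
    """Parse server-sent-event lines into text deltas.

    Assumes each event looks like b'data: {"output": "..."}' and the
    stream ends with b'data: [DONE]' -- an illustrative schema, not
    the confirmed ModelsLab format.
    """
    for raw in lines:
        if not raw or not raw.startswith(b"data: "):
            continue  # skip keep-alives and blank lines
        body = raw[len(b"data: "):]
        if body.strip() == b"[DONE]":
            break  # end-of-stream sentinel
        yield json.loads(body).get("output", "")


# Simulated stream for illustration; in practice you would iterate over
# requests.post(..., stream=True).iter_lines()
fake = [
    b'data: {"output": "Revenue peaked "}',
    b'data: {"output": "in March."}',
    b"data: [DONE]",
]
print("".join(iter_stream_chunks(fake)))  # prints: Revenue peaked in March.
```

Separating the parsing from the network call keeps the chunk-handling logic easy to test without hitting the API.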

Ready to create?

Start generating with Meta Llama 3.2 90B Vision Instruct Turbo on ModelsLab.