Available now on ModelsLab · Language Model

Meta Llama 3.2 90B Vision Instruct Turbo

Vision Meets Reasoning

Process Text and Images

Multimodal Input

Handle Images Plus Text

Accepts text and images as input and generates text output, powered by 90B parameters.

128K Context

Extended Token Window

Support 128,000 tokens for complex visual reasoning and long conversations.

Visual Reasoning

Analyze Charts and Graphs

Extract insights from charts, graphs, and documents with image reasoning.

Examples

See what Meta Llama 3.2 90B Vision Instruct Turbo can create

Copy any prompt below and try it yourself in the playground.

Chart Analysis

Analyze this sales chart image. Identify the month with highest revenue and explain the trend.

Document QA

Examine this invoice image. Extract total amount, date, and vendor details in structured JSON.

Image Caption

Provide a detailed caption for this architectural blueprint image, noting key structures and measurements.

Graph Reasoning

Review this line graph image. Summarize growth patterns and predict next quarter based on data.

For Developers

A few lines of code.
Vision inference. One call.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
import requests

response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "",           # your text prompt
        "model_id": ""          # ID of the model to run
    },
)
print(response.json())

FAQ

Common questions about Meta Llama 3.2 90B Vision Instruct Turbo

Read the docs

What is Meta Llama 3.2 90B Vision Instruct Turbo?

A 90B-parameter multimodal LLM that processes text and images and generates text output. It is optimized for visual recognition, reasoning, and captioning, and supports a 128K context length.

How do I send images to the model?

Send base64-encoded images with text prompts via the LLM endpoint; the model returns text responses. PNG and JPG images up to 5MB are supported.
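The base64 workflow above can be sketched in Python. The `key`, `prompt`, and `model_id` fields match the snippet in the developer section; the `image` field name and the helper `build_vision_payload` are assumptions for illustration, so check the ModelsLab API docs for the exact request schema.

```python
import base64
import json


def build_vision_payload(api_key: str, prompt: str, image_bytes: bytes,
                         model_id: str) -> dict:
    """Build a request payload with a base64-encoded image.

    The "image" field name is an assumption for illustration; consult
    the ModelsLab docs for the field your endpoint actually expects.
    """
    if len(image_bytes) > 5 * 1024 * 1024:  # endpoint accepts PNG/JPG up to 5MB
        raise ValueError("image exceeds 5MB limit")
    return {
        "key": api_key,
        "prompt": prompt,
        "model_id": model_id,
        # base64-encode the raw image bytes into an ASCII string
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }


# Illustrative bytes only; in practice read a real PNG/JPG from disk
payload = build_vision_payload(
    "YOUR_API_KEY",
    "Analyze this sales chart image. Identify the month with highest revenue.",
    b"\x89PNG\r\n",  # placeholder bytes, not a real image
    "YOUR_MODEL_ID",
)
print(json.dumps(payload)[:60])
```

From there, the payload is sent with `requests.post(..., json=payload)` exactly as in the developer snippet.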

What are common use cases?

Image captioning, visual QA, chart analysis, and document understanding. It competes with GPT-4o-mini on benchmarks and includes multilingual text support.

How does it compare to other vision models?

It is an open-source option with strong vision performance on MMMU and MathVista, matching leading models in image reasoning, and is ready for commercial use.

What input and output formats are supported?

Text plus base64-encoded images for API calls, or text only for multilingual tasks. Maximum response length is 4K tokens on-demand.

How is it deployed on ModelsLab?

It is available via LLM endpoints for inference, with support for streaming, function calling, and JSON mode, for both edge and cloud deployments.
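Since the endpoints support streaming, responses can be consumed incrementally. A minimal sketch of a streaming consumer is below; the server-sent-event framing (`data: {...}` lines terminated by `data: [DONE]`) and the `output` field in each chunk are assumptions about the wire format, so adapt them to the actual streaming schema in the ModelsLab docs.

```python
import json


def iter_stream_chunks(lines):
    """Parse server-sent-event lines into text deltas.

    Assumes each event looks like b'data: {"output": "..."}' and the
    stream ends with b'data: [DONE]' -- an illustrative schema, not
    the confirmed ModelsLab format.
    """
    for raw in lines:
        if not raw or not raw.startswith(b"data: "):
            continue  # skip keep-alives and blank lines
        body = raw[len(b"data: "):]
        if body.strip() == b"[DONE]":
            break  # end-of-stream sentinel
        yield json.loads(body).get("output", "")


# Simulated stream for illustration; in practice you would iterate over
# requests.post(..., stream=True).iter_lines()
fake = [
    b'data: {"output": "Revenue peaked "}',
    b'data: {"output": "in March."}',
    b"data: [DONE]",
]
print("".join(iter_stream_chunks(fake)))  # prints: Revenue peaked in March.
```

Separating the parsing from the network call keeps the chunk-handling logic easy to test without hitting the API.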

Ready to create?

Start generating with Meta Llama 3.2 90B Vision Instruct Turbo on ModelsLab.