Llama 4 Maverick Instruct (17Bx128E)

meta-llama-Llama-4-Maverick-17B-128E-Instruct-FP8
meta · Closed Source Model · $0.56 per million tokens


About Llama 4 Maverick Instruct (17Bx128E)

Llama 4 Maverick Instruct (17Bx128E) is a multimodal mixture-of-experts (MoE) LLM with 17B active parameters (400B total) routed across 128 experts. It offers native image support and a 1M-token context window, and excels at coding, multilingual tasks, and enterprise document intelligence.

Technical Specifications

Model ID
meta-llama-Llama-4-Maverick-17B-128E-Instruct-FP8
Category
LLM Models
Task
Text Generation
Price
$0.56 per million tokens
Added
July 22, 2025

Key Features

  • Chat completion and multi-turn conversation API
  • Streaming response with token-by-token output
  • Function calling and tool use support
  • System prompts and role-based messaging
  • JSON mode and structured output
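
The streaming feature above can be sketched as a small payload builder. This is a minimal illustration, not a confirmed part of the ModelsLab API: it assumes the endpoint accepts an OpenAI-style `stream` flag in the request body, and the `build_chat_payload` helper is hypothetical. Check the API documentation for the exact parameter name.

```python
import json

API_URL = "https://modelslab.com/api/v7/llm/chat/completions"

def build_chat_payload(prompt, api_key, stream=False, max_tokens=1000):
    """Build a request body for the chat completions endpoint."""
    payload = {
        "model_id": "meta-llama-Llama-4-Maverick-17B-128E-Instruct-FP8",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "key": api_key,
    }
    if stream:
        # Hypothetical flag: assumes the endpoint follows the common
        # OpenAI-style "stream" parameter for token-by-token output.
        payload["stream"] = True
    return payload

payload = build_chat_payload("Hello!", "YOUR_API_KEY", stream=True)
print(json.dumps(payload, indent=2))
```

The payload is then sent with `requests.post(API_URL, json=payload)` exactly as in the Quick Start below; for a streamed response you would read the body incrementally instead of calling `.json()` once.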

Quick Start

Integrate Llama 4 Maverick Instruct (17Bx128E) into your application with a single API call. Get your API key from the pricing page to get started.

import requests
import json

url = "https://modelslab.com/api/v7/llm/chat/completions"
headers = {
    "Content-Type": "application/json"
}
data = {
    "model_id": "meta-llama-Llama-4-Maverick-17B-128E-Instruct-FP8",
    "messages": [
        {
            "role": "user",
            "content": "Hello!"
        }
    ],
    "max_tokens": 1000,
    "key": "YOUR_API_KEY"
}

try:
    response = requests.post(url, headers=headers, json=data)
    response.raise_for_status()  # Raises an HTTPError for bad responses (4XX or 5XX)
    result = response.json()
    print("API Response:")
    print(json.dumps(result, indent=2))
except requests.exceptions.HTTPError as http_err:
    print(f"HTTP error occurred: {http_err} - {response.text}")
except Exception as err:
    print(f"Other error occurred: {err}")
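
For the multi-turn conversations listed under Key Features, the client keeps the message history and resends it with every request. A minimal sketch of that state management (the `add_turn` helper is illustrative, not part of any SDK; sending each request works exactly as in the Quick Start above):

```python
# The client owns the conversation history; each API call receives the
# full list so the model sees prior context.
conversation = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is a mixture-of-experts model?"},
]

def add_turn(history, role, content):
    """Append one message and return the updated history."""
    history.append({"role": role, "content": content})
    return history

# After each API call, record the assistant's reply so the next request
# carries the whole exchange.
add_turn(conversation, "assistant", "It routes each token to a few expert sub-networks.")
add_turn(conversation, "user", "How many experts does Maverick use?")
print(len(conversation))  # 4 messages now travel with the next request
```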

View the full API documentation for SDKs and code examples in Python, JavaScript, and more.

Pricing

Llama 4 Maverick Instruct (17Bx128E) API costs $0.56 per million tokens. Pay only for what you use, with no minimum commitments. View pricing plans
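
At this rate, estimating a bill is simple arithmetic: total tokens divided by one million, times the per-million price. For example:

```python
PRICE_PER_MILLION_TOKENS = 0.56  # USD, per the pricing above

def estimate_cost(total_tokens):
    """Estimate the charge for a given token count at pay-per-use rates."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# e.g. a workload of 2.5M tokens (prompt + completion combined)
print(f"${estimate_cost(2_500_000):.2f}")  # prints "$1.40"
```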

Use Cases

  • AI chatbots and virtual assistants
  • Code generation and developer tools
  • Content writing and copywriting automation
  • Data analysis, summarization, and extraction
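
For the extraction use case, one common pattern is to ask for JSON directly in the prompt. A hypothetical sketch (the sample document and field names are made up for illustration; the JSON mode feature listed above may also expose a dedicated request parameter, so check the API documentation):

```python
import json

# Sample input text for the extraction prompt (invented for illustration).
document = "Invoice #1042 from Acme Corp, due 2025-08-01, total $312.50."

extraction_prompt = (
    "Extract the invoice number, vendor, due date, and total from the text "
    "below. Respond with a single JSON object and nothing else.\n\n" + document
)

# This message slots into the "messages" list of a chat completions
# request, as shown in the Quick Start.
request_message = {"role": "user", "content": extraction_prompt}
print(json.dumps(request_message, indent=2))
```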

Llama 4 Maverick Instruct (17Bx128E) FAQ

What is Llama 4 Maverick Instruct (17Bx128E)?

Llama 4 Maverick Instruct (17Bx128E) is a multimodal mixture-of-experts (MoE) LLM with 17B active parameters (400B total) routed across 128 experts. It offers native image support, a 1M-token context window, and strong performance in coding, multilingual tasks, and enterprise document intelligence.

How do I integrate Llama 4 Maverick Instruct (17Bx128E) into my application?

You can integrate Llama 4 Maverick Instruct (17Bx128E) into your application with a single API call. Sign up on ModelsLab to get your API key, then use the model ID "meta-llama-Llama-4-Maverick-17B-128E-Instruct-FP8" in your API requests. We provide Python and JavaScript SDKs, plus cURL examples, in the API documentation.

How much does Llama 4 Maverick Instruct (17Bx128E) cost?

Llama 4 Maverick Instruct (17Bx128E) costs $0.56 per million tokens. ModelsLab uses pay-per-use pricing with no minimum commitments. A free tier is available to get started.

What is the model ID for Llama 4 Maverick Instruct (17Bx128E)?

The model ID for Llama 4 Maverick Instruct (17Bx128E) is "meta-llama-Llama-4-Maverick-17B-128E-Instruct-FP8". Use this ID in your API requests to specify this model.

Is there a free tier for Llama 4 Maverick Instruct (17Bx128E)?

Yes, ModelsLab offers a free tier that lets you try Llama 4 Maverick Instruct (17Bx128E) and other AI models. Sign up to get free API credits and start building immediately.