Available now on ModelsLab · Language Model

Qwen: Qwen3.5 397B A17B
Frontier MoE Vision LLM

Try Qwen: Qwen3.5 397B A17B API Documentation

Activate 397B Intelligence Efficiently

Sparse MoE

17B Active Params

397B total parameters activate 17B per pass via 512-expert MoE routing.

Native Multimodal

Vision-Language Fusion

Handles text, images, videos with early fusion training across 201 languages.

Ultra-Efficient

8.6x Faster Decoding

Gated Delta Networks deliver 8.6x-19x speed over Qwen3-Max at 262k context.

Examples

See what Qwen: Qwen3.5 397B A17B can create

Copy any prompt below and try it yourself in the playground.

Code Review

“Review this Python function for efficiency and suggest optimizations: def fibonacci(n): if n <= 1: return n else: return fibonacci(n-1) + fibonacci(n-2)”

Tech Summary

“Summarize key specs of Qwen3.5-397B-A17B architecture, including parameter count, context length, and multimodal capabilities.”

Agent Plan

“Plan steps to deploy a vLLM server for Qwen: Qwen3.5 397B A17B model with tensor-parallel-size 8 and 262k max length.”

Reasoning Chain

“Solve: A train leaves at 60 mph, another at 70 mph from stations 200 miles apart. When do they meet? Use step-by-step reasoning.”

For Developers

A few lines of code.
Inference. Three lines.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

Serverless: scales to zero, scales to millions
Pay per token, no minimums
Python and JavaScript SDKs, plus REST API

API Documentation

import requests

response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
  "key": "YOUR_API_KEY",
  "prompt": "",
  "model_id": ""
}
)
print(response.json())

FAQ

Common questions about Qwen: Qwen3.5 397B A17B

Read the docs

Open-weight vision-language LLM with 397B total params, 17B active via sparse MoE. Supports 262k native context, extensible to 1M. Competes with GPT-5.2 on reasoning, coding, agents.

Call via standard LLM endpoints with multimodal inputs. Use vLLM: vllm serve Qwen/Qwen3.5-397B-A17B --tensor-parallel-size 8. Enables tool use and 1M context in hosted versions.

Hybrid Gated DeltaNet + MoE activates 17B params per pass. Achieves 8.6x faster decoding than Qwen3-Max. Lower compute than 1T peers while ranking #3 open-weights.

Yes, native image/video input via early fusion. First Qwen open model unifying text and vision. Outperforms prior Qwen3-VL on visual benchmarks.

Matches frontier like Claude 4.5, Gemini-3 Pro on intelligence index. Open-weights under Apache 2.0 with 201 languages. Smaller active params than Kimi K2.5 or GLM-5.

262k tokens native, up to 1M extensible via YaRN. Supports reasoning/non-reasoning modes in one model.

Ready to create?

Start generating with Qwen: Qwen3.5 397B A17B on ModelsLab.

Try Qwen: Qwen3.5 397B A17B API Documentation

Qwen: Qwen3.5 397B A17BFrontier MoE Vision LLM

Activate 397B Intelligence Efficiently

17B Active Params

Vision-Language Fusion

8.6x Faster Decoding

See what Qwen: Qwen3.5 397B A17B can create

A few lines of code.Inference. Three lines.

Common questions about Qwen: Qwen3.5 397B A17B

What is Qwen: Qwen3.5 397B A17B model?

How to use Qwen: Qwen3.5 397B A17B API?

What makes Qwen: Qwen3.5 397B A17B efficient?

Does Qwen qwen3 5 397b a17b api support vision?

Is Qwen: Qwen3.5 397B A17B alternative to closed models?

Qwen qwen3 5 397b a17b model context length?

Ready to create?

Qwen: Qwen3.5 397B A17B
Frontier MoE Vision LLM

A few lines of code.
Inference. Three lines.