Qwen: Qwen3.5 397B A17B
Frontier MoE Vision LLM
Activate 397B Intelligence Efficiently
Sparse MoE
17B Active Params
397B total parameters with only 17B activated per forward pass via 512-expert MoE routing.
Native Multimodal
Vision-Language Fusion
Handles text, images, and video with early-fusion training across 201 languages.
Ultra-Efficient
8.6x Faster Decoding
Gated Delta Networks deliver 8.6x-19x faster decoding than Qwen3-Max at 262k context.
Examples
See what Qwen: Qwen3.5 397B A17B can create
Copy any prompt below and try it yourself in the playground.
Code Review
“Review this Python function for efficiency and suggest optimizations: def fibonacci(n): if n <= 1: return n else: return fibonacci(n-1) + fibonacci(n-2)”
Tech Summary
“Summarize key specs of Qwen3.5-397B-A17B architecture, including parameter count, context length, and multimodal capabilities.”
Agent Plan
“Plan steps to deploy a vLLM server for Qwen: Qwen3.5 397B A17B model with tensor-parallel-size 8 and 262k max length.”
Reasoning Chain
“Solve: A train leaves at 60 mph, another at 70 mph from stations 200 miles apart. When do they meet? Use step-by-step reasoning.”
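For reference, the expected answer to the last prompt can be checked in a couple of lines, assuming the trains depart simultaneously and travel toward each other:

```python
# Sanity check for the sample reasoning prompt, assuming simultaneous
# departure with the trains moving toward each other.
distance = 200           # miles between the stations
closing_speed = 60 + 70  # mph; speeds add when closing head-on
hours = distance / closing_speed
print(round(hours, 3))       # ≈ 1.538 hours
print(round(hours * 60, 1))  # ≈ 92.3 minutes
```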
For Developers
A few lines of code.
Inference. Three lines.
ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.
- Serverless: scales to zero, scales to millions
- Pay per token, no minimums
- Python and JavaScript SDKs, plus REST API
```python
import requests

response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",
        "prompt": "",
        "model_id": "",
    },
)
print(response.json())
```
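The page also mentions a JavaScript SDK and a REST API. A minimal sketch of the same request with `fetch` (Node 18+) might look like this; the endpoint and JSON fields come from the Python sample, while the `chat` wrapper name is illustrative:

```javascript
// Hedged sketch: POST the same JSON payload to the REST endpoint via fetch.
// Replace YOUR_API_KEY with your actual key; `chat` is an illustrative name.
async function chat(prompt, modelId) {
  const response = await fetch(
    "https://modelslab.com/api/v7/llm/chat/completions",
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ key: "YOUR_API_KEY", prompt, model_id: modelId }),
    }
  );
  return response.json(); // parsed API response
}
```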
Ready to create?
Start generating with Qwen: Qwen3.5 397B A17B on ModelsLab.