Available now on ModelsLab · Language Model

Qwen: Qwen3.5-122B-A10B

122B MoE Power

Run Qwen3.5-122B-A10B Now

MoE Efficiency

10B Active Parameters

122B total parameters, with 10B activated per token through 256 sparse experts, for high capability at lower compute cost.
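
To make "10B active of 122B total" concrete, here is a minimal, purely illustrative sketch of top-k expert routing in Python. The 256-expert count comes from the description above; the top-k value, hidden size, and everything else are toy assumptions, not Qwen's actual implementation.

import torch

# Toy mixture-of-experts layer: a router scores every expert per token,
# but only the top-k experts actually run, so most weights stay idle.
NUM_EXPERTS = 256   # matches the expert count described above
TOP_K = 8           # assumed number of active experts per token (illustrative)
HIDDEN = 256        # toy hidden size, far smaller than the real model

router = torch.nn.Linear(HIDDEN, NUM_EXPERTS)
experts = torch.nn.ModuleList(
    torch.nn.Linear(HIDDEN, HIDDEN) for _ in range(NUM_EXPERTS)
)

def moe_forward(x):
    # x: (tokens, HIDDEN). Route each token to TOP_K experts and mix outputs.
    weights, idx = router(x).topk(TOP_K, dim=-1)
    weights = weights.softmax(dim=-1)
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):
        for w, e in zip(weights[t], idx[t]):
            out[t] += w * experts[e](x[t])  # only TOP_K experts run per token
    return out

print(moe_forward(torch.randn(4, HIDDEN)).shape)  # torch.Size([4, 256])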

Multimodal Native

Text Image Video

Processes text, images, and video within a 262K-token context, extensible to 1M tokens for agent workflows.
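
As a rough sketch of what a multimodal request can look like, here is an OpenAI-style chat payload in Python. The field names, model identifier, and image URL are assumptions for illustration only; check the ModelsLab docs for the exact schema.

# Hypothetical multimodal chat payload in an OpenAI-style format.
# All field names and the model identifier below are assumptions.
payload = {
    "model": "qwen3.5-122b-a10b",  # placeholder identifier
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart and extract the key figures."},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    "max_tokens": 1024,
}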

Top Benchmarks

Beats GPT-5 Mini

86.6% on GPQA Diamond and 72% on function calling lead open models across reasoning, coding, and vision tasks.

Examples

See what Qwen: Qwen3.5-122B-A10B can create

Copy any prompt below and try it yourself in the playground.

Code Review

Analyze this Python function for bugs and optimize for speed: def fibonacci(n): if n <= 1: return n else: return fibonacci(n-1) + fibonacci(n-2). Suggest improvements with code.
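
For reference, the kind of fix the model should propose: the recursive version recomputes subproblems exponentially, while an iterative rewrite runs in linear time and constant space.

def fibonacci(n):
    # Iterative rewrite: O(n) time, O(1) space, no redundant recursion.
    if n <= 1:
        return n
    prev, curr = 0, 1
    for _ in range(n - 1):
        prev, curr = curr, prev + curr
    return curr

assert fibonacci(10) == 55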

Math Proof

Prove that for any integer n > 1, there exists a prime p such that p divides n! + 1. Provide step-by-step reasoning.
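
For reference, the standard Euclid-style argument the prompt is after, sketched in LaTeX:

Let $n > 1$ and set $N = n! + 1$. Since $N > 1$, it has at least one prime
divisor $p$. For every prime $q \le n$ we have $q \mid n!$, so $q \nmid N$
(dividing $N$ by $q$ leaves remainder $1$). Hence $p$ does not divide $n!$,
which forces $p > n$. In particular, a prime dividing $n! + 1$ exists.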

Agent Plan

Plan steps to deploy a web app: requirements include React frontend, Node backend, PostgreSQL DB on AWS. Output YAML workflow.

Data Analysis

Given sales data: Q1: 1200, Q2: 1500, Q3: 1100, Q4: 1800. Forecast Q5 using linear regression and plot trend.
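
To sanity-check the model's answer, here is a short NumPy version of the same ordinary least-squares fit. On this data the fitted line is sales = 140 * quarter + 1050, so the Q5 forecast is 1750.

import numpy as np

quarters = np.array([1, 2, 3, 4])
sales = np.array([1200, 1500, 1100, 1800])

# Fit a first-degree polynomial (ordinary least squares) and extrapolate to Q5.
slope, intercept = np.polyfit(quarters, sales, 1)
print(slope, intercept, slope * 5 + intercept)  # ~140.0, ~1050.0, ~1750.0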

For Developers

A few lines of code.
MoE reasoning. One call.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
import requests

# Call the ModelsLab chat completions endpoint
response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "",           # the prompt to send
        "model_id": ""          # the model to run
    },
)
print(response.json())

FAQ

Common questions about Qwen: Qwen3.5-122B-A10B

Read the docs

What is Qwen3.5-122B-A10B?

Qwen: Qwen3.5-122B-A10B is Alibaba's 122B-parameter MoE LLM with 10B active parameters. It supports multimodal inputs and excels at reasoning, coding, and agent tasks. Released February 2026 under Apache 2.0.

How do you access it?

Access it via standard LLM endpoints with text, image, and video inputs, with context up to 262K tokens. Providers include OpenRouter and NVIDIA NIM.

How does it compare to other models?

It outperforms GPT-5 mini on GPQA and function calling, and is a strong alternative to Claude Sonnet for vision and agent tasks. Within the Qwen lineup it is second only to Qwen3.5-397B.

What are the key specs?

262K context, 65K max output, bf16/FP8 precision. Throughput is around 159 tokens/sec with 760ms time to first token (TTFT). It needs 244GB VRAM at bf16, or roughly 70GB quantized.

How does it benchmark?

It leads open models with 86.6% on GPQA, 92% on OCRBench, and 70% on ScreenSpot. Its hybrid DeltaNet MoE architecture supports deep reasoning and multimodal fusion.

What hardware does it need?

BF16 requires 244GB of VRAM (3-4x A100). A Q4 quant fits in 73-80GB. Context scales to 1M tokens with YaRN.
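
The memory figures follow from simple arithmetic: about 2 bytes per parameter at BF16 and roughly 0.5 bytes at 4-bit, before KV cache and runtime overhead. A quick back-of-the-envelope check (the byte-per-parameter figures are approximations):

params = 122e9                 # total parameters

bf16_gb = params * 2 / 1e9     # 2 bytes per parameter   -> ~244 GB
q4_gb = params * 0.5 / 1e9     # ~0.5 bytes per parameter -> ~61 GB before overhead
print(round(bf16_gb), round(q4_gb))  # 244 61 (KV cache and activations push Q4 to 73-80GB)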

Ready to create?

Start generating with Qwen: Qwen3.5-122B-A10B on ModelsLab.