Available now on ModelsLab · Language Model

gpt-oss-20b · Open Reasoning

Deploy the gpt-oss-20b model

MoE Efficiency

21B Total · 3.6B Active

Activates 3.6B of its 21B total parameters per token via top-4 routing across 32 experts.
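The routing idea is easy to sketch: score each token against every expert, keep only the four best-scoring experts, and mix their outputs with softmax weights. The toy NumPy version below is illustrative only, with made-up dimensions and linear "experts"; it is not the model's actual implementation.

```python
import numpy as np

def moe_forward(x, router_w, experts, k=4):
    """Toy top-k Mixture-of-Experts layer.

    x: (d,) token activation; router_w: (d, n) router weights;
    experts: list of n callables mapping (d,) -> (d,).
    """
    logits = x @ router_w                  # (n,) one routing score per expert
    topk = np.argsort(logits)[-k:]         # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()               # softmax over the selected k only
    # Only the k chosen experts run, so the parameters touched per token
    # are a small fraction of the total -- the source of MoE efficiency.
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n = 8, 32                               # toy sizes, not the real model's
experts = [lambda x, W=rng.normal(size=(d, d)): W @ x for _ in range(n)]
out = moe_forward(rng.normal(size=d), rng.normal(size=(d, n)), experts)
```

With top-4 routing over 32 experts, only 4/32 of the expert weights participate in each token's forward pass, which is how a 21B-parameter model can activate only ~3.6B parameters per token.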

Agentic Workflows

Native Tool Calling

Supports function calling and external tools for multi-step reasoning tasks.
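In a tool-calling loop, the model emits a structured call (a tool name plus JSON arguments), your code executes it, and the result is fed back for the next reasoning step. A minimal dispatch sketch, where the tool name, arguments, and call format are all hypothetical examples rather than part of the ModelsLab API:

```python
import json

# Hypothetical tool registry -- names and signatures are illustrative.
TOOLS = {
    "get_weather": lambda city: {"city": city, "temp_c": 21},
}

def run_tool_call(raw):
    """Execute one model-emitted call of the form
    {"name": ..., "arguments": {...}} and return a JSON result string."""
    call = json.loads(raw)
    result = TOOLS[call["name"]](**call["arguments"])
    return json.dumps(result)  # fed back to the model as the tool's output

print(run_tool_call('{"name": "get_weather", "arguments": {"city": "Paris"}}'))
```

The same loop repeats for multi-step tasks: the model plans, calls a tool, reads the result, and decides the next step.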

Configurable Depth

Low · Mid · High Reasoning

Adjust reasoning effort in prompts for speed-accuracy tradeoffs.
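In practice this is just a directive appended to the prompt, as the example prompts below do with "Reasoning: high." A small convenience helper (the helper itself is a sketch, not part of any SDK):

```python
EFFORT_LEVELS = ("low", "medium", "high")

def with_effort(prompt, effort="medium"):
    """Append a reasoning-effort directive to a prompt, in the same
    style as the example prompts on this page ("Reasoning: high.")."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}")
    return f"{prompt.rstrip()} Reasoning: {effort}."

print(with_effort("Prove the Pythagorean theorem.", "high"))
```

Lower effort trades some accuracy on hard problems for faster, cheaper responses; higher effort does the reverse.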

Examples

See what gpt-oss-20b can create

Copy any prompt below and try it yourself in the playground.

Code Analysis

Analyze this Python function for bugs and suggest optimizations: def fibonacci(n): if n <= 1: return n else: return fibonacci(n-1) + fibonacci(n-2). Reasoning: high.

Scientific Summary

Summarize quantum entanglement basics and implications for computing. Use structured output with key facts, equations, and applications. Reasoning: medium.

Tool Chain

Plan steps to fetch weather data via API, analyze trends, and plot results. Call tools as needed. Reasoning: high.

Math Proof

Prove Pythagorean theorem using similar triangles. Output chain-of-thought steps and diagram description. Reasoning: high.

For Developers

A few lines of code.
gpt-oss-20b. One API call.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
import requests

response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "",           # your prompt text
        "model_id": "",         # e.g. "gpt-oss-20b"
    },
)
print(response.json())

FAQ

Common questions about gpt-oss-20b

Read the docs

What is gpt-oss-20b?
gpt-oss-20b is OpenAI's 21B-parameter MoE model with 3.6B active parameters per token. It runs on 16GB of VRAM for low-latency reasoning and matches o3-mini on benchmarks.

How does it work?
It uses a Mixture-of-Experts architecture with 32 experts and top-4 routing, supports a 128K context window, and is optimized for agentic tasks and structured outputs.

What is it best suited for?
Local inference, edge devices, and specialized use cases. It includes native tool use and configurable reasoning levels.

Is it open source?
Yes, it is open-weight under the Apache 2.0 license, and it outperforms similar open models on reasoning while using less compute.

What hardware does it need?
Roughly 16GB of GPU VRAM. It delivers high throughput on a single H100 or on consumer hardware.

Does it support tools?
Yes, it supports built-in and user-defined tools for agentic workflows and handles multi-turn interactions reliably.

Ready to create?

Start generating with gpt-oss-20b on ModelsLab.