---
title: Nemotron 3 Super 120B — Agentic Reasoning | ModelsLab
description: Access Nvidia Nemotron 3 Super 120B A12b Bf16 via API for efficient agentic reasoning with 1M context and 2x throughput. Deploy now for coding and planning.
url: https://modelslab.com/nvidia-nemotron-3-super-120b-a12b-bf16
canonical: https://modelslab.com/nvidia-nemotron-3-super-120b-a12b-bf16
type: website
component: Seo/ModelPage
generated_at: 2026-04-19T21:50:30.454369Z
---

Available now on ModelsLab · Language Model

Nvidia Nemotron 3 Super 120B A12b Bf16
Agentic Reasoning Supercharged
---

[Try Nvidia Nemotron 3 Super 120B A12b Bf16](/models/together_ai/nvidia-NVIDIA-Nemotron-3-Super-120B-A12B-BF16) [API Documentation](https://docs.modelslab.com)

Scale Intelligence Efficiently
---

Hybrid Architecture

### Mamba-Transformer MoE

Activates 12B of its 120B parameters, delivering 2.2x the throughput of GPT-OSS-120B on B200 GPUs.

Long Context

### 1M Token Window

Handles extended sequences for multi-step planning and cross-document reasoning.

Optimized Precision

### NVFP4 to Bf16

Pretrained in NVFP4 and post-trained in Bf16 for 4x faster inference on Blackwell.

Examples

See what Nvidia Nemotron 3 Super 120B A12b Bf16 can create
---

Copy any prompt below and try it yourself in the [playground](/models/together_ai/nvidia-NVIDIA-Nemotron-3-Super-120B-A12B-BF16).

Code Generation

“Write a Python function to parse JSON logs, extract error rates, and generate a summary report with visualizations using matplotlib. Include error handling and support for large files.”

Task Planning

“Plan a multi-step cybersecurity triage workflow: analyze network logs, identify anomalies, prioritize threats, and recommend mitigation steps with tool calls.”

Math Reasoning

“Solve this AIME-level problem: Find the number of integer solutions to x^2 + y^2 + z^2 = 2025 where x, y, z are positive integers up to 50. Explain each step.”

Agent Workflow

“Design an autonomous agent script for software development: generate unit tests, run them via subprocess, fix failures iteratively, and output refactored code.”

For Developers

A few lines of code.
Reasoning. One API call.
---

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

- **Serverless:** scales to zero, scales to millions
- **Pay per token,** no minimums
- **Python and JavaScript SDKs,** plus REST API

[API Documentation](https://docs.modelslab.com)

Python

```
import requests

# Fill in your API key, prompt, and model ID before sending.
response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",
        "prompt": "",
        "model_id": "",
    },
)
print(response.json())
```
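Assuming the request body shown above (`key`, `prompt`, `model_id`) is complete, here is a minimal sketch that adds a timeout and HTTP error handling around the same call; `build_chat_payload` and `chat` are hypothetical helper names for illustration, not part of any ModelsLab SDK:

```python
import requests

API_URL = "https://modelslab.com/api/v7/llm/chat/completions"


def build_chat_payload(key: str, prompt: str, model_id: str) -> dict:
    """Assemble the request body used in the example above.

    Hypothetical helper for illustration only; not a ModelsLab SDK function.
    """
    return {"key": key, "prompt": prompt, "model_id": model_id}


def chat(key: str, prompt: str, model_id: str, timeout: float = 60.0) -> dict:
    """POST the payload and return the parsed JSON, raising on HTTP errors."""
    response = requests.post(
        API_URL,
        json=build_chat_payload(key, prompt, model_id),
        timeout=timeout,  # avoid hanging indefinitely on a stalled request
    )
    response.raise_for_status()  # surface 4xx/5xx instead of printing an error body
    return response.json()
```

Call it as `chat("YOUR_API_KEY", "...", "...")` with your own prompt and the model ID from the API documentation; the return value is the parsed response body.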

FAQ

Common questions about Nvidia Nemotron 3 Super 120B A12b Bf16
---

[Read the docs](https://docs.modelslab.com)

### What is Nvidia Nemotron 3 Super 120B A12b Bf16?

A hybrid Mamba-Transformer MoE model with 120B total and 12B active parameters. Pretrained on 25T tokens in NVFP4, then post-trained in Bf16. Excels at agentic tasks like coding and planning.

### How does the Nvidia Nemotron 3 Super 120B A12b Bf16 API perform?

Delivers 2.2x the throughput of GPT-OSS-120B and 7.5x that of Qwen3.5-122B at 8k input/64k output. Supports 1M context on B200 GPUs with vLLM and TRT-LLM.

### Is the Nvidia Nemotron 3 Super 120B A12b Bf16 model open?

Fully open under the NVIDIA Open License, with weights, datasets, and training recipes. Customize it for secure deployment from workstation to cloud.

### What makes the Nvidia Nemotron 3 Super 120B A12b Bf16 API efficient?

Latent MoE routing activates 4x as many experts at the same compute cost, and multi-token prediction speeds up generation. The hybrid backbone cuts memory use by 4x on Blackwell versus H100 FP8.

### What is Nvidia Nemotron 3 Super 120B A12b Bf16 an alternative to?

Outperforms GPT-OSS-120B in intelligence and throughput. Beats Qwen3.5-122B on efficiency despite similar intelligence. Leads open models on PinchBench at 85.6%.

### Where can I access the Nvidia Nemotron 3 Super 120B A12b Bf16 model?

Available via the ModelsLab LLM endpoint. Use the Bf16 weights for full post-training accuracy; they also integrate with NVIDIA NeMo for RL fine-tuning.

Ready to create?
---

Start generating with Nvidia Nemotron 3 Super 120B A12b Bf16 on ModelsLab.

[Try Nvidia Nemotron 3 Super 120B A12b Bf16](/models/together_ai/nvidia-NVIDIA-Nemotron-3-Super-120B-A12B-BF16) [API Documentation](https://docs.modelslab.com)

---

*This markdown version is optimized for AI agents and LLMs.*

**Links:**
- [Website](https://modelslab.com)
- [API Documentation](https://docs.modelslab.com)
- [Blog](https://modelslab.com/blog)

---
*Generated by ModelsLab - 2026-04-20*