Arcee AI: Trinity Mini
Efficient MoE Reasoning
Run Agents Seamlessly
Sparse MoE
3B Active Params
A 26B-parameter model that activates only 3B parameters per token across 128 experts, enabling low-latency inference.
Long Context
131K Token Window
A 131K-token window handles long documents and multi-turn conversations while keeping responses grounded in the provided input.
Tool Calling
Reliable Function Use
Produces schema-conformant JSON function calls and recovers gracefully from failed tool invocations in the Arcee AI: Trinity Mini API.
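As a sketch of what schema-conformant tool use looks like, the snippet below defines a tool and a matching function call. The `get_weather` tool, its parameter schema, and the OpenAI-style `tools` layout are illustrative assumptions, not the documented ModelsLab payload.

```python
import json

# Hypothetical tool definition in the common OpenAI-style "tools" format
# (illustrative only; consult the API docs for the exact payload shape).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # assumed example tool, not a real endpoint
        "description": "Fetch current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# A schema-conformant function call the model would be expected to emit.
call = {"name": "get_weather", "arguments": json.dumps({"city": "NYC"})}

# The arguments round-trip cleanly as valid JSON.
args = json.loads(call["arguments"])
print(args["city"])
```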
Examples
See what Arcee AI: Trinity Mini can create
Copy any prompt below and try it yourself in the playground.
Code Review
“Review this Python function for bugs and suggest optimizations: def fibonacci(n): if n <= 1: return n else: return fibonacci(n-1) + fibonacci(n-2)”
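The recursive function in that prompt runs in exponential time; a typical optimization a reviewer would suggest is memoization, which collapses it to linear time. A minimal sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n: int) -> int:
    # Caching collapses the exponential call tree to O(n) distinct calls.
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(30))  # → 832040
```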
JSON Schema
“Generate valid JSON for user profile schema with name, email, age over 18, and preferences array.”
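Output for that prompt can be checked with a few lines of standard-library Python. The field names mirror the prompt; the values and the inline checks are a minimal sketch, not a formal JSON Schema validator.

```python
import json

# Example profile matching the prompt's schema (illustrative values).
profile = {
    "name": "Ada Lovelace",
    "email": "ada@example.com",
    "age": 28,
    "preferences": ["dark_mode", "email_digest"],
}

# Minimal checks standing in for full JSON Schema validation.
assert isinstance(profile["name"], str)
assert "@" in profile["email"]
assert profile["age"] > 18
assert isinstance(profile["preferences"], list)

print(json.dumps(profile, indent=2))
```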
Agent Workflow
“Plan multi-step task: fetch weather API for NYC, compare to Tokyo, output summary in table format.”
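The multi-step plan in that prompt could be driven by a small loop like this sketch; the `fetch_weather` stub returns placeholder temperatures rather than calling a real weather API.

```python
# Sketch of the plan: fetch both cities, compare, summarize as a table.
def fetch_weather(city: str) -> float:
    # Stub standing in for a real weather API call (placeholder data).
    return {"NYC": 18.0, "Tokyo": 22.0}[city]

temps = {city: fetch_weather(city) for city in ("NYC", "Tokyo")}
warmer = max(temps, key=temps.get)

rows = [f"| {city:<6} | {temp:>6.1f} C |" for city, temp in temps.items()]
table = "\n".join(["| City   | Temp     |", "|--------|----------|"] + rows)
print(table)
print(f"Warmer city: {warmer}")
```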
Document Summary
“Summarize key points from this 10K token RAG document on quantum computing advancements.”
For Developers
A few lines of code.
Reasoning in a few lines.
ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.
- Serverless: scales to zero, scales to millions
- Pay per token, no minimums
- Python and JavaScript SDKs, plus REST API
import requests

response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",
        "prompt": "",
        "model_id": "",
    },
)
print(response.json())
Ready to create?
Start generating with Arcee AI: Trinity Mini on ModelsLab.