Why Add Voice to Your Mastra Agent?
Mastra is one of the fastest-growing TypeScript AI agent frameworks — 21,000 GitHub stars and climbing. Out of the box it handles memory, tools, and workflows. What it hasn't had until now: a first-class TTS (text-to-speech) integration with ModelsLab's voice API.
The new @mastra/voice-modelslab provider (PR #13627, merged March 2026) follows the same pattern as the ElevenLabs, Murf, and Deepgram integrations — drop-in, typed, and ready in under 10 minutes.
This tutorial walks you through adding real-time voice output to any Mastra agent using ModelsLab TTS.
Prerequisites
- Node.js 18+
- An existing Mastra project (or npx create-mastra@latest)
- A ModelsLab API key (free tier available)
Install the Provider
```bash
npm install @mastra/voice-modelslab
# or
pnpm add @mastra/voice-modelslab
```
Basic Setup: Add Voice to an Agent
Mastra's voice system is built on a common MastraVoice interface. The ModelsLab provider implements speak() and getSpeakers() — the same API as ElevenLabs and Deepgram, so switching providers requires zero agent rewrites.
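To make that "common interface" concrete, here is a minimal sketch of the shape involved. The real type lives in @mastra/core; the field names, the StubVoice class, and the synthesize helper below are illustrative assumptions, not actual exports.

```typescript
import { Readable } from "stream";

// Illustrative shape of the shared voice interface. The real MastraVoice
// type is exported from @mastra/core; names here are assumptions.
interface VoiceProvider {
  speak(text: string, options?: { speaker?: string }): Promise<Readable>;
  getSpeakers(): Promise<Array<{ voiceId: string; name: string }>>;
}

// A stub provider that "synthesizes" text into a byte stream, standing in
// for ModelsLabVoice / ElevenLabsVoice / DeepgramVoice.
class StubVoice implements VoiceProvider {
  async speak(text: string): Promise<Readable> {
    return Readable.from([Buffer.from(`AUDIO<${text}>`)]);
  }
  async getSpeakers() {
    return [{ voiceId: "stub-voice", name: "Stub" }];
  }
}

// Any code written against VoiceProvider works with any provider.
async function synthesize(voice: VoiceProvider, text: string): Promise<Buffer> {
  const stream = await voice.speak(text);
  const chunks: Buffer[] = [];
  for await (const chunk of stream) chunks.push(Buffer.from(chunk));
  return Buffer.concat(chunks);
}
```

Because synthesize only depends on the interface, swapping StubVoice for a real provider changes one constructor call and nothing else.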
```typescript
import { Agent } from "@mastra/core/agent";
import { ModelsLabVoice } from "@mastra/voice-modelslab";
import { anthropic } from "@ai-sdk/anthropic";

const voice = new ModelsLabVoice({
  apiKey: process.env.MODELSLAB_API_KEY!, // get at modelslab.com
  speaker: "en-US-AriaNeural", // optional: defaults to en-US-JennyNeural
});

const agent = new Agent({
  name: "VoiceAssistant",
  instructions: "You are a helpful assistant that responds concisely.",
  model: anthropic("claude-3-5-sonnet-20241022"),
  voice, // attach the voice provider here
});
```
Generating Speech from Agent Output
Once voice is attached, call agent.voice.speak() to convert any string into an audio stream:
```typescript
import { createWriteStream } from "fs";
import { pipeline } from "stream/promises";

// Generate a text response from the agent
const response = await agent.generate("Explain what a REST API is in 2 sentences.");
const text = response.text;

// Convert the response text to speech
const audioStream = await agent.voice?.speak(text);

if (audioStream) {
  // Save to file
  await pipeline(audioStream, createWriteStream("output.mp3"));
  console.log("Audio saved to output.mp3");
}
```
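If you want the audio in memory instead of on disk (say, to return it from an HTTP handler), collect the stream into a single Buffer. This sketch assumes speak() yields a Node.js Readable, as the file example implies; streamToBuffer and the stand-in stream are illustrative.

```typescript
import { Readable } from "stream";

// Collect a Readable into one Buffer, e.g. to send as an HTTP response body.
async function streamToBuffer(stream: Readable): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Demo with a stand-in stream; in the tutorial this would be the result of
// agent.voice?.speak(text).
const demo = Readable.from([Buffer.from("ID3"), Buffer.from("...audio bytes...")]);
const audio = await streamToBuffer(demo);
console.log(`buffered ${audio.length} bytes`);
```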
Real-Time Voice Workflows
The ModelsLab provider supports Mastra's streaming workflow architecture. Here's a pattern for an agent that reads its responses aloud during generation:
```typescript
import { Mastra } from "@mastra/core";

const mastra = new Mastra({
  agents: { voiceAgent: agent },
});

async function respondAloud(prompt: string) {
  const voiceAgent = mastra.getAgent("voiceAgent");

  // Generate the text response
  const result = await voiceAgent.generate(prompt);

  // Convert it to an audio stream
  const stream = await voiceAgent.voice?.speak(result.text);

  return {
    text: result.text,
    audioStream: stream,
  };
}
```
Listing Available Speakers
ModelsLab's TTS API supports dozens of neural voices. Fetch them dynamically:
```typescript
const speakers = await voice.getSpeakers();
console.log(speakers);
// [
//   { voiceId: "en-US-AriaNeural", name: "Aria (US English)" },
//   { voiceId: "en-US-GuyNeural", name: "Guy (US English)" },
//   { voiceId: "en-GB-SoniaNeural", name: "Sonia (British English)" },
//   ... and more
// ]
```
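Since the speaker list can be long, a small helper can select a voice by locale prefix. The sample entries mirror the output above; pickSpeaker is an illustrative helper, not part of the provider.

```typescript
type Speaker = { voiceId: string; name: string };

// Sample of the list returned by getSpeakers() (IDs from the output above).
const SPEAKERS: Speaker[] = [
  { voiceId: "en-US-AriaNeural", name: "Aria (US English)" },
  { voiceId: "en-US-GuyNeural", name: "Guy (US English)" },
  { voiceId: "en-GB-SoniaNeural", name: "Sonia (British English)" },
];

// Pick the first voice matching a locale prefix, e.g. "en-GB".
function pickSpeaker(speakers: Speaker[], locale: string): Speaker | undefined {
  return speakers.find((s) => s.voiceId.startsWith(locale));
}

console.log(pickSpeaker(SPEAKERS, "en-GB")?.voiceId); // "en-GB-SoniaNeural"
```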
Environment Variables
The provider reads your API key from MODELSLAB_API_KEY. Add this to your .env:
MODELSLAB_API_KEY=your_key_here
Get your key at modelslab.com/dashboard — the free tier includes enough credits to prototype your voice agent.
Why ModelsLab for Mastra TTS?
There are several TTS providers available in Mastra. Here's how ModelsLab compares:
- Price: ModelsLab's TTS API is among the most cost-effective for high-volume applications — pay-per-character, no monthly minimums.
- Latency: Designed for API-first workloads, with response times tuned for agentic systems (not just consumer UIs).
- Models: Access to multiple voice synthesis models from a single API key — same key you use for image, video, and LLM APIs.
- No lock-in: Mastra's common MastraVoice interface means you can swap providers without changing your agent code.
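One way to exploit the no-lock-in point is to treat the provider as configuration. The stub classes below only sketch the idea under a simplified interface; in a real project the same factory would construct ModelsLabVoice or another provider.

```typescript
// Provider choice as a config value. The interface and stub classes are
// illustrative stand-ins for the real provider packages.
type TTSConfig = { provider: "modelslab" | "elevenlabs"; apiKey: string };

interface TTS {
  speak(text: string): Promise<string>; // simplified surface for the sketch
}

class ModelsLabStub implements TTS {
  constructor(private apiKey: string) {}
  async speak(text: string) { return `modelslab:${text}`; }
}

class ElevenLabsStub implements TTS {
  constructor(private apiKey: string) {}
  async speak(text: string) { return `elevenlabs:${text}`; }
}

// Agent code depends only on TTS; the switch is the single place that
// knows which vendor is in play.
function makeVoice(cfg: TTSConfig): TTS {
  switch (cfg.provider) {
    case "modelslab": return new ModelsLabStub(cfg.apiKey);
    case "elevenlabs": return new ElevenLabsStub(cfg.apiKey);
  }
}
```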
Full Example: A Narrating Research Agent
Put it all together: an agent that researches a topic and narrates its findings:
```typescript
import { Agent } from "@mastra/core/agent";
import { ModelsLabVoice } from "@mastra/voice-modelslab";
import { createWriteStream } from "fs";
import { pipeline } from "stream/promises";
import { anthropic } from "@ai-sdk/anthropic";

const voice = new ModelsLabVoice({
  apiKey: process.env.MODELSLAB_API_KEY!,
});

const researcher = new Agent({
  name: "Researcher",
  instructions:
    "Research the given topic and summarize your findings in a short, spoken-friendly paragraph.",
  model: anthropic("claude-3-5-sonnet-20241022"),
  voice,
});

async function narrateTopic(topic: string) {
  // Generate the research summary
  const response = await researcher.generate(`Research and summarize: ${topic}`);
  console.log(response.text);

  // Narrate the findings
  const audioStream = await researcher.voice?.speak(response.text);
  if (audioStream) {
    await pipeline(audioStream, createWriteStream("narration.mp3"));
    console.log("Narration saved to narration.mp3");
  }
}

narrateTopic("how REST APIs work");
```
Troubleshooting
Audio stream is empty
Check that your MODELSLAB_API_KEY is valid and has TTS credits. Test it directly:
```bash
curl -X POST https://modelslab.com/api/v6/voice/text_to_audio \
  -H "Content-Type: application/json" \
  -d '{"key":"YOUR_KEY","prompt":"Hello from ModelsLab","language":"en","voice_id":"en-US-AriaNeural"}'
```
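The same check from Node, using the built-in fetch (Node 18+). buildTTSBody mirrors the curl payload above; the live request only fires when MODELSLAB_API_KEY is set, so the sketch runs without spending credits.

```typescript
// Build the request body matching the curl example above.
function buildTTSBody(key: string, prompt: string) {
  return JSON.stringify({
    key,
    prompt,
    language: "en",
    voice_id: "en-US-AriaNeural",
  });
}

async function testTTS() {
  const key = process.env.MODELSLAB_API_KEY;
  if (!key) {
    console.log("Set MODELSLAB_API_KEY to run the live check");
    return;
  }
  const res = await fetch("https://modelslab.com/api/v6/voice/text_to_audio", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildTTSBody(key, "Hello from ModelsLab"),
  });
  console.log(res.status, await res.text());
}

testTTS();
```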
TypeScript errors on agent.voice
Make sure you're on Mastra 0.10.0+ (when this provider ships). The voice property is typed as MastraVoice | undefined — use optional chaining (agent.voice?.speak()).
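The guard looks like this in practice. The types below are simplified stand-ins for the real Agent and MastraVoice types, kept minimal to show the optional-chaining pattern.

```typescript
// Simplified stand-ins: voice may be absent, as in the real typing.
type Voice = { speak(text: string): Promise<string> };
type AgentLike = { voice?: Voice };

async function trySpeak(agent: AgentLike, text: string): Promise<string | null> {
  // Optional chaining yields undefined instead of throwing when no voice
  // provider is attached; coalesce to null for a clear "no audio" signal.
  return (await agent.voice?.speak(text)) ?? null;
}
```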
Next Steps
Now that your Mastra agent can speak, explore what else ModelsLab's API can do:
- Text generation — pair TTS with uncensored LLM responses
- Image generation — build agents that both describe and generate visuals
- Full API docs — 200+ models, one API key
ModelsLab's API is built for exactly this kind of multi-modal agentic work. Get your free API key and start building.
