Why Add Voice to Your Mastra Agent?
Mastra is one of the fastest-growing TypeScript AI agent frameworks — 21,000 GitHub stars and climbing. Out of the box it handles memory, tools, and workflows. What it hasn't had until now: a first-class TTS (text-to-speech) integration with ModelsLab's voice API.
The new @mastra/voice-modelslab provider (PR #13627, merged March 2026) follows the same pattern as the ElevenLabs, Murf, and Deepgram integrations — drop-in, typed, and ready in under 10 minutes.
This tutorial walks you through adding real-time voice output to any Mastra agent using ModelsLab TTS.
Prerequisites
- Node.js 18+
- An existing Mastra project (or npx create-mastra@latest)
- A ModelsLab API key (free tier available)
Install the Provider
npm install @mastra/voice-modelslab
# or
pnpm add @mastra/voice-modelslab
Basic Setup: Add Voice to an Agent
Mastra's voice system is built on a common MastraVoice interface. The ModelsLab provider implements speak() and getSpeakers() — the same API as ElevenLabs and Deepgram, so switching providers requires zero agent rewrites.
import { Agent } from "@mastra/core/agent";
import { anthropic } from "@ai-sdk/anthropic";
import { ModelsLabVoice } from "@mastra/voice-modelslab";
const voice = new ModelsLabVoice({
apiKey: process.env.MODELSLAB_API_KEY!, // get at modelslab.com
speaker: "en-US-AriaNeural", // optional: defaults to en-US-JennyNeural
});
const agent = new Agent({
name: "VoiceAssistant",
instructions: "You are a helpful assistant that responds concisely.",
model: anthropic("claude-3-5-sonnet-20241022"),
voice, // attach the voice provider here
});
Generating Speech from Agent Output
Once voice is attached, call agent.voice.speak() to convert any string to an audio stream:
import { createWriteStream } from "fs";
import { pipeline } from "stream/promises";
// Generate a text response from the agent
const response = await agent.generate("Explain what a REST API is in 2 sentences.");
const text = response.text;
// Convert to speech
const audioStream = await agent.voice?.speak(text, {
speaker: "en-US-GuyNeural", // override per-call
});
if (audioStream) {
// Save to file
await pipeline(audioStream, createWriteStream("output.mp3"));
console.log("Audio saved to output.mp3");
}
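If you'd rather hold the audio in memory (say, to return it from an HTTP handler) than write a file, a small helper can collect the stream into a Buffer. This is plain Node with no Mastra dependency; streamToBuffer is a name chosen here, not part of the provider API:

```typescript
import { Readable } from "node:stream";

// Collect a readable audio stream into a single Buffer.
async function streamToBuffer(stream: Readable): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Usage with the stream returned by agent.voice?.speak():
// const audio = await streamToBuffer(audioStream);
// res.setHeader("Content-Type", "audio/mpeg");
// res.end(audio);
```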
Real-Time Voice Workflows
The ModelsLab provider supports Mastra's streaming workflow architecture. Here's a pattern for a workflow step that generates a response and returns both the text and its spoken audio:
import { Mastra } from "@mastra/core";
import { ModelsLabVoice } from "@mastra/voice-modelslab";
const mastra = new Mastra({
agents: { voiceAgent: agent },
});
// In a workflow step:
async function speakingStep({ context }) {
const result = await context.agents.voiceAgent.generate(
context.input.prompt
);
// Speak the result
const stream = await context.agents.voiceAgent.voice?.speak(result.text);
return {
text: result.text,
audioStream: stream,
};
}
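One caveat: the returned audio stream can be consumed only once. If a later step needs to both save the audio and forward it (say, over a websocket), duplicate it with two PassThrough streams first. This is a standard Node pattern, independent of Mastra:

```typescript
import { PassThrough, Readable } from "node:stream";

// Duplicate a one-shot readable so two consumers each receive the full data.
function teeStream(source: Readable): [PassThrough, PassThrough] {
  const forSaving = new PassThrough();
  const forForwarding = new PassThrough();
  source.pipe(forSaving);
  source.pipe(forForwarding);
  return [forSaving, forForwarding];
}

// const [toFile, toClient] = teeStream(stream);
// await Promise.all([
//   pipeline(toFile, createWriteStream("reply.mp3")),
//   sendToClient(toClient), // hypothetical consumer on your side
// ]);
```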
Listing Available Speakers
ModelsLab's TTS API supports dozens of neural voices. Fetch them dynamically:
const speakers = await voice.getSpeakers();
console.log(speakers);
// [
// { voiceId: "en-US-AriaNeural", name: "Aria (US English)" },
// { voiceId: "en-US-GuyNeural", name: "Guy (US English)" },
// { voiceId: "en-GB-SoniaNeural", name: "Sonia (British English)" },
// ... and more
// ]
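The returned array is easy to filter client-side, for example by locale prefix. A small helper (the voiceId shape here matches the sample output above; adjust if your account returns different IDs):

```typescript
interface Speaker {
  voiceId: string;
  name: string;
}

// Keep only voices whose ID starts with the given locale prefix, e.g. "en-GB".
function speakersForLocale(speakers: Speaker[], locale: string): Speaker[] {
  return speakers.filter((s) => s.voiceId.startsWith(locale));
}

// const british = speakersForLocale(await voice.getSpeakers(), "en-GB");
```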
Environment Variables
The provider reads your API key from MODELSLAB_API_KEY. Add this to your .env:
MODELSLAB_API_KEY=your_key_here
Get your key at modelslab.com/dashboard — the free tier includes enough credits to prototype your voice agent.
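Since a missing key typically surfaces later as an opaque API error, it's worth failing fast at startup. A minimal guard (plain Node; requireEnv is a name of our own, not part of Mastra):

```typescript
// Read a required environment variable or fail fast with a clear message.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// const apiKey = requireEnv("MODELSLAB_API_KEY");
```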
Why ModelsLab for Mastra TTS?
There are several TTS providers available in Mastra. Here's how ModelsLab compares:
- Price: ModelsLab's TTS API is among the most cost-effective for high-volume applications — pay-per-character, no monthly minimums.
- Latency: Designed for API-first workloads, with response times tuned for agentic systems (not just consumer UIs).
- Models: Access to multiple voice synthesis models from a single API key — same key you use for image, video, and LLM APIs.
- No lock-in: Mastra's common MastraVoice interface means you can swap providers without changing your agent code.
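To make the no-lock-in point concrete, here is a deliberately simplified sketch of what a common voice interface buys you. The real MastraVoice interface has more surface area, and both classes below are stand-ins for illustration, but the shape of the swap is the same:

```typescript
// Simplified stand-in for Mastra's common voice interface.
interface VoiceProvider {
  speak(text: string): Promise<string>; // real providers return an audio stream
}

class ModelsLabVoiceStub implements VoiceProvider {
  async speak(text: string) {
    return `modelslab-audio:${text}`;
  }
}

class SomeOtherVoiceStub implements VoiceProvider {
  async speak(text: string) {
    return `other-audio:${text}`;
  }
}

// Agent code depends only on the interface, so swapping providers
// is a one-line change at construction time.
async function narrate(voice: VoiceProvider, text: string) {
  return voice.speak(text);
}
```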
Full Example: A Narrating Research Agent
Put it all together: an agent that researches a topic and narrates its findings:
import { Agent } from "@mastra/core/agent";
import { ModelsLabVoice } from "@mastra/voice-modelslab";
import { createWriteStream } from "fs";
import { pipeline } from "stream/promises";
import { anthropic } from "@ai-sdk/anthropic";
const researchAgent = new Agent({
name: "NarratingResearcher",
instructions: `You are a research assistant. When given a topic, provide a
concise 3-sentence summary suitable for text-to-speech narration.
Avoid markdown, bullet points, or special characters.`,
model: anthropic("claude-3-5-sonnet-20241022"),
voice: new ModelsLabVoice({
apiKey: process.env.MODELSLAB_API_KEY!,
speaker: "en-US-AriaNeural",
}),
});
async function narrateTopic(topic: string) {
console.log(`Researching: ${topic}`);
const result = await researchAgent.generate(
`Give me a 3-sentence summary of: ${topic}`
);
console.log(`Text: ${result.text}`);
const audioStream = await researchAgent.voice?.speak(result.text);
if (audioStream) {
const filename = `${topic.replace(/\s+/g, "-")}.mp3`;
await pipeline(audioStream, createWriteStream(filename));
console.log(`Narration saved: ${filename}`);
}
}
narrateTopic("how REST APIs work");
Troubleshooting
Audio stream is empty
Check that your MODELSLAB_API_KEY is valid and has TTS credits. Test it directly:
curl -X POST https://modelslab.com/api/v6/voice/text_to_audio \
-H "Content-Type: application/json" \
-d '{"key":"YOUR_KEY","prompt":"Hello from ModelsLab","language":"en","voice_id":"en-US-AriaNeural"}'
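The same smoke test works from Node (18+ ships fetch built in). The endpoint and request fields below mirror the curl command above; keeping the request construction in a helper makes the shape easy to check before you spend credits:

```typescript
// Build the TTS request that mirrors the curl smoke test above.
function buildTtsRequest(key: string, prompt: string) {
  return {
    url: "https://modelslab.com/api/v6/voice/text_to_audio",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        key,
        prompt,
        language: "en",
        voice_id: "en-US-AriaNeural",
      }),
    },
  };
}

// const { url, init } = buildTtsRequest(process.env.MODELSLAB_API_KEY!, "Hello from ModelsLab");
// const res = await fetch(url, init);
// console.log(await res.json());
```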
TypeScript errors on agent.voice
Make sure you're on Mastra 0.10.0+ (when this provider ships). The voice property is typed as MastraVoice | undefined — use optional chaining (agent.voice?.speak()).
Next Steps
Now that your Mastra agent can speak, explore what else ModelsLab's API can do:
- Text generation — pair TTS with uncensored LLM responses
- Image generation — build agents that both describe and generate visuals
- Full API docs — 200+ models, one API key
ModelsLab's API is built for exactly this kind of multi-modal agentic work. Get your free API key and start building.
