How to Add Real-Time Voice to Your Mastra Agent with ModelsLab TTS

Tutorials · 5 min read


Why Add Voice to Your Mastra Agent?

Mastra is one of the fastest-growing TypeScript AI agent frameworks — 21,000 GitHub stars and climbing. Out of the box it handles memory, tools, and workflows. What it hasn't had until now: a first-class TTS (text-to-speech) integration with ModelsLab's voice API.

The new @mastra/voice-modelslab provider (PR #13627, merged March 2026) follows the same pattern as the ElevenLabs, Murf, and Deepgram integrations — drop-in, typed, and ready in under 10 minutes.

This tutorial walks you through adding real-time voice output to any Mastra agent using ModelsLab TTS.

Prerequisites

  • Node.js 18+
  • An existing Mastra project (or npx create-mastra@latest)
  • A ModelsLab API key (free tier available)

Install the Provider

npm install @mastra/voice-modelslab
# or
pnpm add @mastra/voice-modelslab

Basic Setup: Add Voice to an Agent

Mastra's voice system is built on a common MastraVoice interface. The ModelsLab provider implements speak() and getSpeakers() — the same API as ElevenLabs and Deepgram, so switching providers requires zero agent rewrites.

import { Agent } from "@mastra/core/agent";
import { anthropic } from "@ai-sdk/anthropic";
import { ModelsLabVoice } from "@mastra/voice-modelslab";

const voice = new ModelsLabVoice({
  apiKey: process.env.MODELSLAB_API_KEY!, // get at modelslab.com
  speaker: "en-US-AriaNeural", // optional: defaults to en-US-JennyNeural
});

const agent = new Agent({
  name: "VoiceAssistant",
  instructions: "You are a helpful assistant that responds concisely.",
  model: anthropic("claude-3-5-sonnet-20241022"),
  voice, // attach the voice provider here
});

Generating Speech from Agent Output

Once voice is attached, call agent.voice.speak() to convert any string to an audio stream:

import { createWriteStream } from "fs";
import { pipeline } from "stream/promises";

// Generate a text response from the agent
const response = await agent.generate("Explain what a REST API is in 2 sentences.");
const text = response.text;

// Convert to speech
const audioStream = await agent.voice?.speak(text, {
  speaker: "en-US-GuyNeural", // override per-call
});

if (audioStream) {
  // Save to file
  await pipeline(audioStream, createWriteStream("output.mp3"));
  console.log("Audio saved to output.mp3");
}

Real-Time Voice Workflows

The ModelsLab provider supports Mastra's streaming workflow architecture. Here's a pattern for an agent that reads its responses aloud during generation:

import { Mastra } from "@mastra/core";
import { ModelsLabVoice } from "@mastra/voice-modelslab";

const mastra = new Mastra({
  agents: { voiceAgent: agent },
});

// In a workflow step:
async function speakingStep({ context }) {
  const result = await context.agents.voiceAgent.generate(
    context.input.prompt
  );

  // Speak the result
  const stream = await context.agents.voiceAgent.voice?.speak(result.text);

  return {
    text: result.text,
    audioStream: stream,
  };
}
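The workflow step above returns the raw audio stream. Before handing that stream to an HTTP response or an audio player you often need it as a single buffer. Here's a small helper sketch using only Node's standard library; it assumes speak() returns a standard Node Readable, which is how the other Mastra voice providers behave:

```typescript
import { Readable } from "stream";

// Collect a Node Readable (like the one speak() returns) into one Buffer,
// e.g. to send as an HTTP response body or pass to an audio player.
async function streamToBuffer(stream: Readable): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    // Chunks may arrive as Buffers or strings depending on the stream.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}
```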

Listing Available Speakers

ModelsLab's TTS API supports dozens of neural voices. Fetch them dynamically:

const speakers = await voice.getSpeakers();
console.log(speakers);
// [
//   { voiceId: "en-US-AriaNeural", name: "Aria (US English)" },
//   { voiceId: "en-US-GuyNeural", name: "Guy (US English)" },
//   { voiceId: "en-GB-SoniaNeural", name: "Sonia (British English)" },
//   ... and more
// ]
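Since getSpeakers() returns a plain array, narrowing it down is ordinary list filtering. A small sketch, assuming the voiceId values follow the "<locale>-<Name>Neural" convention shown above:

```typescript
interface Speaker {
  voiceId: string;
  name: string;
}

// Keep only the voices for a given locale prefix, e.g. "en-US" or "en-GB".
function speakersForLocale(speakers: Speaker[], locale: string): Speaker[] {
  return speakers.filter((s) => s.voiceId.startsWith(locale));
}
```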

Environment Variables

The provider reads your API key from MODELSLAB_API_KEY. Add this to your .env:

MODELSLAB_API_KEY=your_key_here

Get your key at modelslab.com/dashboard — the free tier includes enough credits to prototype your voice agent.
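A missing key tends to surface as a confusing API error deep inside a request. One way to fail fast instead is a small guard before constructing the provider; requireEnv is a hypothetical helper name, not part of the Mastra or ModelsLab SDKs:

```typescript
// Fail fast with a clear error if a required environment variable is missing.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage sketch:
// const voice = new ModelsLabVoice({ apiKey: requireEnv("MODELSLAB_API_KEY") });
```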

Why ModelsLab for Mastra TTS?

There are several TTS providers available in Mastra. Here's how ModelsLab compares:

  • Price: ModelsLab's TTS API is among the most cost-effective for high-volume applications — pay-per-character, no monthly minimums.
  • Latency: Designed for API-first workloads, with response times tuned for agentic systems (not just consumer UIs).
  • Models: Access to multiple voice synthesis models from a single API key — same key you use for image, video, and LLM APIs.
  • No lock-in: Mastra's common MastraVoice interface means you can swap providers without changing your agent code.
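The no-lock-in point can be sketched with simplified types. This is illustrative only, not the actual MastraVoice definition: any provider implementing the same two methods can be swapped in without touching agent-side code.

```typescript
import { Readable } from "stream";

// Simplified stand-in for Mastra's common voice interface (illustrative).
interface VoiceProvider {
  speak(text: string, options?: { speaker?: string }): Promise<Readable>;
  getSpeakers(): Promise<{ voiceId: string; name: string }[]>;
}

// A stub provider; a real one would call the ModelsLab TTS endpoint.
class StubVoice implements VoiceProvider {
  async speak(text: string): Promise<Readable> {
    return Readable.from([Buffer.from(`audio:${text}`)]);
  }
  async getSpeakers() {
    return [{ voiceId: "en-US-AriaNeural", name: "Aria (US English)" }];
  }
}

// Agent-side code depends only on the interface, never a concrete provider,
// so switching from one TTS vendor to another is a one-line change.
async function narrate(voice: VoiceProvider, text: string): Promise<Readable> {
  return voice.speak(text);
}
```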

Full Example: A Narrating Research Agent

Put it all together: an agent that researches a topic and narrates its findings:

import { Agent } from "@mastra/core/agent";
import { ModelsLabVoice } from "@mastra/voice-modelslab";
import { createWriteStream } from "fs";
import { pipeline } from "stream/promises";
import { anthropic } from "@ai-sdk/anthropic";

const researchAgent = new Agent({
  name: "NarratingResearcher",
  instructions: `You are a research assistant. When given a topic, provide a 
  concise 3-sentence summary suitable for text-to-speech narration. 
  Avoid markdown, bullet points, or special characters.`,
  model: anthropic("claude-3-5-sonnet-20241022"),
  voice: new ModelsLabVoice({
    apiKey: process.env.MODELSLAB_API_KEY!,
    speaker: "en-US-AriaNeural",
  }),
});

async function narrateTopic(topic: string) {
  console.log(`Researching: ${topic}`);
  
  const result = await researchAgent.generate(
    `Give me a 3-sentence summary of: ${topic}`
  );

  console.log(`Text: ${result.text}`);
  
  const audioStream = await researchAgent.voice?.speak(result.text);
  if (audioStream) {
    const filename = `${topic.replace(/\s+/g, "-")}.mp3`;
    await pipeline(audioStream, createWriteStream(filename));
    console.log(`Narration saved: ${filename}`);
  }
}

narrateTopic("how REST APIs work");

Troubleshooting

Audio stream is empty

Check that your MODELSLAB_API_KEY is valid and has TTS credits. Test it directly:

curl -X POST https://modelslab.com/api/v6/voice/text_to_audio \
  -H "Content-Type: application/json" \
  -d '{"key":"YOUR_KEY","prompt":"Hello from ModelsLab","language":"en","voice_id":"en-US-AriaNeural"}'

TypeScript errors on agent.voice

Make sure you're on Mastra 0.10.0+, the release that includes this provider. The voice property is typed as MastraVoice | undefined, so use optional chaining (agent.voice?.speak()).
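If you'd rather check once than sprinkle ?. through your code, a TypeScript assertion function narrows the type up front. The sketch below uses a simplified stand-in for the agent type; assertHasVoice is a hypothetical helper, not part of Mastra:

```typescript
// Simplified stand-in for an agent with an optional voice provider.
interface AgentLike {
  voice?: { speak(text: string): Promise<unknown> };
}

// After this assertion passes, TypeScript knows agent.voice is defined,
// so callers can drop the optional chaining.
function assertHasVoice<T extends AgentLike>(
  agent: T
): asserts agent is T & { voice: NonNullable<T["voice"]> } {
  if (!agent.voice) {
    throw new Error("Agent has no voice provider attached");
  }
}
```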

Next Steps

Now that your Mastra agent can speak, explore what else ModelsLab's API can do. The same key covers image, video, audio, and chat generation, so it's a natural fit for this kind of multi-modal agentic work. Get your free API key and start building.
