LiteLLM Text-to-Speech: Using ModelsLab as Your TTS Provider

Adhik Joshi | 5 min read | API


LiteLLM's unified API router now supports text-to-speech (TTS) providers, and ModelsLab's Voice API is a clean fit. If you're already routing LLM calls through LiteLLM, you can add voice synthesis to the same stack without a separate SDK.

This guide walks through the complete setup: provider configuration, API call structure, multi-voice routing, and how ModelsLab's voice models compare on latency and quality for production use.

Why Route TTS Through LiteLLM?

LiteLLM's value for TTS is the same as for LLMs: a single API call format that works across providers, plus fallback routing, cost tracking, and observability via Langfuse or other tools. If your app already uses LiteLLM for GPT-4o or Claude, adding voice synthesis through the same proxy means unified logging, rate limit handling, and billing visibility.

For teams running production AI apps, having TTS in the same observability stack as your LLM calls makes debugging faster. You can trace a full request chain from input text to voice output in one place.

Prerequisites

Before you start, you'll need:

  • A ModelsLab API key (from modelslab.com/api)
  • LiteLLM installed with proxy support (pip install 'litellm[proxy]')
  • Python 3.8+ with the openai SDK installed for client-side calls

Step 1: Add ModelsLab as a TTS Provider in LiteLLM Config

In your litellm_config.yaml, add the ModelsLab TTS provider under the model_list:

model_list:
  - model_name: modelslab-tts
    litellm_params:
      model: modelslab/text_to_audio
      api_key: os.environ/MODELSLAB_API_KEY
      api_base: https://modelslab.com/api/v6/voice

  - model_name: modelslab-voice-clone
    litellm_params:
      model: modelslab/voice_clone
      api_key: os.environ/MODELSLAB_API_KEY
      api_base: https://modelslab.com/api/v6/voice

Set your environment variable:

export MODELSLAB_API_KEY="your_key_here"

Start the proxy:

litellm --config litellm_config.yaml --port 4000

Step 2: Make a TTS Request via LiteLLM

Once the proxy is running, call the OpenAI-compatible TTS endpoint:

from openai import OpenAI

client = OpenAI(
    api_key="your-litellm-virtual-key",
    base_url="http://localhost:4000"
)

response = client.audio.speech.create(
    model="modelslab-tts",
    voice="alloy",
    input="Hello, this is a test of ModelsLab TTS via LiteLLM.",
    response_format="mp3"
)

response.stream_to_file("output.mp3")
print("Audio saved to output.mp3")

The call goes through LiteLLM's proxy, gets routed to ModelsLab's Voice API, and returns an MP3 stream.
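TTS providers typically cap how much text a single request can carry, so long inputs are worth splitting at sentence boundaries before sending. A minimal pre-chunker; the 500-character default is an assumption, not a documented ModelsLab limit:

```python
import re

def chunk_text(text, max_chars=500):
    """Split text into chunks of at most max_chars, breaking at sentence ends.

    A single sentence longer than max_chars is kept whole rather than cut mid-word.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk then goes through `client.audio.speech.create` as above, and the resulting MP3 segments are concatenated afterward.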

Step 3: Direct ModelsLab Voice API (Without Proxy)

If you prefer calling ModelsLab's Voice API directly:

import requests
import time

API_KEY = "your_modelslab_api_key"

def text_to_speech(text, voice_id="en-US-Neural2-A"):
    """Submit a TTS job and return the audio URL once it is ready."""
    response = requests.post(
        "https://modelslab.com/api/v6/voice/text_to_audio",
        headers={"Content-Type": "application/json"},
        json={
            "key": API_KEY,
            "prompt": text,
            "language": "en",
            "speaker": voice_id,
            "output_format": "mp3",
            "speed": 1.0,
            "webhook": None,
            "track_id": None
        },
        timeout=30
    )
    response.raise_for_status()
    result = response.json()
    if result.get("status") == "processing":
        # Longer inputs are queued; poll the fetch endpoint until done.
        return poll_for_audio(result.get("id"))
    elif result.get("status") == "success":
        return result.get("output", [None])[0]
    else:
        raise Exception(f"TTS failed: {result}")

def poll_for_audio(fetch_id, max_attempts=30):
    """Poll the fetch endpoint every 3 seconds until the audio is ready."""
    for attempt in range(max_attempts):
        time.sleep(3)
        resp = requests.post(
            "https://modelslab.com/api/v6/voice/fetch",
            headers={"Content-Type": "application/json"},
            json={"key": API_KEY, "request_id": str(fetch_id)},
            timeout=30
        )
        result = resp.json()
        if result.get("status") == "success":
            return result.get("output", [None])[0]
        if result.get("status") == "error":
            raise Exception(f"TTS failed during processing: {result}")
    raise TimeoutError("Audio generation timed out")

audio_url = text_to_speech("Welcome to ModelsLab Voice API.")
print(f"Audio URL: {audio_url}")
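The fixed 3-second polling interval above works well for short clips; for longer generations, a capped exponential backoff reduces wasted fetch calls. A sketch of the wait schedule (the base and cap values here are arbitrary, tune them to your workload):

```python
def backoff_delays(base=1.0, cap=15.0, attempts=10):
    """Return exponentially growing delays, capped so no single wait stalls too long."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]
```

To use it, iterate over `backoff_delays()` inside `poll_for_audio` and call `time.sleep(delay)` instead of the fixed `time.sleep(3)`.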

Multi-Voice Configuration in LiteLLM

LiteLLM lets you define multiple ModelsLab voice endpoints and route between them:

model_list:
  - model_name: modelslab-narration
    litellm_params:
      model: modelslab/text_to_audio
      api_key: os.environ/MODELSLAB_API_KEY
      api_base: https://modelslab.com/api/v6/voice

  - model_name: modelslab-branded-voice
    litellm_params:
      model: modelslab/voice_clone
      api_key: os.environ/MODELSLAB_API_KEY
      api_base: https://modelslab.com/api/v6/voice

router_settings:
  routing_strategy: cost-based-routing
  model_group_alias:
    tts-default: modelslab-narration
    tts-branded: modelslab-branded-voice

Your application code stays clean:

response = client.audio.speech.create(
    model="tts-default",
    input="This will use the narration voice.",
    voice="alloy"
)
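Because aliases decouple voice choice from provider config, application code can pick a voice per context with a plain lookup. A small dispatcher (the use-case names are hypothetical):

```python
# Map application contexts to the LiteLLM model group aliases defined above.
# The use-case keys are hypothetical examples.
VOICE_ALIASES = {
    "narration": "tts-default",
    "branded": "tts-branded",
}

def voice_model_for(use_case):
    """Resolve a use case to its LiteLLM alias, falling back to the default voice."""
    return VOICE_ALIASES.get(use_case, "tts-default")
```

Then `client.audio.speech.create(model=voice_model_for("branded"), ...)` picks the cloned voice, while any unknown context falls back to standard narration.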

Cost Tracking and Observability

LiteLLM logs TTS requests alongside LLM calls in its spend tracking dashboard:

curl "http://localhost:4000/spend/logs?model=modelslab-tts"

For Langfuse integration, add to your config:

litellm_settings:
  success_callback: ["langfuse"]
  failure_callback: ["langfuse"]

environment_variables:
  LANGFUSE_PUBLIC_KEY: "pk-lf-..."
  LANGFUSE_SECRET_KEY: "sk-lf-..."

Every TTS call gets traced with input text length, latency, cost, and output URL.

ModelsLab Voice Models Available

  • text_to_audio — Standard TTS with 20+ voices across 12 languages
  • voice_clone — Clone any voice from a 10-second audio sample
  • text_to_audio_with_sound — TTS with optional background audio mixing

All three are accessible via the same /api/v6/voice base URL with different endpoint suffixes.
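Since all three share one base URL, endpoint construction reduces to appending a suffix. A small helper makes that explicit (the endpoint names match the list above):

```python
VOICE_API_BASE = "https://modelslab.com/api/v6/voice"

# The three voice endpoints listed above, keyed by suffix.
VOICE_MODELS = {"text_to_audio", "voice_clone", "text_to_audio_with_sound"}

def voice_endpoint(model):
    """Build the full endpoint URL for a ModelsLab voice model suffix."""
    if model not in VOICE_MODELS:
        raise ValueError(f"Unknown voice model: {model}")
    return f"{VOICE_API_BASE}/{model}"
```

This keeps the base URL in one place if the API version ever changes.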

Performance and Pricing

ModelsLab's Voice API processes most requests in 3-8 seconds, with sub-2-second latency for short text inputs under 100 characters. Pricing is per-character, making it competitive for production workloads that need volume.

Compared to ElevenLabs ($0.30/1K characters) and OpenAI TTS ($0.015/1K characters), ModelsLab's voice pricing sits in the mid-range with the advantage of being part of a unified multi-modal API that also covers image and video generation.
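To compare providers on your own traffic, a quick per-request estimate from character counts is enough. The ElevenLabs and OpenAI rates below come from the figures above; the ModelsLab rate is a placeholder assumption, so check current pricing at modelslab.com:

```python
# Cost per 1,000 characters, in USD. The ModelsLab figure is a placeholder
# assumption; see modelslab.com for current pricing.
RATES_PER_1K_CHARS = {
    "elevenlabs": 0.30,
    "openai-tts": 0.015,
    "modelslab": 0.10,  # placeholder
}

def estimate_cost(text, provider):
    """Estimate the TTS cost of synthesizing a string with a given provider."""
    return len(text) / 1000 * RATES_PER_1K_CHARS[provider]
```

For example, a 10,000-character script would cost about $3.00 on ElevenLabs versus $0.15 on OpenAI TTS at those rates.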

Next Steps

Full API documentation: docs.modelslab.com/voice/text-to-audio

Get your ModelsLab API key: modelslab.com/api

For RealtimeTTS integration (streaming voice synthesis for real-time applications), see the RealtimeTTS GitHub repository — ModelsLab support was added in PR #365.
