What AI models are available on ModelsLab?

ModelsLab offers 10,000+ AI models including text-to-image (Stable Diffusion, FLUX, SDXL), text-to-video, text-to-audio, 3D generation, voice cloning, and LLM APIs from providers like Alibaba, Google, Meta, and more.

How do I get started with ModelsLab APIs?

Sign up for a free account, get your API key from the dashboard, and start making API calls immediately. We provide SDKs for Python and JavaScript, plus cURL examples for every model.
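
As a rough sketch of what a first call might look like, the Python snippet below builds (but does not send) a JSON POST request using only the standard library. The endpoint URL, the `key` and `prompt` field names, and the `build_request` helper are illustrative assumptions, not taken from the ModelsLab documentation; check docs.modelslab.com for the real endpoint and parameters of the model you choose.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the URL shown on the model's API Documentation tab.
EXAMPLE_ENDPOINT = "https://example.invalid/api/generate"


def build_request(api_key: str, prompt: str,
                  endpoint: str = EXAMPLE_ENDPOINT) -> urllib.request.Request:
    """Build a JSON POST request. The body field names are assumptions."""
    if not api_key:
        raise ValueError("api_key is required (copy it from your dashboard)")
    body = json.dumps({"key": api_key, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("YOUR_API_KEY", "a calm piano melody")
    print(req.get_method(), req.full_url)
    # To send it for real: urllib.request.urlopen(req), then parse the JSON reply.
```

Building the request separately from sending it keeps the example runnable offline and makes the payload easy to inspect before any credits are spent.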

What are the pricing options for ModelsLab?

ModelsLab plans start at $21/month (Basic), with the $149/month Open Source plan covering unlimited generation on open-source models. Enterprise plans with custom pricing and dedicated support are also available.

What programming languages are supported?

ModelsLab APIs work with any language that can make HTTP requests — Python, JavaScript, PHP, Ruby, Go, Rust, Java, and more. We provide official SDKs, CLI tools, and MCP server integrations.

Do you offer API documentation?

Yes, every model on ModelsLab has comprehensive API documentation with code examples, parameter descriptions, and integration guides available at docs.modelslab.com and on each model's API Documentation tab.

Audio & Music Generation AI Models & APIs

Lyria 3 is a cutting-edge AI music generator that creates original 30-second tracks from text prompts and images. Instantly transform your ideas into unique soundscapes.

Closed Source, New Added, Music Production, Best song generation, Best SFX, 30 second output

Create and customize any AI-generated voice you can imagine using a simple text prompt - choose the tone, style, accent, emotion, age, or personality, and instantly turn your words into natural-sounding speech.

Open Source, Voice Designer, Ultra Natural, New Added

The Qwen Text-to-Speech endpoint generates audio from text using a provided audio URL, producing output that mimics the uploaded voice.

Open Source, 3-Sec Voice Clone, Support 10 Languages

Audio Inpaint intelligently reconstructs missing or corrupted portions of an audio clip. Whether you need to remove unwanted noises, repair damaged recordings, or fill silent gaps, the model analyzes the surrounding context to fill them in smoothly.

Closed Source, New Added

This endpoint lets clients extend an existing song or vocal audio track by generating additional material.

Closed Source, New Added

Generate full songs from text, lyrics, or melodies with a latent diffusion-powered AI music model offering tracks up to 4 minutes 45 seconds long, voice control, and seamless editing.

Closed Source, New Added

Eleven Music is cleared for nearly all commercial uses, from film and television to podcasts and social media videos, and from advertisements to gaming.

Closed Source, Music Production, Best song generation

The Text-to-Audio endpoint enables you to generate audio by providing a text input along with a valid audio URL or a pre-created voice using a voice_id. The output is an audio file that mimics the sound of the provided audio URL or the selected voice.

Closed Source, High Quality Output, Support 30+ Languages
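
The input rule described for this endpoint (text plus either an audio URL or a `voice_id`) can be sketched as a small payload builder. This is a minimal illustration, not the documented request schema: only `voice_id` appears in the description, while the `text` and `audio_url` field names and the `build_tts_payload` helper are assumptions.

```python
from typing import Optional


def build_tts_payload(text: str,
                      audio_url: Optional[str] = None,
                      voice_id: Optional[str] = None) -> dict:
    """Assemble a Text-to-Audio payload with at least one voice source."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if audio_url is None and voice_id is None:
        raise ValueError("provide an audio_url or a voice_id")
    payload = {"text": text}  # hypothetical field name
    if voice_id is not None:
        payload["voice_id"] = voice_id  # name taken from the description
    if audio_url is not None:
        payload["audio_url"] = audio_url  # hypothetical field name
    return payload


if __name__ == "__main__":
    print(build_tts_payload("Hello there", voice_id="my-voice"))
```

Validating the voice source on the client side catches the most likely mistake (sending text with no reference voice) before a request is made.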

Generate up to 30 seconds of professional, royalty-free sound effects from text prompts with customizable duration, looping, and multiple MP3 output formats at 44.1 kHz.

Closed Source, Cheapest Price, Best SFX
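
To illustrate the 30-second limit and the duration and looping options mentioned above, here is a hedged sketch of client-side validation. The `prompt`, `duration_seconds`, and `loop` field names and the `build_sfx_payload` helper are hypothetical; the real parameter names are on the model's API Documentation tab.

```python
MAX_SFX_SECONDS = 30  # upper limit stated in the model description


def build_sfx_payload(prompt: str, duration_seconds: float, loop: bool = False) -> dict:
    """Validate inputs and return a sound-effect request payload (assumed field names)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 0 < duration_seconds <= MAX_SFX_SECONDS:
        raise ValueError(f"duration_seconds must be in (0, {MAX_SFX_SECONDS}]")
    return {
        "prompt": prompt,
        "duration_seconds": duration_seconds,
        "loop": loop,
    }


if __name__ == "__main__":
    print(build_sfx_payload("rain on a tin roof", 12, loop=True))
```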

Transform one voice into another using advanced speech-to-speech technology. Perfect for dubbing, content creation, and voice customization without altering the original message.

Closed Source, Best for Creators

The Text-to-Audio endpoint enables you to generate audio by providing a text input along with a valid audio URL or a pre-created voice using a voice_id. The output is an audio file that mimics the sound of the provided audio URL or the selected voice.

Closed Source, Support 30+ Languages, Trending

The endpoint enables automatic voice translation of videos from one language to another. It accepts a video file link and various parameters to control the dubbing process.

Open Source

The SFX endpoint allows you to generate sound effects (SFX) from text prompts. It takes user input in the form of a text prompt to conditionally generate audio effects.

Open Source

Speech-to-Text transforms audio into written transcription, allowing spoken language to be converted into text for various applications.

Open Source

Generate original, genre-specific song lyrics instantly using advanced NLP and machine learning—customize by theme, mood, or language, perfect for musicians and content creators seeking fresh, copyright-free lyrics.

Open Source

Generate high-quality songs in 50+ languages by providing lyrics and reference audio using the ACE-Step v1.5 model, which accurately matches voice, melody, tone, and emotion for professional results.

Open Source
