Inworld Text To Speech

Ultra-realistic, low-latency voice cloning with support for 11 languages, instant and professional cloning, 48 kHz audio, fine emotional control, and API access. Ideal for dynamic, expressive AI interactions.

inworld-tts-1

Closed Source Model

Related Models

eleven_sound_effect: Closed Source Model
song extender: Closed Source Model
Qwen Text to Speech: Open Source Model
Modi: Open Source Model
Kanye West: Open Source Model
Shreya Ghoshal: Open Source Model
Mina: Open Source Model
eleven_multilingual_v2: Closed Source Model
Elevenlabs Voice Changer: Closed Source Model
Music: Closed Source Model
Baldi: Open Source Model
Song generation: Closed Source Model
scribe_v1: Closed Source Model
song inpaint: Closed Source Model
Voice isolation: Open Source Model
epiCRealism XL - VXVII - CrystalClear (Realism): Open Source Model
Elmesia El-Ru Sarion - That Time I Got Reincarnated as a Slime 1.5/Pony - Elmesia El-Ru Sarion: Open Source Model
Personal CKPT - v4.0- PersonalxAnimagine: Open Source Model

Open Source Alternatives

Explore open-source models that offer similar capabilities with full transparency and flexibility.

Modi
Kanye West
Baldi
Shreya Ghoshal
Mina
Voice isolation

Technical Specifications

Model ID: inworld-tts-1
Provider: Inworld
Category: Audio Models
Task: Voice Cloning
Price: $6 per million characters
Added: August 7, 2025

Key Features

AI voice synthesis and text-to-speech
Multiple language and accent support
Voice cloning from short audio samples
Real-time audio processing via API
Customizable speech parameters

Quick Start

Integrate Inworld Text To Speech into your application with a single API call. You need an API key, which you can get from the pricing page.

import json

import requests

url = "https://modelslab.com/api/v7/voice/text-to-speech"

headers = {
    "Content-Type": "application/json"
}

# Request body: the model to call, the prompt text, and your API key.
data = {
    "model_id": "inworld-tts-1",
    "prompt": "your prompt here",
    "key": "YOUR_API_KEY"
}

try:
    # A timeout keeps the script from hanging if the server never responds.
    response = requests.post(url, headers=headers, json=data, timeout=60)
    response.raise_for_status()  # Raises an HTTPError for bad responses (4XX or 5XX)
    result = response.json()
    print("API Response:")
    print(json.dumps(result, indent=2))
except requests.exceptions.HTTPError as http_err:
    # The failed response is attached to the exception itself.
    print(f"HTTP error occurred: {http_err} - {http_err.response.text}")
except Exception as err:
    print(f"Other error occurred: {err}")
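To keep the API key out of source code, the request body can be assembled by a small helper that reads the key from the environment. This is only a minimal sketch: the `MODELSLAB_API_KEY` variable name and the `build_payload` helper are hypothetical, and only the `model_id`, `prompt`, and `key` fields are taken from the request above.

```python
import os

MODEL_ID = "inworld-tts-1"


def build_payload(prompt, api_key=None):
    """Return the JSON body for a text-to-speech request.

    The key is taken from `api_key` if given, otherwise from the
    MODELSLAB_API_KEY environment variable (a hypothetical name chosen
    for this sketch, not one defined by the API).
    """
    key = api_key or os.environ.get("MODELSLAB_API_KEY")
    if not key:
        raise ValueError("No API key supplied")
    if not prompt or not prompt.strip():
        raise ValueError("Prompt must contain text")
    return {"model_id": MODEL_ID, "prompt": prompt, "key": key}
```

The returned dictionary can be passed as the `json=` argument of `requests.post` in place of the hard-coded `data` dictionary.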

Pricing

Inworld Text To Speech API costs $6 per million characters. Pay only for what you use, with no minimum commitment.
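As a rough illustration of the per-character pricing, the sketch below estimates the cost of a piece of text. It assumes that every character in the string is billed, which is an assumption here; how whitespace or markup is counted is not stated on this page. The `estimate_cost_usd` helper is hypothetical.

```python
PRICE_PER_MILLION_CHARS_USD = 6.0  # price stated on this page


def estimate_cost_usd(text):
    """Estimate the cost of synthesizing `text`.

    Assumes billing by len(text); the real character-counting rules
    are not documented here.
    """
    return len(text) * PRICE_PER_MILLION_CHARS_USD / 1_000_000


# Example: a 50,000-character script
print(f"${estimate_cost_usd('x' * 50_000):.2f}")  # prints $0.30
```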

Use Cases

Voice-over production for video content
Podcast and audiobook narration
Multilingual customer support automation
Interactive voice response (IVR) systems
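For long-form narration such as audiobooks, the text may need to be sent as several requests. The sketch below groups sentences into chunks under a size limit. The `split_into_chunks` helper and the 1,000-character default are placeholders chosen for illustration; this page does not state a per-request limit.

```python
import re


def split_into_chunks(text, max_chars=1000):
    """Group sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept whole as its own
    chunk rather than being cut mid-sentence.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be used as the `prompt` of a separate request, with the resulting audio segments joined in order.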

Inworld Text To Speech FAQ