LiteLLM Now Supports ModelsLab Video Generation
If you're already using LiteLLM to route AI requests, you can now generate videos through ModelsLab's API without restructuring your integration. LiteLLM merged support for ModelsLab video generation in PR #22456, meaning you can generate AI video clips with the same litellm.video_generation() interface you use for Sora or Veo, just by swapping the model string.
This tutorial shows you exactly how to do it: from pip install to your first generated video in under 10 minutes.
Why Use LiteLLM for Video Generation?
LiteLLM is a widely used library for developers who need to call multiple AI providers through one interface. The benefits for video generation specifically:
- One codebase, multiple providers: Switch between ModelsLab, Sora, Kling, and Veo by changing one string
- Built-in async polling: LiteLLM handles the job queue pattern (submit → poll → download) that all video APIs use
- Cost tracking: LiteLLM logs token/generation costs per call — useful for billing users
- Fallback chains: If one provider's queue is full, auto-route to the next
- OpenAI-compatible proxy: Run LiteLLM Proxy and every AI agent in your stack gets video generation through a single endpoint
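The provider-swap claim above comes down to changing one model string. A minimal sketch of how you might structure that in application code (the `PROVIDER_MODELS` mapping and `video_request_kwargs()` helper are illustrative, not part of LiteLLM):

```python
# Map a short provider label to the model string LiteLLM expects.
# "modelslab/text-to-video" and "openai/sora-2" come from this article;
# any other entries you add are your own assumption.
PROVIDER_MODELS = {
    "modelslab": "modelslab/text-to-video",
    "sora": "openai/sora-2",
}

def video_request_kwargs(provider: str, prompt: str, seconds: int = 4) -> dict:
    """Build the kwargs for litellm.video_generation() for a given provider."""
    return {
        "model": PROVIDER_MODELS[provider],
        "prompt": prompt,
        "seconds": seconds,
        "size": "512x512",
    }
```

Switching providers is then `litellm.video_generation(**video_request_kwargs("sora", prompt))` instead of `"modelslab"` — the rest of the call site stays identical.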
Prerequisites
- Python 3.9+
- A ModelsLab API key — get one free at modelslab.com
- LiteLLM 1.65.0+ (the ModelsLab video provider landed in this release)
```shell
pip install "litellm>=1.65.0"
```
Basic Video Generation
Set your API key and generate your first clip:
```python
import litellm
import os

os.environ["MODELSLAB_API_KEY"] = "your-modelslab-key-here"

response = litellm.video_generation(
    model="modelslab/text-to-video",
    prompt="A golden retriever running through autumn leaves in slow motion, cinematic",
    seconds=4,
    size="512x512"
)

print(f"Job ID: {response.id}")
print(f"Status: {response.status}")
```
ModelsLab video generation is asynchronous — the initial response gives you a job ID and a status of `"processing"`. You need to poll until it completes.
Polling for Completion
LiteLLM includes a `video_status()` helper for exactly this:
```python
import litellm
import time

def generate_video(prompt: str, seconds: int = 4) -> bytes:
    """Generate a video and return the raw bytes when ready."""
    # Submit the job
    response = litellm.video_generation(
        model="modelslab/text-to-video",
        prompt=prompt,
        seconds=seconds,
        size="512x512"
    )
    job_id = response.id
    print(f"Submitted job: {job_id}")

    # Poll until done
    while True:
        status = litellm.video_status(video_id=job_id)
        print(f"Status: {status.status}")
        if status.status == "completed":
            break
        elif status.status == "failed":
            raise RuntimeError(f"Video generation failed: {status}")
        time.sleep(10)

    # Download the video
    video_bytes = litellm.video_content(video_id=job_id)
    return video_bytes

# Run it
video = generate_video("A timelapse of city lights at night, aerial view")
with open("output.mp4", "wb") as f:
    f.write(video)
print("Saved to output.mp4")
```
Available ModelsLab Video Models
ModelsLab exposes several video generation backends through the LiteLLM integration:
- `modelslab/text-to-video` — Fast, general-purpose text-to-video
- `modelslab/text-to-video-v2` — Higher quality, longer generation time
- `modelslab/image-to-video` — Animate a still image into a clip
- `modelslab/wan2.2-text-to-video` — Wan 2.2 model (open source, high quality)
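Choosing between these model strings at runtime can be a small dispatch helper. An illustrative sketch (`pick_model()` is not part of LiteLLM; it just encodes the trade-offs listed above):

```python
from typing import Optional

def pick_model(image_url: Optional[str] = None, high_quality: bool = False) -> str:
    """Return a ModelsLab model string based on the request shape:
    an input image forces image-to-video; otherwise trade speed for quality."""
    if image_url is not None:
        return "modelslab/image-to-video"
    if high_quality:
        return "modelslab/text-to-video-v2"
    return "modelslab/text-to-video"
```

The returned string goes straight into the `model=` argument of `litellm.video_generation()`.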
Using LiteLLM Proxy for Team Access
If you're running LiteLLM Proxy (the router gateway), add ModelsLab video to your `config.yaml`:
```yaml
model_list:
  - model_name: modelslab-video
    litellm_params:
      model: modelslab/text-to-video
      api_key: os.environ/MODELSLAB_API_KEY

  # Fallback chain: try ModelsLab first, then Sora
  - model_name: video-gen
    litellm_params:
      model: modelslab/text-to-video
      api_key: os.environ/MODELSLAB_API_KEY
  - model_name: video-gen
    litellm_params:
      model: openai/sora-2
      api_key: os.environ/OPENAI_API_KEY
```
Now any service in your stack can hit `POST /video_generation` on the proxy with `model: "video-gen"` — and LiteLLM handles the routing automatically. If ModelsLab's queue is busy, it falls back to Sora without your app knowing.
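From a service's point of view, that call is a plain HTTP POST. A standard-library sketch (the proxy base URL, the virtual key, and the exact response shape are assumptions about your deployment; `"video-gen"` is the routing group name from the config above):

```python
import json
from urllib import request

PROXY_BASE = "http://localhost:4000"  # wherever your LiteLLM Proxy listens (assumption)

def build_video_request(prompt: str, seconds: int = 4) -> dict:
    """Payload for POST /video_generation on the proxy."""
    return {"model": "video-gen", "prompt": prompt, "seconds": seconds}

def submit_via_proxy(prompt: str, api_key: str) -> dict:
    """Submit a video job through the proxy and return the parsed JSON response."""
    req = request.Request(
        f"{PROXY_BASE}/video_generation",
        data=json.dumps(build_video_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```

The service never learns which provider served the request; swapping or reordering providers stays a config-file change on the proxy.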
Image Generation and TTS Are Also Supported
The same release that added video support also merged ModelsLab image generation (PR #21760) and text-to-speech (PR #22458). So if you're using LiteLLM for the full multimodal stack:
```python
# Image generation
from litellm import image_generation

response = image_generation(
    model="modelslab/realistic-vision-v6",
    prompt="A developer typing code at a standing desk, soft light",
    n=1,
    size="1024x1024"
)
image_url = response.data[0].url

# Text-to-speech
from litellm import speech

response = speech(
    model="modelslab/eleven-multilingual-v3",
    input="Welcome to your AI-powered video studio.",
    voice="alloy"
)
audio_bytes = response.content
```
Pricing
ModelsLab video generation costs approximately $0.01–$0.04 per 4-second clip depending on model and resolution — significantly cheaper than Sora ($0.15/clip) or Kling ($0.08/clip). For high-volume applications, that gap compounds quickly.
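At the per-clip prices quoted above, the gap is easy to quantify. A back-of-envelope sketch (using the $0.04 upper bound for ModelsLab):

```python
# Per-clip prices from the figures quoted in this article (USD).
PER_CLIP = {"modelslab": 0.04, "kling": 0.08, "sora": 0.15}

def monthly_cost(provider: str, clips_per_day: int, days: int = 30) -> float:
    """Estimated monthly spend on video clips for one provider."""
    return round(PER_CLIP[provider] * clips_per_day * days, 2)
```

At 100 clips/day that's $120/month on ModelsLab versus $450/month on Sora — a 3.75x difference before volume discounts.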
Use LiteLLM's built-in cost tracking to monitor spend:
```python
import litellm

litellm.success_callback = ["langfuse"]  # or your preferred logging tool

response = litellm.video_generation(
    model="modelslab/text-to-video",
    prompt="Ocean waves at sunset",
    seconds=4
)
# Cost is logged automatically
```
Get Started
ModelsLab's full video generation API — including WAN 2.2, image-to-video, and custom LoRA models — is available directly at modelslab.com/models/video-generation. If you're not using LiteLLM, you can call the API directly with a standard POST request. The LiteLLM integration just makes it portable across your multi-provider setup.
API docs, code samples, and free trial credits: modelslab.com/dashboard.