Wan2.6 Text To Video
Cinematic video. Fifteen seconds.
Generate. Sync. Storytell.
Multi-Shot Generation
Intelligent Scene Planning
Automatically splits complex prompts into distinct shots while maintaining character consistency and visual continuity.
Native Audio Sync
Phoneme-Level Lip Sync
Generates facial micro-expressions and lip movements aligned perfectly with input audio or text-to-speech scripts.
Extended Duration
Up to Fifteen Seconds
Create fuller narratives in a single generation with expanded temporal and spatial capacity at 1080p resolution.
Examples
See what Wan2.6 Text To Video can create
Copy any prompt below and try it yourself in the playground.
Urban Timelapse
“Cinematic timelapse of a modern city skyline at sunset. Shot 1: Wide establishing shot of downtown towers with golden hour light. Shot 2: Close-up tracking through busy street with neon signs reflecting on wet pavement. Shot 3: Aerial view of traffic flowing through intersection. Smooth camera movements, film grain, professional color grading.”
Product Showcase
“Luxury watch product reveal. Shot 1: Macro close-up of watch face with light reflecting off crystal. Shot 2: Slow 360-degree rotation on white studio background. Shot 3: Wrist shot with watch in motion against minimalist interior. Crisp detail, studio lighting, shallow depth of field.”
Nature Documentary
“Mountain landscape sequence. Shot 1: Wide aerial view of snow-capped peaks at dawn. Shot 2: Push-in through misty valley with pine forest. Shot 3: Close-up of flowing alpine stream with rocks. Cinematic color grading, natural lighting, smooth camera transitions.”
Talking Head
“Professional speaking to camera. Person in business attire sits at desk, looks directly at camera and says: 'Welcome to our presentation.' Soft studio lighting, neutral background, natural facial expressions, clear audio sync.”
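The multi-shot prompts above share one pattern: a subject line, numbered "Shot N:" descriptions, and closing style notes. As a rough sketch of how such prompts might be assembled in code, the helper below joins those three parts into a single string. The function name `build_multishot_prompt` and its parameters are hypothetical and are not part of any ModelsLab SDK; the "Shot N:" wording is taken from the examples on this page, not from a documented prompt grammar.

```python
from typing import Sequence


def build_multishot_prompt(subject: str, shots: Sequence[str], style: str = "") -> str:
    """Join a subject line, numbered shots, and optional style notes into one prompt.

    Hypothetical helper: it only formats text in the same pattern as the
    example prompts on this page.
    """
    if not shots:
        raise ValueError("at least one shot description is required")

    # Normalize each part so it ends with exactly one period.
    parts = [subject.strip().rstrip(".") + "."]
    for number, description in enumerate(shots, start=1):
        parts.append(f"Shot {number}: {description.strip().rstrip('.')}.")
    if style.strip():
        parts.append(style.strip().rstrip(".") + ".")
    return " ".join(parts)


prompt = build_multishot_prompt(
    "Mountain landscape sequence",
    [
        "Wide aerial view of snow-capped peaks at dawn",
        "Push-in through misty valley with pine forest",
        "Close-up of flowing alpine stream with rocks",
    ],
    style="Cinematic color grading, natural lighting, smooth camera transitions",
)
print(prompt)
```

Run as written, this reproduces the Nature Documentary prompt text; swapping the shot list is enough to draft a new sequence.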
For Developers
A few lines of code.
Cinematic video. Three lines.
ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.
- Serverless: scales to zero, scales to millions
- Pay per second, no minimums
- Python and JavaScript SDKs, plus REST API
import requests

# Send a text-to-video request with an audio track for lip sync.
response = requests.post(
    "https://modelslab.com/api/v7/video-fusion/text-to-video",
    json={
        "key": "YOUR_API_KEY",  # replace with your ModelsLab API key
        "prompt": "A man talking to the camera from the Great Wall of China, saying: 'Welcome to my vlog. The beautiful views from this place are breathtaking and amazing. You should come here too.'",
        "init_audio": "https://assets.modelslab.ai/generations/74c4f2e6-2fa6-4d8f-a0e3-09ff1a94d9e1.mp3",
    },
)

# Print the parsed JSON response.
print(response.json())
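For anything beyond a quick test, it may help to keep the API key out of source code and to fail loudly on HTTP errors. The sketch below is one possible way to do that, not an official client: the function names `build_payload` and `submit` and the environment variable `MODELSLAB_API_KEY` are assumptions made for illustration, and the payload fields are only the three shown in the request above. The shape of the JSON response is not described on this page, so the sketch returns it unparsed.

```python
import os

# Endpoint taken from the request shown above.
API_URL = "https://modelslab.com/api/v7/video-fusion/text-to-video"


def build_payload(prompt: str, init_audio: str, env_var: str = "MODELSLAB_API_KEY") -> dict:
    """Build the request body, reading the API key from the environment.

    The environment variable name is a hypothetical convention, not one
    defined by ModelsLab.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"set the {env_var} environment variable first")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"key": key, "prompt": prompt, "init_audio": init_audio}


def submit(payload: dict, timeout: float = 60.0) -> dict:
    """POST the payload and return the parsed JSON body.

    Raises requests.HTTPError on a 4xx or 5xx status.
    """
    # Third-party dependency, imported here so build_payload works without it.
    import requests

    response = requests.post(API_URL, json=payload, timeout=timeout)
    response.raise_for_status()
    return response.json()
```

A call would then look like `submit(build_payload("A man talking to the camera ...", "https://.../voice.mp3"))`, with the key exported in the shell beforehand.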
Ready to create?
Start generating with Wan2.6 Text To Video on ModelsLab.