---
title: Grok Imagine Text To Video API | ModelsLab
description: Generate stunning videos with native audio from text or images; up to 15s duration, 720p/480p resolution, 8 aspect ratios, 24fps—ideal for social clips.
url: https://modelslab.com/models/xai/grok-imagine-text-to-video.md
canonical: https://modelslab.com/models/xai/grok-imagine-text-to-video.md
type: product
component: Playground/Endpoint/Index
generated_at: 2026-04-08T10:11:21.132151Z
---

[![Grok Imagine Text To Video thumbnail](https://assets.modelslab.ai/api-logos/01KMSWGVZBFDSH0VHDHYBM8GM4.jpg)](https://modelslab.com/models/xai)

Grok Imagine Text To Video
---

[by xAI](https://modelslab.com/models/xai)

Generate stunning videos with native audio from text or images: up to 15 s long, at 720p or 480p, in 8 aspect ratios, at 24 fps. Ideal for social clips and ads.

`grok-imagine-video-t2v`

Closed Source Model [LLMs.txt](https://modelslab.com/models/xai/grok-imagine-video-t2v/llms.txt)

[API Playground](/models/xai/grok-imagine-text-to-video) [API Documentation](/models/xai/grok-imagine-text-to-video/api)

Input
---

Prompt 

A dramatic, high-intensity street scene where two massive bulls are fighting violently in the middle of a narrow urban street, dust and debris flying into the air, intense energy and raw power. The bulls clash head-to-head, horns locked, muscles flexing, hooves smashing against the road. Nearby buildings shake slightly from the impact. Shocked bystanders stand at a safe distance, some recording on their phones. Sunlight streams through the narrow street, creating cinematic lighting and dramatic shadows. Slow-motion moments show dust particles and sweat flying. Ultra-realistic, hyper-detailed textures, dynamic camera movement, cinematic depth of field, motion blur, dramatic atmosphere, action-packed, realistic physics, handheld camera feel, natural lighting, epic cinematic style.

Video generation cost per second: **480p: $0.06/sec; 720p: $0.084/sec**
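As a rough illustration of how the per-second pricing adds up, the sketch below estimates the cost of a single clip. It assumes billing is strictly linear in clip length, with no minimum charge or rounding, which this page does not confirm. The `estimate_cost` helper and the constant names are hypothetical and not part of any ModelsLab SDK; the rates and the 15-second limit are the ones quoted on this page.

```python
from decimal import Decimal

# Per-second rates quoted on this page (USD).
RATE_PER_SECOND = {
    "480p": Decimal("0.06"),
    "720p": Decimal("0.084"),
}

# Maximum clip length stated in the model description.
MAX_DURATION_SECONDS = 15


def estimate_cost(duration_seconds: int, resolution: str) -> Decimal:
    """Estimate the USD cost of one clip (hypothetical helper).

    Assumes cost = rate * duration, with no minimum charge or rounding.
    """
    key = resolution.lower()
    if key not in RATE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if not 0 < duration_seconds <= MAX_DURATION_SECONDS:
        raise ValueError(
            f"duration must be greater than 0 and at most {MAX_DURATION_SECONDS} seconds"
        )
    return RATE_PER_SECOND[key] * duration_seconds


if __name__ == "__main__":
    # Print the estimated cost of a maximum-length clip at each resolution.
    for res in RATE_PER_SECOND:
        print(res, estimate_cost(MAX_DURATION_SECONDS, res))
```

`Decimal` is used instead of `float` so that currency arithmetic stays exact. Under the linear-billing assumption, a full 15-second clip works out to $0.90 at 480p and $1.26 at 720p.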

Open Source Alternatives
---

Explore open-source models that offer similar capabilities with full transparency and flexibility.

[View all open source models](https://modelslab.com/models?feature=video&provider=open-source-models)

- [SVD](https://modelslab.com/models/modelslab/svd) by [ModelsLab](https://modelslab.com/models/modelslab) (open source model, popular)
- [CogVideoX](https://modelslab.com/models/modelslab/cogvideox) by [ModelsLab](https://modelslab.com/models/modelslab) (open source model, popular)
- [wan2.1](https://modelslab.com/models/modelslab/wan2.1) by [ModelsLab](https://modelslab.com/models/modelslab) (open source model)

---

*This markdown version is optimized for AI agents and LLMs.*

**Links:**
- [Website](https://modelslab.com)
- [API Documentation](https://docs.modelslab.com)
- [Blog](https://modelslab.com/blog)

---
*Generated by ModelsLab - 2026-04-08*