CogVideoX API Documentation
by ModelsLab

Generates coherent video from text prompts at 720x480 resolution and 8 FPS, in clips of up to 6 seconds. It uses a diffusion transformer and an efficient 3D variational autoencoder to produce smooth motion and keep the output semantically aligned with the prompt.
cogvideox
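
To make the stated limits concrete, here is a minimal sketch of the arithmetic they imply: how many frames a clip of a given length contains at 8 FPS, and how many pixels each 720x480 frame holds. The constant names and the `frame_count` helper are hypothetical and are not part of the ModelsLab API; the frame count the service actually returns may differ slightly from this simple product.

```python
# Limits quoted in the model description (720x480, 8 FPS, up to 6 seconds).
WIDTH, HEIGHT = 720, 480   # output resolution in pixels
FPS = 8                    # frames per second
MAX_SECONDS = 6            # maximum clip length in seconds


def frame_count(seconds: float) -> int:
    """Hypothetical helper: frames in a clip of the given length at FPS."""
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"clip length must be in (0, {MAX_SECONDS}] seconds")
    return round(seconds * FPS)


max_frames = frame_count(MAX_SECONDS)   # frames in the longest allowed clip
pixels_per_frame = WIDTH * HEIGHT       # pixels in a single frame

print(f"max frames: {max_frames}, pixels per frame: {pixels_per_frame}")
```

A check like this can be run on the client before submitting a request, so that an out-of-range clip length is rejected locally.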