How to Use ModelsLab API with Claude Code: Images, Video & Audio

Adhik Joshi
7 min read · API


Claude Code is quickly becoming the go-to AI coding agent for developers who want to ship fast. It writes code, debugs errors, and integrates with your existing workflow in the terminal. But one thing it doesn't do out of the box? Generate images, videos, or audio.

That's where ModelsLab API comes in. This guide walks through exactly how to connect Claude Code to ModelsLab's generation APIs so you can build AI-powered apps that go beyond text.

What You'll Build

By the end of this guide, you'll have a working setup where Claude Code can:

  • Generate images from text prompts using ModelsLab's text-to-image API
  • Create AI videos from descriptions using text-to-video endpoints
  • Convert text to lifelike speech using ModelsLab's TTS API
  • Chain these together in automated generation pipelines

Prerequisites

  • ModelsLab API key — Sign up at modelslab.com (free tier available)
  • Claude Code — Installed via npm install -g @anthropic-ai/claude-code
  • Basic familiarity with REST APIs and JSON
  • Node.js 18+ or Python 3.9+

Step 1: Get Your ModelsLab API Key

Head to your ModelsLab dashboard, navigate to API Keys, and create a new key. Store it securely — you'll reference it as an environment variable throughout this guide:

export MODELSLAB_API_KEY="your_key_here"

ModelsLab's API is pay-per-use with no monthly minimums, which makes it ideal for Claude Code workflows where you're iterating rapidly and don't want to pay for idle capacity.

Step 2: Generate Your First Image with ModelsLab

ModelsLab's text-to-image API supports hundreds of models — from FLUX.1 to Stable Diffusion XL. Here's the base request format Claude Code will work with:

curl -X POST "https://modelslab.com/api/v6/realtime/text2img" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "'"$MODELSLAB_API_KEY"'",
    "prompt": "a photorealistic city skyline at night with neon lights reflecting on rain-wet streets",
    "negative_prompt": "blurry, low quality, distorted",
    "width": "1024",
    "height": "1024",
    "samples": "1",
    "num_inference_steps": "20",
    "safety_checker": "yes",
    "enhance_prompt": "yes",
    "guidance_scale": 7.5
  }'

The response returns a URL to your generated image within seconds. ModelsLab's realtime endpoint is designed for low-latency generation — ideal when you're inside an active coding session and don't want to wait.
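As a sketch of what Claude Code typically writes for this call, here is a standard-library Python version. It assumes the realtime endpoint returns JSON shaped like `{"status": "success", "output": ["<url>"]}`; verify the exact field names against the current ModelsLab docs:

```python
import json
import os
import urllib.request

API_URL = "https://modelslab.com/api/v6/realtime/text2img"

def extract_output_url(response: dict) -> str:
    """Pull the first generated-image URL out of a text2img response.

    Assumes a successful response looks like
    {"status": "success", "output": ["https://..."]}.
    """
    if response.get("status") != "success":
        raise RuntimeError(f"generation failed: {response}")
    outputs = response.get("output") or []
    if not outputs:
        raise RuntimeError("no output URLs in response")
    return outputs[0]

def generate_image(prompt: str, dest: str = "image.png") -> str:
    """Call the realtime text2img endpoint and download the result."""
    payload = json.dumps({
        "key": os.environ["MODELSLAB_API_KEY"],
        "prompt": prompt,
        "width": "1024",
        "height": "1024",
        "samples": "1",
    }).encode()
    req = urllib.request.Request(
        API_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    url = extract_output_url(body)
    urllib.request.urlretrieve(url, dest)  # save the image to disk
    return url
```

Keeping `extract_output_url` as a separate pure function makes the response handling easy to unit-test without hitting the network.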

Using It Inside Claude Code

Open a project in Claude Code and give it a task like this:

claude "Use the ModelsLab text2img API to generate a hero image for a SaaS landing page. 
The product is a developer dashboard. API key is in $MODELSLAB_API_KEY. 
Save the image to ./assets/hero.jpg and return the prompt you used."

Claude Code will write the API call, execute it, download the result, and tell you what prompt it generated. You can iterate from there — "make it darker," "add a developer at a laptop," etc.

Step 3: Video Generation

ModelsLab's text-to-video API lets you generate short video clips from a text description. This is useful for product demos, explainer content, and automated social media assets.

curl -X POST "https://modelslab.com/api/v6/video/text2video" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "'"$MODELSLAB_API_KEY"'",
    "model_id": "cogvideox",
    "prompt": "A developer typing on a laptop in a modern office, code scrolling on multiple screens, cinematic lighting",
    "height": 480,
    "width": 854,
    "num_frames": 49,
    "num_inference_steps": 50
  }'

Video generation is asynchronous — the API returns a fetch URL. Poll it every few seconds until status is success:

curl -X POST "https://modelslab.com/api/v6/video/fetch" \
  -H "Content-Type: application/json" \
  -d '{"key": "'"$MODELSLAB_API_KEY"'", "request_id": "YOUR_REQUEST_ID"}'

Video Prompt in Claude Code

claude "Generate a 5-second product demo video for a data dashboard app using ModelsLab 
text-to-video API. Handle the async polling loop and save to ./demo.mp4 when done. 
API key: $MODELSLAB_API_KEY"

Claude Code handles the polling logic automatically — you can point it at the fetch endpoint and tell it to retry every 5 seconds until it gets a download URL.

Step 4: Text-to-Speech

ModelsLab's TTS API produces natural-sounding voices across dozens of voice profiles. Useful for generating voiceovers, podcast intros, or accessibility audio for your apps.

curl -X POST "https://modelslab.com/api/v6/voice/text_to_audio" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "'"$MODELSLAB_API_KEY"'",
    "prompt": "Welcome to the ModelsLab developer dashboard. Your API usage this month has increased by 47 percent.",
    "language": "en",
    "speaker_id": "en-US-Neural2-J"
  }'

The response includes an output URL pointing to a WAV or MP3 file you can download directly.
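Since the format can be either WAV or MP3, a small helper that keeps whatever extension the output URL carries avoids saving an MP3 with a `.wav` name. This is a sketch; the fallback to `.wav` for extensionless URLs is an assumption:

```python
from pathlib import Path
from urllib.parse import urlparse

def audio_filename(output_url: str, stem: str = "speech") -> str:
    """Derive a local filename from the TTS output URL, keeping
    the extension (.wav or .mp3) the API returned."""
    suffix = Path(urlparse(output_url).path).suffix.lower()
    if suffix not in (".wav", ".mp3"):
        suffix = ".wav"  # assumed fallback when the URL has no extension
    return f"{stem}{suffix}"
```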

Step 5: Build a Multi-Modal Pipeline

Here's where it gets interesting. Claude Code can orchestrate multiple ModelsLab API calls in sequence — generating a blog post image, a video teaser, and an audio summary in one command.

Give Claude Code this prompt and let it create generate-assets.js for you:

claude "Build a Node.js script called generate-assets.js that:
1. Takes a product name and description as CLI args
2. Calls ModelsLab text2img to generate a marketing image (1200x630, landscape)
3. Calls ModelsLab text2video to generate a 5s promo clip
4. Calls ModelsLab TTS to generate an audio summary
5. Saves all outputs to ./generated/ folder with timestamps
6. Prints a summary of URLs when done
API key is in process.env.MODELSLAB_API_KEY"

Claude Code will write the complete script — error handling, retry logic, file saving, and all. A task that would take a developer 2-3 hours to build from scratch is done in minutes.
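The orchestration skeleton Claude Code produces tends to follow this shape, sketched here in Python with the generators injected as callables (they would wrap the real text2img, text2video, and TTS calls; the names and output layout are illustrative):

```python
from datetime import datetime, timezone
from pathlib import Path

def run_pipeline(product: str, description: str, generators: dict,
                 out_dir: str = "generated") -> dict:
    """Run each generator (name -> callable taking a prompt and returning
    a URL or path) and collect outputs under timestamped names.

    Injecting the generators keeps the orchestration testable without
    making any network calls.
    """
    Path(out_dir).mkdir(exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    prompt = f"{product}: {description}"
    results = {}
    for name, generate in generators.items():
        results[name] = {
            "output": generate(prompt),
            "saved_as": f"{out_dir}/{product}-{name}-{stamp}",
        }
    return results
```

Swapping a generator for a stub is also how you dry-run the pipeline before spending credits on real generations.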

Real-World Use Cases

Developers are using this combination for:

  • Automated content pipelines — Generate matching image + audio for every blog post automatically
  • Product mockup generation — Claude Code writes the prompt, ModelsLab generates the visual, iterate until it's right
  • E-commerce imagery — Batch-generate product photos with consistent styling from a CSV of descriptions
  • Video ad creation — Let Claude Code draft scripts, generate b-roll, stitch with audio narration
  • App prototyping — Generate placeholder UI images and audio assets during development

Cost Efficiency

ModelsLab's pricing is per-generation, not per-seat. For a typical Claude Code workflow:

  • Text-to-image (SDXL): ~$0.003/image
  • Text-to-video (8s clip): ~$0.05-0.10/video depending on model
  • TTS (per 1K characters): ~$0.001

Compared to alternatives — Midjourney, Runway ML, ElevenLabs — ModelsLab is significantly cheaper for API-based workflows because you're not paying for a GUI you don't use. The developer tier starts free and scales with usage.

Troubleshooting Common Errors

NSFW Filter Rejections

All ModelsLab endpoints run a safety checker by default. If your prompt triggers it, adjust the description to be less ambiguous — focus on scene, lighting, and composition rather than subject specifics.

Async Timeout Issues

Video generation can take 30-90 seconds. Add a retry limit to your polling loop and handle 200 responses with status: "processing" gracefully. Claude Code will do this if you specify it in your prompt.

Rate Limits

The free tier has concurrency limits. If you're hitting them during a batch job, add a sleep(2) between requests. Claude Code can add this automatically if you mention rate limits in your task.
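The sleep-between-requests pattern can be wrapped in a small throttling helper like this sketch, with the sleep function injectable so the delay logic stays testable:

```python
import time

def throttled(items, delay: float = 2.0, sleep=time.sleep):
    """Yield items with a pause between them, to stay under the
    free tier's rate limits. No pause before the first item."""
    for i, item in enumerate(items):
        if i > 0:
            sleep(delay)
        yield item
```

You would then loop over `throttled(prompts)` in a batch job instead of the raw list.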

What's Next

Once you have the basics working, you can go deeper:

  • Custom models — ModelsLab hosts thousands of community fine-tunes. Use a specific LoRA for brand-consistent imagery.
  • Image-to-image — Pass Claude Code an existing image and a transformation prompt. ModelsLab's img2img API handles the rest.
  • MCP integration — ModelsLab is available as an MCP server, meaning Claude Desktop and Claude Code can call generation endpoints natively without writing curl commands.
  • Webhooks — Set up a webhook endpoint for async jobs so your pipeline gets notified automatically instead of polling.

ModelsLab's API surface is broad — over 100 endpoints across image, video, audio, and 3D generation. Claude Code is the fastest way to explore it. Start with one endpoint, get it working, and expand from there.

Ready to get started? Create your free ModelsLab API key and try your first generation in under 5 minutes.
