How to Use ModelsLab API with Claude Code: Images, Video & Audio

Adhik Joshi
7 min read · API


Claude Code is quickly becoming the go-to AI coding agent for developers who want to ship fast. It writes code, debugs errors, and integrates with your existing workflow in the terminal. But one thing it doesn't do out of the box? Generate images, videos, or audio.

That's where ModelsLab API comes in. This guide walks through exactly how to connect Claude Code to ModelsLab's generation APIs so you can build AI-powered apps that go beyond text.

What You'll Build

By the end of this guide, you'll have a working setup where Claude Code can:

  • Generate images from text prompts using ModelsLab's text-to-image API
  • Create AI videos from descriptions using text-to-video endpoints
  • Convert text to lifelike speech using ModelsLab's TTS API
  • Chain these together in automated generation pipelines

Prerequisites

  • ModelsLab API key — Sign up at modelslab.com (free tier available)
  • Claude Code — Installed via npm install -g @anthropic-ai/claude-code
  • Basic familiarity with REST APIs and JSON
  • Node.js 18+ or Python 3.9+

Step 1: Get Your ModelsLab API Key

Head to your ModelsLab dashboard, navigate to API Keys, and create a new key. Store it securely — you'll reference it as an environment variable throughout this guide:

export MODELSLAB_API_KEY="your_key_here"

ModelsLab's API is pay-per-use with no monthly minimums, which makes it ideal for Claude Code workflows where you're iterating rapidly and don't want to pay for idle capacity.

Step 2: Generate Your First Image with ModelsLab

ModelsLab's text-to-image API supports hundreds of models — from FLUX.1 to Stable Diffusion XL. Here's the base request format Claude Code will work with:

curl -X POST "https://modelslab.com/api/v6/realtime/text2img" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "'"$MODELSLAB_API_KEY"'",
    "prompt": "a photorealistic city skyline at night with neon lights reflecting on rain-wet streets",
    "negative_prompt": "blurry, low quality, distorted",
    "width": "1024",
    "height": "1024",
    "samples": "1",
    "num_inference_steps": "20",
    "safety_checker": "yes",
    "enhance_prompt": "yes",
    "guidance_scale": 7.5
  }'

The response returns a URL to your generated image within seconds. ModelsLab's realtime endpoint is designed for low-latency generation — ideal when you're inside an active coding session and don't want to wait.
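As a sketch of what Claude Code typically writes for this call, here is a standard-library Python version. It assumes the realtime endpoint returns JSON shaped like `{"status": "success", "output": ["<url>"]}`; verify the exact field names against the current ModelsLab docs:

```python
import json
import os
import urllib.request

API_URL = "https://modelslab.com/api/v6/realtime/text2img"

def extract_output_url(response: dict) -> str:
    """Pull the first generated-image URL out of a text2img response.

    Assumes a successful response looks like
    {"status": "success", "output": ["https://..."]}.
    """
    if response.get("status") != "success":
        raise RuntimeError(f"generation failed: {response}")
    outputs = response.get("output") or []
    if not outputs:
        raise RuntimeError("no output URLs in response")
    return outputs[0]

def generate_image(prompt: str, dest: str = "image.png") -> str:
    """Call the realtime text2img endpoint and download the result."""
    payload = json.dumps({
        "key": os.environ["MODELSLAB_API_KEY"],
        "prompt": prompt,
        "width": "1024",
        "height": "1024",
        "samples": "1",
    }).encode()
    req = urllib.request.Request(
        API_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    url = extract_output_url(body)
    urllib.request.urlretrieve(url, dest)  # save the image to disk
    return url
```

Keeping `extract_output_url` as a separate pure function makes the response handling easy to unit-test without hitting the network.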

Using It Inside Claude Code

Open a project in Claude Code and give it a task like this:

claude "Use the ModelsLab text2img API to generate a hero image for a SaaS landing page. 
The product is a developer dashboard. API key is in $MODELSLAB_API_KEY. 
Save the image to ./assets/hero.jpg and return the prompt you used."

Claude Code will write the API call, execute it, download the result, and tell you what prompt it generated. You can iterate from there — "make it darker," "add a developer at a laptop," etc.

Step 3: Video Generation

ModelsLab's text-to-video API lets you generate short video clips from a text description. This is useful for product demos, explainer content, and automated social media assets.

curl -X POST "https://modelslab.com/api/v6/video/text2video" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "'"$MODELSLAB_API_KEY"'",
    "model_id": "cogvideox",
    "prompt": "A developer typing on a laptop in a modern office, code scrolling on multiple screens, cinematic lighting",
    "height": 480,
    "width": 854,
    "num_frames": 49,
    "num_inference_steps": 50
  }'

Video generation is asynchronous — the API returns a fetch URL. Poll it every few seconds until status is success:

curl -X POST "https://modelslab.com/api/v6/video/fetch" \
  -H "Content-Type: application/json" \
  -d '{"key": "'"$MODELSLAB_API_KEY"'", "request_id": "YOUR_REQUEST_ID"}'

Video Prompt in Claude Code

claude "Generate a 5-second product demo video for a data dashboard app using ModelsLab 
text-to-video API. Handle the async polling loop and save to ./demo.mp4 when done. 
API key: $MODELSLAB_API_KEY"

Claude Code handles the polling logic automatically — you can point it at the fetch endpoint and tell it to retry every 5 seconds until it gets a download URL.

Step 4: Text-to-Speech

ModelsLab's TTS API produces natural-sounding voices across dozens of voice profiles. Useful for generating voiceovers, podcast intros, or accessibility audio for your apps.

curl -X POST "https://modelslab.com/api/v6/voice/text_to_audio" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "'"$MODELSLAB_API_KEY"'",
    "prompt": "Welcome to the ModelsLab developer dashboard. Your API usage this month has increased by 47 percent.",
    "language": "en",
    "speaker_id": "en-US-Neural2-J"
  }'

The response includes an output URL pointing to a WAV or MP3 file you can download directly.
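Since the format can be either WAV or MP3, a small helper that keeps whatever extension the output URL carries avoids saving an MP3 with a `.wav` name. This is a sketch; the fallback to `.wav` for extensionless URLs is an assumption:

```python
from pathlib import Path
from urllib.parse import urlparse

def audio_filename(output_url: str, stem: str = "speech") -> str:
    """Derive a local filename from the TTS output URL, keeping
    the extension (.wav or .mp3) the API returned."""
    suffix = Path(urlparse(output_url).path).suffix.lower()
    if suffix not in (".wav", ".mp3"):
        suffix = ".wav"  # assumed fallback when the URL has no extension
    return f"{stem}{suffix}"
```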

Step 5: Build a Multi-Modal Pipeline

Here's where it gets interesting. Claude Code can orchestrate multiple ModelsLab API calls in sequence — generating a blog post image, a video teaser, and an audio summary in one command.

Give Claude Code this prompt and let it create generate-assets.js for you:

claude "Build a Node.js script called generate-assets.js that:
1. Takes a product name and description as CLI args
2. Calls ModelsLab text2img to generate a marketing image (1200x630, landscape)
3. Calls ModelsLab text2video to generate a 5s promo clip
4. Calls ModelsLab TTS to generate an audio summary
5. Saves all outputs to ./generated/ folder with timestamps
6. Prints a summary of URLs when done
API key is in process.env.MODELSLAB_API_KEY"

Claude Code will write the complete script — error handling, retry logic, file saving, and all. A task that would take a developer 2-3 hours to build from scratch is done in minutes.
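The orchestration skeleton Claude Code produces tends to follow this shape, sketched here in Python with the generators injected as callables (they would wrap the real text2img, text2video, and TTS calls; the names and output layout are illustrative):

```python
from datetime import datetime, timezone
from pathlib import Path

def run_pipeline(product: str, description: str, generators: dict,
                 out_dir: str = "generated") -> dict:
    """Run each generator (name -> callable taking a prompt and returning
    a URL or path) and collect outputs under timestamped names.

    Injecting the generators keeps the orchestration testable without
    making any network calls.
    """
    Path(out_dir).mkdir(exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    prompt = f"{product}: {description}"
    results = {}
    for name, generate in generators.items():
        results[name] = {
            "output": generate(prompt),
            "saved_as": f"{out_dir}/{product}-{name}-{stamp}",
        }
    return results
```

Swapping a generator for a stub is also how you dry-run the pipeline before spending credits on real generations.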

Real-World Use Cases

Developers are using this combination for:

  • Automated content pipelines — Generate matching image + audio for every blog post automatically
  • Product mockup generation — Claude Code writes the prompt, ModelsLab generates the visual, iterate until it's right
  • E-commerce imagery — Batch-generate product photos with consistent styling from a CSV of descriptions
  • Video ad creation — Let Claude Code draft scripts, generate b-roll, stitch with audio narration
  • App prototyping — Generate placeholder UI images and audio assets during development

Cost Efficiency

ModelsLab's pricing is per-generation, not per-seat. For a typical Claude Code workflow:

  • Text-to-image (SDXL): ~$0.003/image
  • Text-to-video (8s clip): ~$0.05-0.10/video depending on model
  • TTS (per 1K characters): ~$0.001

Compared to alternatives — Midjourney, Runway ML, ElevenLabs — ModelsLab is significantly cheaper for API-based workflows because you're not paying for a GUI you don't use. The developer tier starts free and scales with usage.

Troubleshooting Common Errors

NSFW Filter Rejections

All ModelsLab endpoints run a safety checker by default. If your prompt triggers it, adjust the description to be less ambiguous — focus on scene, lighting, and composition rather than subject specifics.

Async Timeout Issues

Video generation can take 30-90 seconds. Add a retry limit to your polling loop and handle 200 responses with status: "processing" gracefully. Claude Code will do this if you specify it in your prompt.

Rate Limits

The free tier has concurrency limits. If you're hitting them during a batch job, add a sleep(2) between requests. Claude Code can add this automatically if you mention rate limits in your task.
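The sleep-between-requests pattern can be wrapped in a small throttling helper like this sketch, with the sleep function injectable so the delay logic stays testable:

```python
import time

def throttled(items, delay: float = 2.0, sleep=time.sleep):
    """Yield items with a pause between them, to stay under the
    free tier's rate limits. No pause before the first item."""
    for i, item in enumerate(items):
        if i > 0:
            sleep(delay)
        yield item
```

You would then loop over `throttled(prompts)` in a batch job instead of the raw list.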

What's Next

Once you have the basics working, you can go deeper:

  • Custom models — ModelsLab hosts thousands of community fine-tunes. Use a specific LoRA for brand-consistent imagery.
  • Image-to-image — Pass Claude Code an existing image and a transformation prompt. ModelsLab's img2img API handles the rest.
  • MCP integration — ModelsLab is available as an MCP server, meaning Claude Desktop and Claude Code can call generation endpoints natively without writing curl commands.
  • Webhooks — Set up a webhook endpoint for async jobs so your pipeline gets notified automatically instead of polling.

ModelsLab's API surface is broad — over 100 endpoints across image, video, audio, and 3D generation. Claude Code is the fastest way to explore it. Start with one endpoint, get it working, and expand from there.

Ready to get started? Create your free ModelsLab API key and try your first generation in under 5 minutes.
