If you're calling AI APIs directly and one provider goes down, your app goes down with it. Portkey AI Gateway solves this problem by acting as a universal routing layer between your application and your AI providers — handling fallbacks, load balancing, retries, and observability in a single place.
ModelsLab is now available as a Portkey-compatible provider, which means you can route image generation, video generation, and LLM calls through Portkey's gateway using ModelsLab's API. This guide covers how to set it up.
What Portkey AI Gateway Does
Portkey (portkey.ai, 7k+ GitHub stars) is an open-source AI gateway that gives you:
- Automatic fallbacks: If your primary provider fails, route to a backup automatically
- Load balancing: Distribute traffic across multiple providers by weight
- Retries with exponential backoff: Handle transient 429s and 5xxs without manual retry logic
- Request caching: Cache identical requests to cut costs and latency
- Full observability: Log every request with latency, cost, and status to the Portkey dashboard
- Virtual keys: Manage provider API keys centrally — your app code never stores secrets
For developers building on top of AI APIs, Portkey handles the reliability engineering that would otherwise require significant custom infrastructure.
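Retries are a good example of how this works: instead of writing retry loops in application code, you declare the policy once in a gateway config. A minimal sketch, based on Portkey's config schema — the specific status-code list here is an illustrative choice, not a required default:

```python
# A Portkey-style config fragment enabling automatic retries with
# backoff on transient errors. Passed to the client via `config=`.
retry_config = {
    "retry": {
        "attempts": 3,                            # retry up to 3 times
        "on_status_codes": [429, 500, 502, 503],  # transient failures only
    }
}
```

Every request made through a client carrying this config gets the retry behavior with no changes to call sites.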
Prerequisites
- A Portkey account (free tier available)
- A ModelsLab API key
- Python 3.8+ or Node.js 18+
Step 1: Add ModelsLab as a Virtual Key in Portkey
Virtual keys in Portkey store your provider API keys centrally. Your application code uses a Portkey virtual key reference — never the raw API key itself.
- Go to the Portkey dashboard → Virtual Keys
- Click Add Virtual Key
- Select ModelsLab from the provider list
- Enter your ModelsLab API key
- Give it a name like `modelslab-prod`
- Copy the returned virtual key ID (format: `pk-xxxxx`)
Step 2: Install the Portkey SDK
```shell
# Python
pip install portkey-ai

# Node.js
npm install portkey-ai
```
Step 3: Make Your First ModelsLab Request via Portkey
Portkey wraps your API calls with a standard interface. For ModelsLab image generation:
```python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="pk-your-modelslab-virtual-key"
)

response = portkey.images.generate(
    model="flux",
    prompt="A developer at a desk surrounded by floating code snippets, cinematic lighting",
    n=1,
    size="1024x1024"
)

print(response.data[0].url)
```
For Node.js:
```javascript
import Portkey from 'portkey-ai';

const client = new Portkey({
  apiKey: 'YOUR_PORTKEY_API_KEY',
  virtualKey: 'pk-your-modelslab-virtual-key'
});

const response = await client.images.generate({
  model: 'flux',
  prompt: 'A developer at a desk surrounded by floating code snippets, cinematic lighting',
  n: 1,
  size: '1024x1024'
});

console.log(response.data[0].url);
```
Step 4: Add a Fallback Provider
The real power of Portkey comes from its routing configs. A fallback config automatically switches to a backup provider if the primary fails:
```python
from portkey_ai import Portkey

# Fallback: ModelsLab primary, alternative secondary.
# In the Python SDK the config is passed as a plain dict.
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "virtual_key": "pk-modelslab-prod",
            "override_params": {"model": "flux"}
        },
        {
            "virtual_key": "pk-other-provider",
            "override_params": {"model": "dall-e-3"}
        }
    ]
}

portkey = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    config=config
)

# This call automatically falls back to the secondary provider
# if ModelsLab returns a 5xx or times out
response = portkey.images.generate(
    prompt="Photorealistic mountain landscape at sunset",
    n=1
)
```
Step 5: Load Balance Across API Keys
If you have multiple ModelsLab API keys (for rate limit distribution or cost tracking), Portkey can load balance across them by weight:
```python
from portkey_ai import Portkey

config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "pk-modelslab-key-1", "weight": 0.6},
        {"virtual_key": "pk-modelslab-key-2", "weight": 0.4}
    ]
}

portkey = Portkey(api_key="YOUR_PORTKEY_API_KEY", config=config)
response = portkey.images.generate(prompt="...", n=1)
```
60% of traffic routes to key-1, 40% to key-2. Portkey handles the distribution automatically on every request.
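Conceptually, weighted selection works like the sketch below. This is an illustration of the technique only — Portkey does this server-side, and this is not its actual implementation:

```python
import random

# Weighted random choice over targets, mirroring how a loadbalance
# strategy distributes traffic by weight (weights here sum to 1.0).
TARGETS = [("pk-modelslab-key-1", 0.6), ("pk-modelslab-key-2", 0.4)]

def pick_target(targets, rng):
    """Return a target key chosen in proportion to its weight."""
    r = rng.random()
    cumulative = 0.0
    for key, weight in targets:
        cumulative += weight
        if r < cumulative:
            return key
    return targets[-1][0]  # guard against floating-point drift

rng = random.Random(0)
counts = {key: 0 for key, _ in TARGETS}
for _ in range(10_000):
    counts[pick_target(TARGETS, rng)] += 1
# counts ends up roughly 60/40 between the two keys
```

Because selection is per-request, short bursts can deviate from the nominal split; the weights hold in aggregate.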
Step 6: Enable Request Caching
For identical prompts — common in development, testing, or templated content pipelines — caching eliminates redundant API calls entirely:
```python
from portkey_ai import Portkey

# Caching is configured in the routing config alongside the other
# gateway options.
portkey = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="pk-modelslab-prod",
    config={
        "cache": {
            "mode": "semantic",  # or "simple" for exact-match only
            "max_age": 86400     # cache for 24 hours
        }
    }
)

# First call hits ModelsLab and caches the result
r1 = portkey.images.generate(prompt="sunset over Tokyo", n=1)

# An identical call within the cache window returns the cached
# result — no provider call, no added provider cost
r2 = portkey.images.generate(prompt="sunset over Tokyo", n=1)
```
Semantic caching matches semantically similar prompts, not just byte-identical ones. Useful when prompt templates have minor variations.
Step 7: View Request Logs
Every request made through Portkey is logged automatically. In the Portkey dashboard → Logs you can see:
- Request timestamp, prompt, and response
- Latency per request and per provider
- Which virtual key and config handled each call
- Cache hit/miss status
- Cost per request (where provider pricing is known)
This replaces custom logging infrastructure entirely: you get per-request observability without building and maintaining your own pipeline.
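Logs become most useful when requests carry identifying metadata so entries can be filtered by user, feature, or environment. The keys below and the `with_options` call pattern are assumptions drawn from Portkey's request-options documentation — verify them against your SDK version:

```python
# Illustrative metadata to attach per request. The `_user` key is a
# hypothetical convention here; any keys you choose appear in the logs.
request_metadata = {
    "_user": "user-1234",     # end user who triggered the request
    "feature": "avatar-gen",  # product feature making the call
    "env": "production",
}

# Usage sketch (requires a configured client; not executed here):
# portkey.with_options(metadata=request_metadata).images.generate(
#     model="flux", prompt="...", n=1
# )
```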
Self-Hosted Portkey Gateway (Optional)
Portkey's core gateway is open-source and can be self-hosted if you need requests to stay on your infrastructure:
```shell
# Docker
docker pull portkeyai/gateway:latest
docker run -p 8787:8787 portkeyai/gateway:latest

# Your API calls then point to localhost:8787 instead of api.portkey.ai
```
Self-hosting is useful for compliance requirements (HIPAA, SOC 2) where request logs must stay within your environment.
Why ModelsLab + Portkey for Production AI Apps
ModelsLab provides access to Flux, Stable Diffusion, video generation, audio generation, and LLM APIs from a single key. Portkey adds the reliability and observability layer on top. Together they give you:
- High-availability image and video generation with automatic failover
- Cost tracking across all AI provider calls in a unified dashboard
- Multi-key rate limit management without custom retry logic
- Semantic request caching to reduce spend on repeated generations
For teams running AI features in production, the combination reduces engineering time spent on infrastructure and gives you the reliability properties that direct provider calls can't deliver on their own.
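The routing features above compose into a single config object. A sketch combining fallback, retries, and caching, using the same schema as the earlier steps — virtual key names are placeholders:

```python
# One config covering failover, transient-error retries, and
# exact-match caching. Pass it as `config=` when constructing the
# Portkey client.
production_config = {
    "strategy": {"mode": "fallback"},
    "retry": {"attempts": 3, "on_status_codes": [429, 500, 502, 503]},
    "cache": {"mode": "simple", "max_age": 3600},
    "targets": [
        {"virtual_key": "pk-modelslab-prod",
         "override_params": {"model": "flux"}},
        {"virtual_key": "pk-backup-provider"},
    ],
}
```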
Get Your ModelsLab API Key
ModelsLab provides access to 200+ AI models — image generation, video generation, audio, and LLMs — from a single API. Integrates directly with Portkey AI Gateway for production reliability.
Get Free API Key →