Available now on ModelsLab · Language Model

Google: Gemma 3n 2B (free)

Free Multimodal Gemma Power

Run Efficiently Anywhere

E2B Efficiency

2B Effective Parameters

The MatFormer architecture and per-layer embedding (PLE) caching reduce the memory footprint of 6B raw parameters to that of a 2B-scale model.

Multimodal Input

Text Images Audio

Process text, image, and audio inputs within an 8,192-token context window.

Zero Cost

Free API Access

Use Google: Gemma 3n 2B (free) via OpenRouter at $0 per million tokens for both input and output.

Examples

See what Google: Gemma 3n 2B (free) can create

Copy any prompt below and try it yourself in the playground.

Cityscape Analysis

Analyze this urban skyline image: describe architecture styles, estimate building heights, suggest sustainable improvements.

Nature Sound ID

Identify species from this forest audio clip: birdsong patterns, insect noises, wind through trees.

Tech Diagram Explain

Explain this circuit board image step-by-step: components, connections, potential failure points.

Product Mockup

Generate marketing copy for this smartphone render: highlight camera, screen, battery features.

For Developers

A few lines of code.
Gemma 3n. Zero lines extra.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
import requests

# Fill in your ModelsLab API key and the model ID before running.
response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "",           # your prompt text
        "model_id": "",         # the model ID to run
    },
)
print(response.json())

FAQ

Common questions about Google: Gemma 3n 2B (free)

Read the docs

What is Google: Gemma 3n 2B (free)?

An instruction-tuned multimodal model from Google DeepMind built on the E2B architecture. It handles text, image, and audio inputs on mobile devices, and is free via the OpenRouter API.

How do I access it via API?

Use the OpenRouter endpoint with model ID google/gemma-3n-e2b-it:free. It exposes OpenAI-compatible chat completions and costs $0 per token.
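As a minimal sketch, an OpenAI-compatible chat completions request to that model ID looks like the following. The endpoint URL and YOUR_API_KEY are placeholders for illustration; only the request construction runs here, with the actual network call left commented out.

```python
# Assumed OpenRouter chat completions endpoint (placeholder for illustration).
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(prompt, api_key):
    """Assemble headers and JSON body for an OpenAI-compatible chat call."""
    headers = {"Authorization": f"Bearer {api_key}"}
    body = {
        "model": "google/gemma-3n-e2b-it:free",  # model ID from the answer above
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, body

headers, body = build_chat_request("Summarize MatFormer in one sentence.", "YOUR_API_KEY")
# To actually send it (needs a real key and network access):
#   import requests
#   print(requests.post(OPENROUTER_URL, headers=headers, json=body).json())
```

The same body shape works with any OpenAI-compatible client; only the base URL and key change.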

What is the context length?

It supports 8,192 tokens, enough for extended conversations or document analysis. Generation parameters include temperature, top_p, and max_tokens.
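As a rough sketch, those parameters slot into the request body like this. The values are illustrative defaults, not recommendations, and the prompt text is a placeholder:

```python
# Illustrative generation settings; tune per task.
generation_params = {
    "temperature": 0.7,  # sampling randomness: lower is more deterministic
    "top_p": 0.9,        # nucleus sampling: keep tokens covering the top 90% of probability
    "max_tokens": 1024,  # cap on the reply length
}

CONTEXT_WINDOW = 8192  # prompt + reply must fit within this many tokens

body = {
    "model": "google/gemma-3n-e2b-it:free",
    "messages": [{"role": "user", "content": "Summarize this report..."}],
    **generation_params,
}

# Tokens left for the prompt once the reply budget is reserved.
prompt_budget = CONTEXT_WINDOW - generation_params["max_tokens"]
```

Reserving a reply budget up front avoids truncated completions on long documents.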

Which languages does it support?

It is trained on 140+ languages, with strong benchmark results such as 53.1% accuracy on MGSM. Italian is among the languages it is optimized for.

How does E2B compare to E4B?

E2B uses 50% less memory than E4B and runs inference 40% faster on mid-range devices. The weights are open for commercial use.

Can it run on-device?

Yes. It is designed for edge devices with the LiteRT-LM runtime and NPU acceleration, and Hugging Face transformers can be used for deployment.

Ready to create?

Start generating with Google: Gemma 3n 2B (free) on ModelsLab.