Available now on ModelsLab · Language Model

LiquidAI: LFM2.5-1.2B-Instruct (free)

Edge AI. No cloud costs.

Compact Power. Enterprise Speed.

Lightning-Fast Inference

239 tok/s on CPU

Blazing decode speeds on standard hardware with minimal latency overhead.

Minimal Footprint

Runs Under 1GB

Deploy on mobile, IoT, and vehicles without memory constraints or infrastructure.

Production-Ready

Tool Use Built-In

Function calling and multi-step reasoning out of the box for agentic workflows.
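As a sketch of what an agentic request could look like, here is a tool definition in the widely used OpenAI-style schema. This is an illustration only: the `get_account` function, the `tools` field, and the payload shape are assumptions, not the documented ModelsLab API.

```python
import json

# Hypothetical tool definition (OpenAI-style schema, for illustration only).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_account",  # illustrative name, not a real endpoint
            "description": "Look up a customer account by email.",
            "parameters": {
                "type": "object",
                "properties": {"email": {"type": "string"}},
                "required": ["email"],
            },
        },
    }
]

# Assumed request shape; check the ModelsLab docs for the actual field names.
payload = {
    "model_id": "",  # fill in from the model page
    "prompt": "How do I reset my password?",
    "tools": tools,
}
print(json.dumps(payload, indent=2))
```

The model decides whether to call `get_account` and with which arguments; your application executes the call and feeds the result back for the next step.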

Examples

See what LiquidAI: LFM2.5-1.2B-Instruct (free) can create

Copy any prompt below and try it yourself in the playground.

Customer Support Bot

You are a helpful customer support assistant. A user asks: 'How do I reset my password?' Provide a clear, step-by-step response with tool calls to retrieve account information if needed.

Math Problem Solver

Solve this math problem step-by-step: A train travels 120 miles in 2.5 hours. Calculate the average speed and determine how long it takes to travel 300 miles at this rate.
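The expected answer is easy to verify with a few lines of plain arithmetic (not tied to any API):

```python
# Worked check for the train problem above.
distance_mi = 120
time_hr = 2.5
speed = distance_mi / time_hr  # average speed: 48.0 mph
time_for_300 = 300 / speed     # time for 300 miles: 6.25 hours
print(speed, time_for_300)
```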

Code Generation

Write a Python function that takes a list of numbers and returns the sum of all even numbers. Include error handling for non-numeric inputs.
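For reference, one solution the model might produce looks like this (`sum_even` is an illustrative name; booleans are rejected since Python treats them as integers):

```python
def sum_even(numbers):
    """Return the sum of all even numbers in the list.

    Raises TypeError for non-numeric elements (booleans excluded).
    """
    total = 0
    for n in numbers:
        if isinstance(n, bool) or not isinstance(n, (int, float)):
            raise TypeError(f"non-numeric input: {n!r}")
        if n % 2 == 0:
            total += n
    return total

print(sum_even([1, 2, 3, 4]))  # 6
```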

Multi-Language Chat

Respond to this user in their preferred language: 'Bonjour, comment puis-je optimiser mon application pour les appareils mobiles?' (French: 'Hello, how can I optimize my application for mobile devices?') Provide technical recommendations.

For Developers

A 1.2B model. A few lines of code.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
import requests

# Replace the placeholders with your API key and the model ID
# shown on the model page.
response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",
        "prompt": "",    # your prompt
        "model_id": "",  # the model ID
    },
)
print(response.json())

FAQ

Common questions about LiquidAI: LFM2.5-1.2B-Instruct (free)

Read the docs

What makes LFM2.5-1.2B-Instruct perform so well at its size?

It delivers best-in-class performance at 1.2B parameters through extended pretraining (28T tokens) and large-scale reinforcement learning. It rivals much larger models while running entirely on-device in under 1GB of memory.

Does it handle instruction following and tool calling?

Yes. The model excels at instruction following and tool calling out of the box. For advanced reasoning tasks, consider LFM2.5-1.2B-Thinking, which adds explicit reasoning traces and improves math and planning capabilities.

Which languages does it support?

LFM2.5-1.2B-Instruct supports eight languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.

How fast is it on mobile hardware?

On mobile NPUs like Qualcomm Snapdragon Gen4, the model achieves 82 tok/s decode speed. On standard mobile CPUs, it reaches 70 tok/s with llama.cpp quantization.

What are the context length and memory requirements?

The model supports 32,768 tokens of context and runs in under 1GB of memory on most devices, making it ideal for local deployment without cloud infrastructure.

Which inference frameworks are supported?

The model has day-one support for llama.cpp, MLX, and vLLM. Additional optimizations are available through partners like AMD, Qualcomm, and Nexa AI for NPU deployment.

Ready to create?

Start generating with LiquidAI: LFM2.5-1.2B-Instruct (free) on ModelsLab.