Available now on ModelsLab · Language Model

LiquidAI: LFM2.5-1.2B-Instruct (free)

Edge AI. No cloud costs.

Compact Power. Enterprise Speed.

Lightning-Fast Inference

239 tok/s on CPU

Blazing decode speeds on standard hardware with minimal latency overhead.

Minimal Footprint

Runs Under 1GB

Deploy on mobile, IoT, and vehicles without memory constraints or infrastructure.

Production-Ready

Tool Use Built-In

Function calling and multi-step reasoning out of the box for agentic workflows.
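As a sketch of what an agentic request could look like, here is a tool definition in the widely used OpenAI-style schema. This is an illustration only: the `get_account` function, the `tools` field, and the payload shape are assumptions, not the documented ModelsLab API.

```python
import json

# Hypothetical tool definition (OpenAI-style schema, for illustration only).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_account",  # illustrative name, not a real endpoint
            "description": "Look up a customer account by email.",
            "parameters": {
                "type": "object",
                "properties": {"email": {"type": "string"}},
                "required": ["email"],
            },
        },
    }
]

# Assumed request shape; check the ModelsLab docs for the actual field names.
payload = {
    "model_id": "",  # fill in from the model page
    "prompt": "How do I reset my password?",
    "tools": tools,
}
print(json.dumps(payload, indent=2))
```

The model decides whether to call `get_account` and with which arguments; your application executes the call and feeds the result back for the next step.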

Examples

See what LiquidAI: LFM2.5-1.2B-Instruct (free) can create

Copy any prompt below and try it yourself in the playground.

Customer Support Bot

You are a helpful customer support assistant. A user asks: 'How do I reset my password?' Provide a clear, step-by-step response with tool calls to retrieve account information if needed.

Math Problem Solver

Solve this math problem step-by-step: A train travels 120 miles in 2.5 hours. Calculate the average speed and determine how long it takes to travel 300 miles at this rate.
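The expected answer is easy to verify with a few lines of plain arithmetic (not tied to any API):

```python
# Worked check for the train problem above.
distance_mi = 120
time_hr = 2.5
speed = distance_mi / time_hr  # average speed: 48.0 mph
time_for_300 = 300 / speed     # time for 300 miles: 6.25 hours
print(speed, time_for_300)
```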

Code Generation

Write a Python function that takes a list of numbers and returns the sum of all even numbers. Include error handling for non-numeric inputs.
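For reference, one solution the model might produce looks like this (`sum_even` is an illustrative name; booleans are rejected since Python treats them as integers):

```python
def sum_even(numbers):
    """Return the sum of all even numbers in the list.

    Raises TypeError for non-numeric elements (booleans excluded).
    """
    total = 0
    for n in numbers:
        if isinstance(n, bool) or not isinstance(n, (int, float)):
            raise TypeError(f"non-numeric input: {n!r}")
        if n % 2 == 0:
            total += n
    return total

print(sum_even([1, 2, 3, 4]))  # 6
```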

Multi-Language Chat

Respond to this user in their preferred language: 'Bonjour, comment puis-je optimiser mon application pour les appareils mobiles?' (French: 'Hello, how can I optimize my application for mobile devices?') Provide technical recommendations.

For Developers

A 1.2B model. A few lines of code.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
import requests

# Replace the placeholders with your API key and the model ID
# shown on the model page.
response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",
        "prompt": "",    # your prompt
        "model_id": "",  # the model ID
    },
)
print(response.json())

FAQ

Common questions about LiquidAI: LFM2.5-1.2B-Instruct (free)

Read the docs

What makes LFM2.5-1.2B-Instruct perform so well at its size?

It delivers best-in-class performance at 1.2B parameters through extended pretraining (28T tokens) and large-scale reinforcement learning. It rivals much larger models while running entirely on-device in under 1GB of memory.

Does it handle instruction following and tool calling?

Yes. The model excels at instruction following and tool calling out of the box. For advanced reasoning tasks, consider LFM2.5-1.2B-Thinking, which adds explicit reasoning traces and improves math and planning capabilities.

Which languages does it support?

LFM2.5-1.2B-Instruct supports eight languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.

How fast is it on mobile hardware?

On mobile NPUs like Qualcomm Snapdragon Gen4, the model achieves 82 tok/s decode speed. On standard mobile CPUs, it reaches 70 tok/s with llama.cpp quantization.

What are the context length and memory requirements?

The model supports 32,768 tokens of context and runs in under 1GB of memory on most devices, making it ideal for local deployment without cloud infrastructure.

Which inference frameworks are supported?

The model has day-one support for llama.cpp, MLX, and vLLM. Additional optimizations are available through partners like AMD, Qualcomm, and Nexa AI for NPU deployment.

Ready to create?

Start generating with LiquidAI: LFM2.5-1.2B-Instruct (free) on ModelsLab.