GLM OCR
GLM OCR Extracts Everything
Parse Documents Accurately
Tops OmniDocBench
Scores 94.62 on OmniDocBench V1.5 across text, tables, and formulas.
0.9B Parameters
Runs on Edge Devices
Deploys via vLLM and SGLang; low latency through the GLM OCR API.
Multimodal Input
Handles PDFs and Images
Processes JPG, PNG, and PDFs up to 100 pages while preserving layouts.
Examples
See what GLM OCR can create
Copy any prompt below and try it yourself in the playground.
Invoice Extraction
“Extract all text, tables, and key fields like date, amount, vendor from this invoice image in structured JSON format.”
Table Recognition
“Parse the complex table in this document image, output as Markdown preserving rows, columns, and formulas.”
Code Documentation
“Transcribe the code snippets and surrounding text from this screenshot, maintaining structure and syntax.”
Contract Analysis
“Identify sections, clauses, and tables in this contract PDF page, output in semantic Markdown.”
For Developers
A few lines of code.
OCR via GLM OCR API
ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.
- Serverless: scales to zero, scales to millions
- Pay per token, no minimums
- Python and JavaScript SDKs, plus REST API
import requests

response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",
        "prompt": "",
        "model_id": "",
    },
)
print(response.json())
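As a concrete starting point, the request body from the snippet above can be assembled with one of the example prompts from the gallery. This is a minimal sketch: the `build_ocr_payload` helper and the `"glm-ocr"` model ID are illustrative assumptions, not part of the documented API — check the ModelsLab models list for the actual identifier.

```python
import json

# Endpoint from the snippet above.
API_URL = "https://modelslab.com/api/v7/llm/chat/completions"

def build_ocr_payload(api_key: str, prompt: str, model_id: str) -> dict:
    """Assemble the JSON body shown in the snippet above.

    Hypothetical helper for illustration; the field names (key, prompt,
    model_id) come from the example request.
    """
    return {
        "key": api_key,
        "prompt": prompt,
        "model_id": model_id,
    }

# Reuse the invoice-extraction prompt from the Examples section.
payload = build_ocr_payload(
    api_key="YOUR_API_KEY",
    prompt=(
        "Extract all text, tables, and key fields like date, amount, "
        "vendor from this invoice image in structured JSON format."
    ),
    model_id="glm-ocr",  # assumed model ID; verify against the models list
)
print(json.dumps(payload, indent=2))
```

Pass `payload` as the `json=` argument to `requests.post(API_URL, ...)` exactly as in the snippet above.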