Available now on ModelsLab · Language Model

NousResearch: Hermes 2 Pro - Llama-3 8B

Function Calling Perfected

Master Tools and JSON

Core Upgrade

Retrained on OpenHermes 2.5

Trained on a cleaned version of the OpenHermes 2.5 dataset plus in-house function calling data for more reliable outputs.

90% Accuracy

Excels at Function Calls

Scores 90% on the Fireworks.AI function calling eval; supports multi-turn function calling in ChatML with special tokens such as <tools>.

JSON Mastery

Structured Outputs

Achieves 84% on the structured JSON output eval; tool-call tags are single tokens, which makes them easy for an agent to detect and parse while the response is still streaming.
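Parsing tool calls during streaming can be sketched as follows. This is a minimal illustration, not ModelsLab's or NousResearch's implementation: the function name iter_tool_calls is hypothetical, and it assumes the model emits bare <tool_call> and </tool_call> tags around a JSON payload. Text chunks are buffered so that a tag split across two chunks is still detected.

```python
from typing import Iterable, Iterator

# Assumed tag spelling; adjust if the model's output uses a different layout.
OPEN_TAG = "<tool_call>"
CLOSE_TAG = "</tool_call>"


def iter_tool_calls(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the text between each complete tag pair as chunks arrive."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            start = buffer.find(OPEN_TAG)
            if start == -1:
                # No opening tag yet: keep only a tail long enough to hold
                # a tag that is split across two chunks.
                buffer = buffer[-(len(OPEN_TAG) - 1):]
                break
            end = buffer.find(CLOSE_TAG, start + len(OPEN_TAG))
            if end == -1:
                break  # the call is still streaming in
            yield buffer[start + len(OPEN_TAG):end].strip()
            buffer = buffer[end + len(CLOSE_TAG):]


if __name__ == "__main__":
    stream = ['Sure. <tool_', 'call>{"name": "get_weather"}</tool', '_call> done']
    for payload in iter_tool_calls(stream):
        print(payload)
```

Because each tag is a single token, a real token stream would not split a tag in two; the buffering only matters when the transport re-chunks the text.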

Examples

See what NousResearch: Hermes 2 Pro - Llama-3 8B can create

Copy any prompt below and try it yourself in the playground.

JSON Weather Query

<|im_start|>system
You are a helpful assistant with access to tools. Use JSON for structured responses.<|im_end|>
<|im_start|>user
Get current weather in Tokyo using tool.<|im_end|>
<|im_start|>assistant
<tools><tool_call id="1"><name>get_weather</name><parameters>{"city":"Tokyo"}</parameters></tool_call></tools><|im_end|>

Multi-Turn Calc

<|im_start|>system
Handle math via calculator tool. Respond in JSON.<|im_end|>
<|im_start|>user
Calculate 25 * 4 + 17.<|im_end|>
<|im_start|>assistant
<tool_call id="calc"><parameters>{"expr":"25*4+17"}</parameters></tool_call><|im_end|>

Code Gen Task

<|im_start|>system
Generate Python code. Output as JSON with reasoning.<|im_end|>
<|im_start|>user
Write function to sort list ascending.<|im_end|>
<|im_start|>assistant
<json>{"reasoning":"Use sorted() builtin.", "code":"def sort_list(lst): return sorted(lst)"}</json><|im_end|>

Agent Workflow

<|im_start|>system
Parse tools for agent tasks. Use <tool_response> after calls.<|im_end|>
<|im_start|>user
Book flight to NYC on tool.<|im_end|>
<|im_start|>assistant
<tools><tool_call id="book_flight"><name>search_flights</name><parameters>{"dest":"NYC"}</parameters></tool_call></tools><|im_end|>
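The assistant turns in the example prompts above wrap each call in a <tool_call> element with an id attribute, an optional <name>, and JSON inside <parameters>. A rough sketch of decoding that layout into Python dictionaries is shown here. The function name parse_tool_calls and the output keys are hypothetical, and the code assumes the tag layout exactly as printed on this page; the model's actual output may be shaped differently.

```python
import json
import xml.etree.ElementTree as ET
from typing import Any, Dict, List


def parse_tool_calls(assistant_text: str) -> List[Dict[str, Any]]:
    """Decode <tool_call> elements (as laid out in the examples) into dicts."""
    # The text must be one well-formed element, e.g. <tools>...</tools>
    # or a single <tool_call>...</tool_call>.
    root = ET.fromstring(assistant_text)
    calls = []
    # iter() also visits the root, so a bare <tool_call> works too.
    for node in root.iter("tool_call"):
        raw_params = node.findtext("parameters") or "{}"
        calls.append(
            {
                "id": node.get("id"),
                "name": node.findtext("name"),  # None when <name> is absent
                "arguments": json.loads(raw_params),
            }
        )
    return calls


if __name__ == "__main__":
    text = (
        '<tools><tool_call id="1"><name>get_weather</name>'
        '<parameters>{"city":"Tokyo"}</parameters></tool_call></tools>'
    )
    print(parse_tool_calls(text))
```

Strip the trailing <|im_end|> marker before parsing, since it is not part of the element.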

For Developers

A few lines of code.
JSON Tools. ChatML Format.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
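
import requests

# Endpoint and field names are as published on this page; fill in the
# empty placeholders before running.
response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "",           # the prompt text to send
        "model_id": "",         # the identifier of the model to call
    },
    timeout=60,  # avoid hanging indefinitely on a stalled connection
)
response.raise_for_status()  # raise on HTTP error status codes
print(response.json())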

FAQ

Common questions about NousResearch: Hermes 2 Pro - Llama-3 8B

Hermes 2 Pro is an 8B Llama-3 fine-tune by NousResearch. It upgrades Hermes 2 with function calling and JSON datasets. It performs well on general tasks and reaches 90% function calling accuracy.

Uses ChatML with special tags such as <tools> and <tool_call>, each encoded as a single token so calls can be parsed while streaming. Supports multi-turn function calling driven by the system prompt. Scores 90% on the Fireworks.AI eval.

Supports 8192 input tokens and up to 8192 output tokens. Uses the Llama 3 tokenizer with the ChatML prompt format. Ideal for structured agent workflows.
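A small guard for these limits might look like the sketch below. It treats the input and output limits as independent, as stated above; whether the two share a single context window is not stated on this page. The function name clamp_output_tokens is hypothetical, and the prompt token count is assumed to come from your own tokenizer.

```python
MAX_INPUT_TOKENS = 8192   # input limit stated on this page
MAX_OUTPUT_TOKENS = 8192  # output limit stated on this page


def clamp_output_tokens(prompt_tokens: int, requested_output_tokens: int) -> int:
    """Reject oversized prompts and cap the requested output length."""
    if prompt_tokens > MAX_INPUT_TOKENS:
        raise ValueError(
            f"prompt has {prompt_tokens} tokens; limit is {MAX_INPUT_TOKENS}"
        )
    return min(requested_output_tokens, MAX_OUTPUT_TOKENS)


if __name__ == "__main__":
    print(clamp_output_tokens(prompt_tokens=1200, requested_output_tokens=10000))
```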

Yes, it scores 84% on the structured JSON eval. It was trained on an in-house dataset for reliable parsing and adds tokens such as <tool_response> for agents.
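Returning a tool's result to the model could be sketched like this. The helper name wrap_tool_response and the "name"/"content" payload keys are assumptions for illustration; only the <tool_response> tag itself comes from this page.

```python
import json
from typing import Any


def wrap_tool_response(tool_name: str, result: Any) -> str:
    """Wrap a JSON-serializable tool result in <tool_response> tags."""
    # The payload shape (name + content) is an assumed convention.
    payload = json.dumps({"name": tool_name, "content": result})
    return f"<tool_response>\n{payload}\n</tool_response>"


if __name__ == "__main__":
    print(wrap_tool_response("get_weather", {"temp_c": 21}))
```

The wrapped string would then be sent back as the next turn of the conversation so the model can compose its final answer.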

Hosted on Hugging Face and OpenRouter. GGUF quantized versions are available for llama.cpp. Check Novita or similar providers for API access.

Use ChatML with roles and special tokens. System prompts guide behavior. ChatML is more structured than the Alpaca format for multi-turn dialogue.
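As an illustration of the ChatML layout, the sketch below renders a list of role/content messages into a single prompt string and leaves the assistant turn open for the model to complete. The function name format_chatml is hypothetical; the <|im_start|> and <|im_end|> markers are the ones used in the example prompts on this page.

```python
from typing import Dict, List


def format_chatml(messages: List[Dict[str, str]]) -> str:
    """Render messages as ChatML and open a new assistant turn."""
    parts = []
    for message in messages:
        # Each turn is: <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave the assistant turn open so the model writes the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = format_chatml(
        [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Calculate 25 * 4 + 17."},
        ]
    )
    print(prompt)
```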

Ready to create?

Start generating with NousResearch: Hermes 2 Pro - Llama-3 8B on ModelsLab.