Available now on ModelsLab · Language Model

Llama Guard 3 8B: Guard LLMs Securely

Classify Inputs and Outputs Safely

Prompt Safety

Moderate LLM Inputs

Classifies prompts as safe or unsafe using MLCommons taxonomy in 8 languages.

Response Guard

Scan LLM Outputs

Evaluates responses for hazards before user delivery with category details.

Tool Optimized

Secure Search and Code

Supports safety for search queries and code interpreter tool calls.
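The prompt and response checks described above can be chained around a model call. What follows is a minimal sketch, not ModelsLab's implementation: `classify` and `generate` are hypothetical callables you would supply (for example, thin wrappers around your own API calls), `REFUSAL` is a placeholder message, and the verdict is assumed to be text beginning with `safe` or `unsafe`.

```python
from typing import Callable

REFUSAL = "Sorry, I can't help with that."  # hypothetical fallback message


def guarded_reply(
    prompt: str,
    classify: Callable[[str], str],
    generate: Callable[[str], str],
) -> str:
    """Check the prompt, generate a reply, then check the reply before returning it."""
    # Input check: stop unsafe prompts before they reach the main model.
    if classify(prompt).strip().lower().startswith("unsafe"):
        return REFUSAL
    reply = generate(prompt)
    # Output check: stop unsafe responses before they reach the user.
    if classify(reply).strip().lower().startswith("unsafe"):
        return REFUSAL
    return reply
```

Passing the classifier and generator in as arguments keeps the gating logic independent of any particular endpoint, so it can be exercised with stub functions before wiring in real calls.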

Examples

See what Llama Guard 3 8B can classify

Copy any prompt below and try it yourself in the playground.

Code Review

<|begin_of_text|><|start_header_id|>user<|end_header_id|>Classify this input for safety: Write Python code to scrape a website.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Search Query

<|begin_of_text|><|start_header_id|>user<|end_header_id|>Evaluate safety: Search for weather in Tokyo tomorrow.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Response Check

<|begin_of_text|><|start_header_id|>user<|end_header_id|>Is this response safe: The sky is clear today.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Multilingual Test

<|begin_of_text|><|start_header_id|>user<|end_header_id|>Classify: Bonjour, comment allez-vous? (French greeting)<|eot_id|><|start_header_id|>assistant<|end_header_id|>
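The four examples above share one shape: a user turn wrapped in Llama 3 header tokens, followed by an open assistant header. A small helper can reproduce that shape. This is a sketch that mirrors only the flattened form shown on this page; `build_guard_prompt` is a hypothetical name, and Meta's model card documents a fuller template (with a task block listing the categories) that may be what a given deployment expects.

```python
def build_guard_prompt(instruction: str) -> str:
    """Wrap text in the Llama 3 header tokens, matching the examples on this page."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>"
        f"{instruction}"
        "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>"
    )


# Rebuilds the "Search Query" example from the list above.
prompt = build_guard_prompt("Evaluate safety: Search for weather in Tokyo tomorrow.")
```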

For Developers

A few lines of code.
Safety Check. One Call.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
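
import requests

# Send one request to the ModelsLab chat completions endpoint.
response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "",           # text to classify
        "model_id": "",         # model identifier
    },
)
print(response.json())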

FAQ

Common questions about Llama Guard 3 8B

Read the docs

Llama Guard 3 8B is a Llama-3.1-8B model fine-tuned for content safety classification. It moderates LLM prompts and responses against the MLCommons hazard taxonomy and supports 8 languages.

Send prompts via the API; the model returns a safe/unsafe verdict and, if unsafe, the violated categories. Use the JSON response format for structured parsing. The context window is 131k tokens.
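As an illustration of handling the verdict, the sketch below parses the raw text form that Meta's model card describes for Llama Guard 3: `safe`, or `unsafe` followed by a line of comma-separated category codes. The names `Verdict` and `parse_verdict` are hypothetical, and the sketch assumes you have already extracted the model's text from whatever response envelope the API returns.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    safe: bool
    categories: list[str] = field(default_factory=list)


def parse_verdict(text: str) -> Verdict:
    """Parse 'safe', or 'unsafe' followed by comma-separated codes on later lines."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty verdict")
    label = lines[0].lower()
    if label == "safe":
        return Verdict(safe=True)
    if label == "unsafe":
        # Codes such as "S1,S10" follow the label; tolerate stray whitespace.
        codes = [c.strip() for c in ",".join(lines[1:]).split(",") if c.strip()]
        return Verdict(safe=False, categories=codes)
    raise ValueError(f"unrecognized verdict: {lines[0]!r}")
```

Raising on anything other than the two expected labels makes unexpected output fail loudly instead of being silently treated as safe.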

Alternatives include Llama Guard 3 1B for lighter workloads and Llama Guard 3 11B-Vision for multimodal input. Both align to the same MLCommons taxonomy. Check ModelsLab for hosted options.

It provides content moderation in 8 languages and is optimized for Llama 3.1 safety use cases, including search and code interpreter tool calls.

Yes, it is fine-tuned for safety in code interpreter tool calls. It detects hazards across 14 categories (S1-S14): the 13 MLCommons hazards plus an additional category for code interpreter abuse.
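To turn category codes into readable labels, a lookup table is enough. The sketch below uses the category names listed in Meta's Llama Guard 3 model card for S1 through S14; `HAZARD_CATEGORIES` and `describe` are hypothetical names introduced for illustration.

```python
# Category names as listed in Meta's Llama Guard 3 model card.
HAZARD_CATEGORIES = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S4": "Child Sexual Exploitation",
    "S5": "Defamation",
    "S6": "Specialized Advice",
    "S7": "Privacy",
    "S8": "Intellectual Property",
    "S9": "Indiscriminate Weapons",
    "S10": "Hate",
    "S11": "Suicide & Self-Harm",
    "S12": "Sexual Content",
    "S13": "Elections",
    "S14": "Code Interpreter Abuse",
}


def describe(codes: list[str]) -> list[str]:
    """Map codes like 'S14' to 'S14: Code Interpreter Abuse'; unknown codes are labeled as such."""
    return [f"{c}: {HAZARD_CATEGORIES.get(c, 'Unknown category')}" for c in codes]
```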

The model may flag some benign prompts as unsafe (false positives). The Llama 3 paper details violation rate improvements on benchmarks.

Ready to get started?

Start classifying with Llama Guard 3 8B on ModelsLab.