AI Guardrails & Policies

Manage guardrails and policies that protect prompts and completions across your organization

Control what goes into and comes out of your AI agents with guardrails — safety rules that mask sensitive data or block risky prompts before they reach the model.


Overview

Guardrails are safety rules that the LLM proxy enforces on every prompt and completion. They run automatically, so no changes to your agents or workflows are required. You manage them from Security Settings in the dashboard.

The guardrails page has eight tabs:

| Tab | Purpose |
| --- | --- |
| My Org | Shows which policies are currently attached to your organization |
| Templates | Pre-built policy templates you can apply with one click |
| All Policies | Every policy defined in your environment: create, delete, or attach |
| All Guardrails | Individual guardrail rules: create, edit, or delete |
| Testing | Test guardrails against sample text before rolling them out |
| Monitoring | View request counts, fail rates, latency, and per-request logs |
| Help | In-product reference for concepts, workflows, and terminology |
| Settings | Permission controls for override requests |

Key Concepts

Guardrail

A single safety rule with one or more patterns and an action. Each pattern detects something specific — email addresses, phone numbers, credit card numbers, or a custom regex you provide. The action determines what happens when a match is found.
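
To make the pattern-plus-action idea concrete, here is a minimal Python sketch of a guardrail as a data structure. The class name, field names, and the email regex are illustrative assumptions; the product's actual schema is not shown in this guide.

```python
import re
from dataclasses import dataclass


# Hypothetical model of a guardrail; not the product's real schema.
@dataclass(frozen=True)
class Guardrail:
    name: str
    patterns: tuple[str, ...]  # one or more regular expressions
    action: str                # "mask" or "block"

    def matches(self, text: str) -> list[str]:
        """Return every substring matched by any of the patterns."""
        found = []
        for pattern in self.patterns:
            found.extend(m.group(0) for m in re.finditer(pattern, text))
        return found


# Illustrative email pattern; real detectors are usually stricter.
email_rule = Guardrail(
    name="email-addresses",
    patterns=(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",),
    action="mask",
)

hits = email_rule.matches("Contact ana@example.com or bo@example.org today")
print(hits)
```

The action is stored alongside the patterns but not applied here; detection and enforcement are kept separate so the same matching logic can serve either action.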

Policy

A reusable bundle of guardrails. You attach a policy to your organization so its guardrails run on every LLM request from your team members. You can create multiple policies for different compliance requirements.

Attachment

The link between a policy and your organization. Detaching a policy stops its guardrails from running, but keeps the policy available for re-attachment later.
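
The relationship between policies and attachments can be sketched with two small in-memory structures. This is only an illustration under assumed names (`policies`, `attached`, `attach`, `detach`); it shows that detaching changes which guardrails run while leaving the policy definition in place.

```python
# Hypothetical in-memory model; names are illustrative only.
policies = {
    "finance-pii-strict": ["email-addresses", "credit-cards"],
}
attached: set[str] = set()


def attach(policy_name: str) -> None:
    if policy_name not in policies:
        raise KeyError(f"unknown policy: {policy_name}")
    attached.add(policy_name)


def detach(policy_name: str) -> None:
    # Only the link is removed; the policy definition is untouched.
    attached.discard(policy_name)


def active_guardrails() -> list[str]:
    """Guardrails that run: those bundled in currently attached policies."""
    names: list[str] = []
    for policy_name in sorted(attached):
        for guardrail in policies[policy_name]:
            if guardrail not in names:
                names.append(guardrail)
    return names


attach("finance-pii-strict")
running = active_guardrails()

detach("finance-pii-strict")
after_detach = active_guardrails()
still_defined = "finance-pii-strict" in policies
```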

Actions: Mask vs Block

| Action | Behavior | When to Use |
| --- | --- | --- |
| Mask | Replaces matched text with a placeholder (e.g., [EMAIL_REDACTED]) and sends the rewritten prompt to the model. No tokens are wasted. | PII categories where the user can still get a useful answer without the redacted information |
| Block | Rejects the request entirely before it reaches the model. The user sees an error with the rule name. | Compliance-critical patterns like SSNs, full credit card numbers, or confidential data |

Most teams default to Mask for PII categories and reserve Block for compliance-critical patterns.
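
The difference between the two actions can be sketched in a few lines of Python. The function name `apply_rule`, the `BlockedRequest` exception, and both regexes are assumptions made for illustration; they are not the proxy's actual interface.

```python
import re


class BlockedRequest(Exception):
    """Raised when a Block rule matches; carries the rule name for the error."""

    def __init__(self, rule_name: str):
        super().__init__(f"Request blocked by guardrail: {rule_name}")
        self.rule_name = rule_name


def apply_rule(text: str, rule_name: str, pattern: str, action: str,
               placeholder: str = "[REDACTED]") -> str:
    """Hypothetical helper: mask rewrites the text, block rejects it."""
    if action == "mask":
        return re.sub(pattern, placeholder, text)
    if action == "block":
        if re.search(pattern, text):
            raise BlockedRequest(rule_name)
        return text
    raise ValueError(f"unknown action: {action}")


# Illustrative patterns only.
EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"
SSN = r"\b\d{3}-\d{2}-\d{4}\b"

masked = apply_rule("Reply to ana@example.com", "email-mask", EMAIL,
                    "mask", "[EMAIL_REDACTED]")
print(masked)
```

With Mask the caller still gets text to forward; with Block the caller must handle the exception and surface the rule name to the user.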

Applying a Template

  1. Open the Templates tab.
  2. Click a template card to preview its guardrails.
  3. Use the per-guardrail checkboxes to opt out of specific rules if needed.
  4. Click Create N Guardrails & Use Template. This creates the guardrails and a matching policy.
  5. Switch to the My Org tab and attach the new policy.

Building a Custom Policy

  1. Open the All Guardrails tab and confirm the guardrails you need exist. If any are missing, create them with Create Guardrail.
  2. Switch to All Policies and click Create Policy.
  3. Name the policy (e.g., finance-pii-strict) and add guardrails as tags.
  4. Optionally scope the policy to specific models with model conditions.
  5. Save, then attach the policy from the My Org tab.

Testing Guardrails

Before rolling out a guardrail, validate it from the Testing tab:

  • Single guardrail test — Choose a guardrail, paste a sample prompt, and click Run test. The before/after panes show exactly what the model would receive.
  • End-to-end test — Select multiple guardrails and test with several prompts at once.
  • Policy resolution preview — Confirm which guardrails fire for a specific model and tag combination.
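
As a rough mental model of what a policy resolution preview computes, the sketch below filters policies by model and tags and returns the guardrails that would fire. This guide does not define how model conditions or tags are evaluated, so the glob-style model matching, the `required_tags` field, and the model names are all assumptions.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


# Hypothetical policy shape; the real resolution rules may differ.
@dataclass
class Policy:
    name: str
    guardrails: list[str]
    model_globs: list[str] = field(default_factory=list)   # empty: all models (assumption)
    required_tags: set[str] = field(default_factory=set)   # empty: any tags (assumption)


def resolve(policies: list[Policy], model: str, tags: set[str]) -> list[str]:
    """Return the guardrails that would fire for this model and tag set."""
    fired: list[str] = []
    for policy in policies:
        model_ok = not policy.model_globs or any(
            fnmatchcase(model, glob) for glob in policy.model_globs
        )
        tags_ok = policy.required_tags <= tags
        if model_ok and tags_ok:
            for guardrail in policy.guardrails:
                if guardrail not in fired:
                    fired.append(guardrail)
    return fired


policies = [
    Policy("org-baseline", ["email-addresses"]),
    Policy("finance-pii-strict", ["credit-cards", "email-addresses"],
           model_globs=["example-*"], required_tags={"finance"}),
]

in_scope = resolve(policies, "example-large", {"finance"})
out_of_scope = resolve(policies, "other-small", {"finance"})
```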

Monitoring

The Monitoring tab shows:

  • Total requests evaluated, healthy guardrails, and any with failures
  • Per-guardrail metrics: requests evaluated, fail rate, average latency, and trend
  • Per-request usage logs with timestamps, action (passed/blocked/flagged), score, latency, and model

Click the eye icon on any guardrail row to drill into its per-request logs.
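
The per-guardrail metrics can be thought of as simple aggregates over the per-request logs. The sketch below assumes each log record has `guardrail`, `failed`, and `latency_ms` fields; those field names, the sample records, and what counts as a failure are assumptions, since the log format is not documented here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log records; the real log schema is not shown in this guide.
logs = [
    {"guardrail": "email-addresses", "failed": False, "latency_ms": 4.0},
    {"guardrail": "email-addresses", "failed": False, "latency_ms": 6.0},
    {"guardrail": "email-addresses", "failed": True, "latency_ms": 5.0},
    {"guardrail": "ssn-block", "failed": False, "latency_ms": 2.0},
]


def summarize(records: list[dict]) -> dict[str, dict]:
    """Group records by guardrail and compute count, fail rate, and mean latency."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for record in records:
        grouped[record["guardrail"]].append(record)

    summary = {}
    for name, rows in grouped.items():
        summary[name] = {
            "requests": len(rows),
            "fail_rate": sum(r["failed"] for r in rows) / len(rows),
            "avg_latency_ms": mean(r["latency_ms"] for r in rows),
        }
    return summary


summary = summarize(logs)
```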

Permissions

Only team members with the guardrails:manage capability can create, edit, delete, or attach guardrails and policies. The Settings tab lets admins control whether members can request permission overrides.
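
Conceptually, the capability acts as a gate in front of every mutating operation. A minimal sketch, assuming a member's capabilities are available as a set of strings (the function name and the `guardrails:view` capability are hypothetical):

```python
MANAGE_CAPABILITY = "guardrails:manage"


def require_manage(capabilities: set[str]) -> None:
    """Hypothetical gate: raise unless the member holds guardrails:manage."""
    if MANAGE_CAPABILITY not in capabilities:
        raise PermissionError(f"missing capability: {MANAGE_CAPABILITY}")


# Passes silently for a member who holds the capability.
require_manage({"guardrails:manage", "guardrails:view"})
```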

Data Flow

  1. A user sends a prompt from chat or an agent.
  2. The proxy runs every guardrail in the policies attached to the organization.
  3. Mask rewrites matched text in place; Block rejects the request.
  4. The model only sees the masked content — the original is never persisted in usage logs.
  5. Each evaluation is recorded and visible from the Monitoring tab.
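
The steps above can be sketched end to end as a single function. This is a simplified illustration, not the proxy's implementation: the `Rule` class, the log record shape, and the choice to check Block rules before applying Mask rules are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional


# Hypothetical rule shape for illustration.
@dataclass
class Rule:
    name: str
    pattern: str
    action: str                      # "mask" or "block"
    placeholder: str = "[REDACTED]"


def handle_prompt(prompt: str, rules: list[Rule], usage_log: list[dict]) -> Optional[str]:
    """Return the text to send to the model, or None if the request is blocked."""
    # Assumed ordering: Block rules are checked first, against the original text.
    for rule in (r for r in rules if r.action == "block"):
        hit = re.search(rule.pattern, prompt) is not None
        usage_log.append({"guardrail": rule.name, "blocked": hit, "matches": int(hit)})
        if hit:
            return None

    outgoing = prompt
    for rule in (r for r in rules if r.action == "mask"):
        outgoing, count = re.subn(rule.pattern, rule.placeholder, outgoing)
        # Only counts are logged; the original text is never stored.
        usage_log.append({"guardrail": rule.name, "blocked": False, "matches": count})
    return outgoing


rules = [
    Rule("ssn-block", r"\b\d{3}-\d{2}-\d{4}\b", "block"),
    Rule("email-mask", r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
         "mask", "[EMAIL_REDACTED]"),
]
usage_log: list[dict] = []

sent = handle_prompt("Mail ana@example.com the report", rules, usage_log)
blocked = handle_prompt("SSN 123-45-6789", rules, usage_log)
```

Here `sent` is the masked prompt the model would receive, and `blocked` is `None` because the request never leaves the proxy.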