Guardrail Hooks
Guardrails can be invoked at four different points (hooks) in the AI Gateway workflow:

| Hook | When It Executes | Common Use Cases |
|---|---|---|
| LLM Input | Before the request is sent to the LLM | PII redaction, prompt injection detection, content moderation |
| LLM Output | After the response is received from the LLM | Hallucination detection, content filtering, secrets detection |
| MCP Pre Tool | Before an MCP tool is invoked | Validate tool parameters, check permissions, sanitize inputs |
| MCP Post Tool | After an MCP tool returns results | Validate tool outputs, detect unsafe code/SQL, redact sensitive data |
Guardrail Rules UI
Navigate to AI Gateway → Controls → Guardrails to view and manage guardrail rules. Click Add Rule to create a new rule or edit existing ones.

Each rule has a Rule ID, the targets it applies to (WHEN REQUEST GOES TO), the subjects it applies to (FROM SUBJECTS), and the guardrails to run on each hook (APPLY ON HOOKS). For example, a database protection rule with the ID database-safety-rule might be configured as:

- WHEN REQUEST GOES TO: MCP Servers IN database-tools, analytics-db
- FROM SUBJECTS: users IN Alice Chen, Bob Smith AND NOT IN DB Admin
- APPLY ON HOOKS:

| Hook | Guardrail |
|---|---|
| MCP Tool Pre-Invoke | my-guardrails/sql-sanitizer |
| MCP Tool Post-Invoke | my-guardrails/secrets-detection |
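The same rule might be expressed in YAML (see Create Guardrail Config below) roughly as follows. The top-level field names come from the Rule Structure reference; the nesting of the IN/NOT IN subject operators (`in`/`notIn` here) and the user identifiers are assumptions, so confirm the exact schema in the Config tab:

```yaml
rules:
  - id: database-safety-rule
    when:
      target:
        mcpServers:
          - database-tools
          - analytics-db
      subjects:
        users:
          in:            # operator nesting assumed
            - alice.chen # user identifiers assumed
            - bob.smith
          notIn:
            - db-admin
    llm_input_guardrails: []
    llm_output_guardrails: []
    mcp_tool_pre_invoke_guardrails:
      - my-guardrails/sql-sanitizer
    mcp_tool_post_invoke_guardrails:
      - my-guardrails/secrets-detection
```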
Create Guardrail Config
Create rules via the UI above (AI Gateway → Controls → Guardrails), or use YAML configuration via the Config tab.
Configuration Structure
The guardrails configuration contains an array of rules that are evaluated for each request. Only the first matching rule is applied to that request. Each rule can specify guardrails for any of the four hooks.
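As a sketch, a configuration with a targeted rule plus a catch-all fallback might look like this; the guardrail FQNs are hypothetical placeholders, and the metadata-matching shape is an assumption:

```yaml
rules:
  # Evaluated first: applies only to requests tagged environment=production
  - id: pii-for-prod
    when:
      target:
        metadata:
          environment: production
    llm_input_guardrails:
      - my-guardrails/pii-redaction   # hypothetical FQN
    llm_output_guardrails: []
    mcp_tool_pre_invoke_guardrails: []
    mcp_tool_post_invoke_guardrails: []

  # Fallback: an empty `when` matches all remaining requests
  - id: default-rule
    when: {}
    llm_input_guardrails: []
    llm_output_guardrails:
      - my-guardrails/content-filter  # hypothetical FQN
    mcp_tool_pre_invoke_guardrails: []
    mcp_tool_post_invoke_guardrails: []
```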
Configuration Reference
Rule Structure
| Field | Required | Description |
|---|---|---|
| `id` | Yes | Unique identifier for the rule |
| `when` | Yes | Matching criteria with `target` and `subjects` blocks |
| `llm_input_guardrails` | Yes | Guardrails applied before the LLM request (use `[]` if none) |
| `llm_output_guardrails` | Yes | Guardrails applied after the LLM response (use `[]` if none) |
| `mcp_tool_pre_invoke_guardrails` | Yes | Guardrails applied before MCP tool invocation (use `[]` if none) |
| `mcp_tool_post_invoke_guardrails` | Yes | Guardrails applied after the MCP tool returns (use `[]` if none) |
The when Block
The when block contains two main sections: target (what the request targets) and subjects (who is making the request):
| Section | Description |
|---|---|
| `target` | Defines conditions based on `mcpServers`, `models`, `mcpTools`, or `metadata` |
| `subjects` | Defines conditions based on users, teams, or virtual accounts |
If `when` is empty (`{}`), the rule matches all requests. Use this for fallback/default rules at the end of your rules list.
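Putting the pieces together, the overall shape of the `when` block is roughly the following sketch (subjects can also match virtual accounts, whose exact key is omitted here; confirm the canonical schema in the Config tab):

```yaml
when:
  target:            # what the request targets
    mcpServers: []
    models: []
    mcpTools: []
    metadata: {}
  subjects:          # who is making the request
    users: []
    teams: []
```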
Target: Match by MCP Servers
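A sketch of a target that matches requests going to specific MCP servers (server names taken from the database-safety-rule example above):

```yaml
when:
  target:
    mcpServers:
      - database-tools
      - analytics-db
```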
Target: Match by Models
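A sketch of a target that matches requests to specific models; the model identifiers here are hypothetical placeholders:

```yaml
when:
  target:
    models:
      - openai-main/gpt-4o        # hypothetical model identifiers
      - anthropic-main/claude-3-5-sonnet
```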
Target: Match by Metadata
For metadata matching, clients attach metadata to the request via the `X-TFY-METADATA` header, for example: `X-TFY-METADATA: {"environment": "production", "tier": "enterprise"}`
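A sketch of a target that matches on that request metadata (the key/value matching shape is an assumption):

```yaml
when:
  target:
    metadata:
      environment: production
      tier: enterprise
```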
Target: Match by Specific MCP Tool
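A sketch of a target that narrows the match to a specific tool on an MCP server; the tool name `execute_sql` is a hypothetical placeholder:

```yaml
when:
  target:
    mcpServers:
      - database-tools
    mcpTools:
      - execute_sql   # hypothetical tool name
```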
Subjects: Users with IN/NOT IN
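A sketch of a subjects block combining IN and NOT IN conditions, mirroring the database-safety-rule example above; the `in`/`notIn` operator spelling and the user identifiers are assumptions:

```yaml
when:
  subjects:
    users:
      in:             # operator nesting assumed
        - alice.chen  # user identifiers assumed
        - bob.smith
      notIn:
        - db-admin
```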
Combined Target and Subjects
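A sketch combining both sections in one rule; the team name is a hypothetical placeholder:

```yaml
when:
  target:
    mcpServers:
      - database-tools
  subjects:
    teams:
      in:
        - data-platform   # hypothetical team name
```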
- Both `target` and `subjects` conditions must match for the rule to apply.
- Rules are evaluated in order. Only the first matching rule is applied; subsequent rules are ignored for that request.
- Each rule can target specific users, teams, models, metadata, or MCP servers, and can enforce different guardrails on any combination of hooks.
- Omitted fields are not used for filtering (e.g., if `models` is not specified, the rule matches any model).
How to Get the Guardrail Selector
You can get the selector (FQN) of a guardrail integration by navigating to the Guardrails tab in the AI Gateway and clicking the "Copy FQN" button next to the guardrail integration.
The same guardrail selectors can be used in rules for both request types:

- LLM chat/completion requests (LLM Input/Output hooks)
- MCP tool invocations (MCP Pre/Post Tool hooks)
MCP Tool Guardrails Example
For agentic workflows using MCP tools, you can add guardrails that validate tool inputs before execution and sanitize outputs after, as in the database-safety-rule example above, which runs a SQL sanitizer before tool invocation and secrets detection after.

Monitoring Guardrail Execution
You can monitor guardrail execution in real time through the AI Gateway dashboard:

- Navigate to AI Gateway → Monitor → Request Traces
- View detailed traces showing which guardrails were triggered on each hook
- See findings, mutations, and execution timing for each guardrail
- Filter by guardrail status to quickly find blocked or flagged requests
Detecting Guardrail Violations Programmatically
When guardrails flag content, the Gateway returns a 400 status code with guardrail violation details. You can programmatically detect violations by checking the `error.type` field in the error response.
Error Response Format:
When a guardrail violation occurs, the response includes details about which hook triggered the violation:
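An illustrative error body is sketched below. Only the `error.type` value and the `guardrail_checks` object are documented above; the remaining field names and the per-hook entry shape are assumptions:

```json
{
  "error": {
    "type": "guardrail_checks_failed",
    "message": "Guardrail checks failed",
    "guardrail_checks": {
      "llm_input": {
        "guardrail": "my-guardrails/pii-redaction",
        "status": "failed"
      }
    }
  }
}
```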
The `error.type` field is set to `guardrail_checks_failed` only when there is an actual guardrail violation. The `guardrail_checks` object will only contain the hooks that were evaluated.
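A minimal client-side check might look like the following sketch. It assumes the error body has already been parsed from the 400 response as JSON, and that `guardrail_checks` sits under `error` with hook names as keys; the exact payload shape beyond `error.type` is an assumption:

```python
def is_guardrail_violation(body: dict) -> bool:
    """True when a parsed error body reports a guardrail violation."""
    # error.type is 'guardrail_checks_failed' only for actual violations
    return body.get("error", {}).get("type") == "guardrail_checks_failed"


def evaluated_hooks(body: dict) -> list:
    """Hook names present in guardrail_checks (only evaluated hooks appear)."""
    # location of guardrail_checks under `error` is an assumption
    return sorted(body.get("error", {}).get("guardrail_checks", {}))


# Example with an assumed payload shape:
body = {
    "error": {
        "type": "guardrail_checks_failed",
        "guardrail_checks": {"llm_input": {"status": "failed"}},
    }
}
if is_guardrail_violation(body):
    print("blocked on hooks:", evaluated_hooks(body))
```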