To get started with integrating and using Guardrails, follow the steps below:
Step 1: Create a Guardrails Group

A guardrails group serves as a container for multiple guardrail integrations. You can control who can add, edit, or remove guardrails in a group and who can use them. There are two types of roles for collaborators:
  • Manager: Can create, edit, delete and use the guardrail integrations in the group.
  • User: Can use the guardrail integrations in the group.
To create a guardrails group, navigate to AI Gateway → Guardrail. Give your group a name. In our case, we’ve named it openai-guardrail.
A guardrail group is the minimum level of access control for guardrails. A common usage pattern is to create an org-wide global guardrails group containing the organization-wide guardrails, where the platform team is the manager and the entire org are users. For team- or product-specific guardrails, it's better to create individual guardrail groups per product or team.
Step 2: Add a Guardrail Integration to the Guardrails Group

You can add integrations to your guardrails for content moderation and safety. Let’s create an OpenAI guardrail integration, which we will later use as an output guardrail in our configuration example. This will allow us to automatically moderate and filter responses from the LLM using OpenAI’s moderation capabilities.
  1. Select OpenAI Moderation from the list of integrations.
    [Screenshot: Guardrail integration selection interface showing the OpenAI Moderation option]
  2. Fill in the OpenAI Moderation Guardrail Config form. For this tutorial, we’ve named it tfy-openai-moderation.
    [Screenshot: OpenAI Moderation configuration form with fields for name, API key, and operation type]
  3. Save the guardrail group.
You are now ready to use this OpenAI guardrail integration in your guardrail configuration. Applied as an input guardrail, it runs on prompts before they are sent to the LLM, so unsafe or non-compliant content is detected and handled prior to model processing; applied as an output guardrail, it moderates the LLM's responses.
To customize moderation sensitivity for specific categories such as harassment, sexual, or hate, enable the Category Thresholds option. You can then adjust the threshold values for each category according to your requirements.
Step 3: Test Guardrails in the Playground

The Playground allows you to test guardrails on all four hooks before deploying to production.

Testing LLM Input/Output Guardrails

  1. Navigate to the Playground
    Navigate to the AI Gateway → Playground Tab.
  2. Select Guardrails
    On the left side, you’ll see options for LLM Input Guardrails and LLM Output Guardrails.
    • Click on either option depending on which hook you want to test.
    • Add the guardrail you want to apply by selecting from the list.
    As we have already created an OpenAI guardrail named tfy-openai-moderation in the previous steps, select this guardrail under LLM Input Guardrails.
    [Screenshot: AI Gateway playground interface with the guardrail selection panel on the left side]
  3. Test the Guardrail
    • Enter a prompt that would typically be flagged as unsafe or offensive.
    • Send the request to the model.
    You should see that the prompt is blocked by the guardrail, demonstrating that your configuration is working as expected.

Testing MCP Tool Guardrails

Once you have configured guardrail rules in AI Gateway → Controls → Guardrails, you can test MCP Pre Tool and Post Tool guardrails from the Playground:
  1. Configure Guardrail Rules First
    Navigate to AI Gateway → Controls → Guardrails and create rules that target your MCP servers/tools with mcp_tool_pre_invoke_guardrails or mcp_tool_post_invoke_guardrails.
  2. Use the Playground with MCP Tools
    In the Playground, when you invoke MCP tools (via agents or tool calls), the configured guardrails will automatically execute on the Pre Tool and Post Tool hooks.
  3. View Results in Traces
    After execution, navigate to Monitor → Request Traces to see detailed information about which guardrails were triggered on each MCP tool invocation, including:
    • Pre Tool guardrail validation results
    • Post Tool guardrail findings and mutations
    • Pass/fail status for each guardrail
Testing in the Playground is an easy way to validate that your guardrails are correctly configured before deploying to production. For MCP guardrails, use the Request Traces view to verify guardrails are triggering as expected on tool invocations.
Step 4: Trigger Guardrails in LLM Requests from Code

You can pass guardrails as headers at a per-request level by providing the X-TFY-GUARDRAILS header. You can copy the relevant code snippet from the Playground section.
  1. Select the Guardrail to Apply
    On the left side, choose the LLM Input Guardrails and LLM Output Guardrails sections, and select the guardrails you want to apply. For this tutorial, we’ve selected the tfy-openai-moderation guardrail for both input and output.
    [Screenshot: Guardrail selector in the playground showing the selected OpenAI moderation guardrail]
  2. Get the Code Snippet
    After configuring your request and selecting the desired guardrail(s), click the Code button at the top right of the Playground.
    In the code snippet section, you will see ready-to-use examples for different SDKs and curl.
    Note: The generated code will automatically include the necessary X-TFY-GUARDRAILS header with your selected guardrails.
    [Screenshot: Code snippet section showing language options for implementing guardrails]
  3. Copy and Run the Code
    Copy the generated curl command (or code for your preferred SDK) and run it.
    The guardrails you selected in the Playground will be applied automatically, as reflected in the request headers.
    [Screenshot: curl command example with the X-TFY-GUARDRAILS header for applying guardrails]

X-TFY-GUARDRAILS Header Format

The header accepts JSON with the following fields:
{
  "llm_input_guardrails": ["group/guardrail-name"],
  "llm_output_guardrails": ["group/guardrail-name"],
  "mcp_tool_pre_invoke_guardrails": ["group/guardrail-name"],
  "mcp_tool_post_invoke_guardrails": ["group/guardrail-name"]
}
You can use this approach with any HTTP client or SDK by adding the X-TFY-GUARDRAILS header to your request. This allows you to dynamically apply guardrails per request without changing your global configuration.
For backward compatibility, input_guardrails maps to llm_input_guardrails and output_guardrails maps to llm_output_guardrails.
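As a minimal sketch, the header can be built and attached with any HTTP client. Everything below except the header name and the group/guardrail-name format is a placeholder assumption: the gateway URL, API key, and model id are hypothetical, and the guardrail reference reuses the openai-guardrail group and tfy-openai-moderation integration created in the earlier steps.

```python
import json
import urllib.request

# "group/guardrail-name" reference, using the group (openai-guardrail) and
# integration (tfy-openai-moderation) created in the earlier steps.
GUARDRAIL = "openai-guardrail/tfy-openai-moderation"

# The header value is the JSON-serialized guardrail config.
guardrails_header = json.dumps({
    "llm_input_guardrails": [GUARDRAIL],
    "llm_output_guardrails": [GUARDRAIL],
})

def send_moderated_request(gateway_url: str, api_key: str, prompt: str) -> bytes:
    """Send a chat-completion request with per-request guardrails.

    gateway_url, api_key, and the model id are placeholders for your
    deployment's actual values.
    """
    payload = json.dumps({
        "model": "openai-main/gpt-4o",  # hypothetical model id
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        gateway_url,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "X-TFY-GUARDRAILS": guardrails_header,  # per-request guardrails
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

print(guardrails_header)
```

Because the guardrails ride along in a header, no global configuration change is needed; each request can opt into a different set of guardrails.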
Step 5: Configure Guardrails at the Gateway Layer

  1. Navigate to Guardrail Rules
    Go to AI Gateway → Controls → Guardrails to access the Guardrail Rules page.
  2. Create a Guardrail Configuration
    Click on Create/Edit Guardrail Config. Fill in the required details for your guardrail configuration:
    • Define when conditions to target specific users, models, metadata, or MCP servers
    • Configure guardrails for any of the four hooks:
      • llm_input_guardrails - Before LLM request
      • llm_output_guardrails - After LLM response
      • mcp_tool_pre_invoke_guardrails - Before MCP tool invocation
      • mcp_tool_post_invoke_guardrails - After MCP tool returns
  3. Save the Configuration
    After filling out the form, click Save to apply your guardrail configuration.
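As an illustrative sketch only, a gateway-level rule combining when conditions with the four hooks might look like the YAML below. Apart from the four hook keys and the group/guardrail-name references from the earlier steps, every field name and value here is a hypothetical assumption; consult the Guardrails Configuration Guide for the actual schema.

```yaml
# Hypothetical rule sketch — field names other than the four hook keys
# are assumptions, not the real schema.
name: org-wide-moderation
when:
  subjects: ["team:support"]        # assumed: target specific users/teams
  models: ["openai-main/gpt-4o"]    # assumed: target specific models
llm_input_guardrails:
  - openai-guardrail/tfy-openai-moderation
llm_output_guardrails:
  - openai-guardrail/tfy-openai-moderation
mcp_tool_pre_invoke_guardrails: []  # add guardrails for MCP tool hooks as needed
mcp_tool_post_invoke_guardrails: []
```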
[Screenshot: Gateway-level guardrail configuration interface with YAML editor for rules]
For more details on how to configure guardrails, see the Guardrails Configuration Guide.
Configuring guardrails at the gateway layer is recommended for organization-wide enforcement. This centralizes guardrail management and auditing, eliminating the need to set headers on every request. MCP tool guardrails are especially important for securing agentic workflows.