Guardrails in the AI Gateway provide a mechanism to ensure safety, quality, and compliance by validating and transforming both the requests sent to and the responses received from Large Language Models (LLMs). This document outlines the configuration and application of guardrails within the TrueFoundry AI Gateway. Guardrails can be applied at two key stages in the AI Gateway workflow:
  1. Before Sending the Request to the LLM: These guardrails are called Input Guardrails and are applied to the request before it is sent to the LLM. Common use cases include:
    • Masking or redacting PII from the request.
    • Filtering or blocking requests containing prohibited topics or sensitive content.
    • Validating input format and structure to prevent prompt injection or malformed data.
  2. After Receiving the Response from the LLM: These guardrails are called Output Guardrails and are applied to the response after it is received from the LLM. Common use cases include:
    • Filtering or blocking responses containing prohibited topics or sensitive content.
    • Validating output format and structure to catch hallucinated or non-compliant data.
Guardrails can operate in two modes:
  1. Block / Validate request or response: In this mode, the guardrail checks either the request sent to the AI Gateway or the response received from the LLM against defined rules. If a rule is violated, the data is blocked and an error is returned to the client. This mode is called validate in the TrueFoundry platform and is used to strictly enforce compliance, so that unsafe or non-compliant data is neither processed nor returned at any stage.
  2. Modify / Mutate request or response: In this mode, the guardrail modifies the request before it is sent to the model, or alters the response after it is received from the LLM. For example, mutation can redact sensitive information, reformat data, or adjust content to meet policy requirements. This mode is called mutate in the TrueFoundry platform.
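
The two stages and two modes described above can be sketched in plain Python. This is an illustrative model only, not the gateway's implementation: the names `redact_emails`, `block_prohibited_terms`, `apply_guardrails`, `handle_request`, and `GuardrailViolation`, along with the email pattern and the prohibited term, are all hypothetical. A mutate-mode guardrail returns altered text, while a validate-mode guardrail either returns the text unchanged or raises an error.

```python
import re
from typing import Callable, List

# In this sketch, a guardrail is any function that takes text and returns text.
Guardrail = Callable[[str], str]


class GuardrailViolation(Exception):
    """Raised by a validate-mode guardrail when a rule is violated."""


# Hypothetical rules, chosen only for illustration.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PROHIBITED_TERMS = ("internal-only",)


def redact_emails(text: str) -> str:
    """Mutate mode: return the text with email addresses replaced."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


def block_prohibited_terms(text: str) -> str:
    """Validate mode: raise on a violation, otherwise pass the text through."""
    lowered = text.lower()
    for term in PROHIBITED_TERMS:
        if term in lowered:
            raise GuardrailViolation(f"prohibited term found: {term!r}")
    return text


def apply_guardrails(text: str, guardrails: List[Guardrail]) -> str:
    """Run guardrails in order; each one sees the previous one's output."""
    for guardrail in guardrails:
        text = guardrail(text)
    return text


def handle_request(
    prompt: str,
    call_llm: Callable[[str], str],
    input_guardrails: List[Guardrail],
    output_guardrails: List[Guardrail],
) -> str:
    """Apply input guardrails, call the model, then apply output guardrails."""
    checked_prompt = apply_guardrails(prompt, input_guardrails)
    response = call_llm(checked_prompt)
    return apply_guardrails(response, output_guardrails)
```

In this model, passing `[redact_emails]` as the input guardrails means the model never sees the raw email address, and passing `[block_prohibited_terms]` as the output guardrails means a violating response surfaces to the caller as an error rather than as content.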

Flow chart of how guardrails work in the AI Gateway

Guardrail Integrations

TrueFoundry Gateway doesn’t provide any guardrails of its own. Instead, it integrates with the popular guardrail providers mentioned below to offer a unified interface for guardrail management and configuration.
If you don’t see the provider you are looking for, please reach out to us and we will be happy to add the integration.

Configure Guardrails

Guardrails can be configured both per request and centrally at the gateway layer. To configure them per request, use the X-TFY-GUARDRAILS header; you can copy the code from the Playground section. To configure them at the gateway layer, create a YAML file that specifies which guardrails apply to which subset of requests. You can select the subset based on the user making the request, the model, or any other metadata in the request. You can read more about this in the Configure Guardrails section.
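
As a rough sketch of the per-request option, the snippet below builds request headers that carry a guardrail selection in X-TFY-GUARDRAILS. Only the header name comes from this page. The JSON field names (`llm_input_guardrails`, `llm_output_guardrails`), the guardrail identifiers, the bearer-token `Authorization` header, and the helper `build_guardrail_headers` are assumptions made for illustration; copy the exact header value from the Playground rather than relying on this shape.

```python
import json
from typing import Dict, List


def build_guardrail_headers(
    api_key: str,
    input_guardrails: List[str],
    output_guardrails: List[str],
) -> Dict[str, str]:
    """Build HTTP headers that select guardrails for a single request.

    The JSON layout of the header value is an assumption, not a
    confirmed schema; check the Playground for the real format.
    """
    selection = {
        "llm_input_guardrails": input_guardrails,    # assumed field name
        "llm_output_guardrails": output_guardrails,  # assumed field name
    }
    return {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
        "X-TFY-GUARDRAILS": json.dumps(selection),
    }


# Hypothetical guardrail identifiers, shown only to illustrate usage.
headers = build_guardrail_headers(
    api_key="YOUR_API_KEY",
    input_guardrails=["my-group/pii-redaction"],
    output_guardrails=["my-group/content-filter"],
)
```

The resulting dictionary can be passed to any HTTP client when calling the gateway. Because the selection travels with each request, different callers can opt into different guardrails without changing the central gateway configuration.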