- Before Sending the Request to the LLM: These guardrails are called Input Guardrails and are applied to the request before it is sent to the LLM. Common use cases include:
  - Masking or redacting PII from the request
  - Filtering or blocking requests containing prohibited topics or sensitive content
  - Validating input format and structure to prevent prompt injection or malformed data
- After Receiving the Response from the LLM: These guardrails are called Output Guardrails and are applied to the response after it is received from the LLM. Common use cases include:
  - Filtering or blocking responses containing prohibited topics or sensitive content
  - Validating output format and structure to prevent hallucination or non-compliant data
- Block / Validate request or response: In this mode, the guardrail checks either the request sent to the AI Gateway or the response received from the LLM against defined rules. If a rule is violated, the data is blocked and an error is returned to the client. This mode is called `validate` and is used to strictly enforce compliance and prevent unsafe or non-compliant data from being processed or returned at any stage.
- Modify / Mutate request or response: In this mode, the guardrail modifies the request before it is sent to the model, or alters the response after it is received from the LLM. For example, mutation can redact sensitive information, reformat data, or adjust content to meet policy requirements. This mode is called `mutate` in the Truefoundry platform. The sketch below illustrates the difference between the two modes.
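
To make the distinction concrete, here is a minimal, purely illustrative Python sketch of the two modes applied to an email-redaction rule. The function names and the regex are hypothetical and do not represent Truefoundry's actual guardrail API; they only show how a `validate` guardrail rejects a payload while a `mutate` guardrail rewrites it and lets it through.

```python
import re

# Hypothetical email pattern used as the "rule" for this illustration
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def validate(text: str) -> str:
    """validate mode: block the payload outright if it violates the rule."""
    if EMAIL_PATTERN.search(text):
        raise ValueError("Guardrail violation: payload contains PII (email address)")
    return text

def mutate(text: str) -> str:
    """mutate mode: redact the offending content and pass the payload along."""
    return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)

prompt = "Contact me at jane.doe@example.com about the invoice."

print(mutate(prompt))  # Contact me at [REDACTED_EMAIL] about the invoice.

try:
    validate(prompt)
except ValueError as err:
    print(err)         # the gateway would return an error like this to the client
```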

Flow chart of how guardrails work in the AI Gateway
Guardrail Integrations
Truefoundry Gateway doesn’t provide any guardrails of its own. Instead, it integrates with the popular guardrail providers listed below to offer a unified interface for guardrail management and configuration.
OpenAI Moderations
Integrate with OpenAI’s moderation API to detect and handle content that may violate usage policies, like violence, hate speech, or harassment.
AWS Bedrock Guardrail
Integrate with AWS Bedrock’s capabilities to apply guardrails on AI models.
Azure PII
Integrate with Azure’s PII detection service to identify and redact PII data in both requests and responses.
Azure Content Safety
Leverage Azure Content Safety to detect and mitigate harmful, unsafe, or inappropriate content in model inputs and outputs.
Enkrypt AI
Integrate with Enkrypt AI for advanced moderation and compliance, detecting risks like toxicity, bias, and sensitive data exposure.
Palo Alto Prisma AIRS
Integrate with Palo Alto Prisma AIRS to detect and mitigate harmful, unsafe, or inappropriate content in model inputs and outputs.
PromptFoo
Integrate with Promptfoo to apply guardrails like content moderation on the models.
Fiddler
Integrate with Fiddler to apply guardrails, such as Fiddler-Safety and Fiddler-Response-Faithfulness on the models.
Bring Your Own Guardrail
Integrate your own custom guardrail using frameworks like Guardrails.AI or a Python function.
Configure Guardrails
Guardrails can be configured both at a per-request level and at the central gateway layer. To configure at the per-request level, you can use the `X-TFY-GUARDRAILS` header. You can copy the code from the Playground section.
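
As a per-request illustration, here is a minimal sketch assuming an OpenAI-compatible gateway endpoint and the official OpenAI Python SDK. The base URL, API key, model name, and the JSON value of `X-TFY-GUARDRAILS` are placeholders; copy the exact header value from the code generated in the Playground section.

```python
from openai import OpenAI

# Hypothetical gateway endpoint and credentials -- replace with your own values
client = OpenAI(
    base_url="https://<your-gateway-host>/api/llm",
    api_key="<your-truefoundry-api-key>",
)

response = client.chat.completions.create(
    model="openai-main/gpt-4o",  # example model name on the gateway
    messages=[{"role": "user", "content": "Summarise our refund policy."}],
    extra_headers={
        # Placeholder guardrail selection payload; the actual format is shown
        # in the snippet generated by the Playground.
        "X-TFY-GUARDRAILS": '{"input": ["pii-redaction"], "output": ["content-safety"]}'
    },
)
print(response.choices[0].message.content)
```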
To configure at the gateway layer, you will need to create a YAML file that specifies which guardrails should be applied to which subset of requests. You can decide the subset of requests based on the user making the request, the model, or any other metadata in the request. You can read more about it in the Configure Guardrails section.