Guardrails
Guardrails allow you to validate and transform LLM inputs and outputs to ensure safety, quality, and compliance.
Config
The guardrails configuration contains an array of rules that are evaluated for each request. Only the first matching rule is applied to a given request. Each rule specifies the input and output guardrails to apply. Let’s take a look at a sample configuration first.
Example Configuration
This sample guardrail configuration has one rule with one input guardrail that masks PII and two output guardrails: one for masking PII and one for failing the request if the LLM responds with any of the denied topics. It also has a `when` block, so these guardrails are applied only to specific requests.
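A configuration matching this description might look like the following sketch. The field names follow the rule structure described below, but the FQN values and matcher values are illustrative assumptions, not copied from a live gateway:

```yaml
# Illustrative guardrails configuration (FQNs and matcher values are assumed examples)
rules:
  - id: "mask-pii-and-deny-topics"
    # Apply this rule only to matching requests
    when:
      subjects: ["team:customer-support"]
      metadata:
        environment: "production"
    # Mask PII in the input prompt before it reaches the LLM
    input_guardrails:
      - "guardrail:pii-masker"
    # Mask PII in the response, and fail the request on denied topics
    output_guardrails:
      - "guardrail:pii-masker"
      - "guardrail:denied-topics"
```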
Rules
Each rule has four sections:

- `id`: A unique identifier for the rule.
- `when`: Defines the subset of requests to which the rule applies. If set to an empty object (`{}`), the rule applies to all requests.
  - `subjects`: List of users, teams, or virtual accounts that originate the request. Specify as `user:email@example.com`, `team:team-name`, or `virtual-account:account-name`.
  - `models`: List of model identifiers to match. These should correspond to the model names used in the request’s `model` field.
  - `metadata`: Key-value pairs to match against request metadata. For example, specifying `metadata: { environment: "production" }` will apply the rule only if the request includes the header `X-TFY-METADATA` with `environment=production`.
- `input_guardrails`: An array of FQNs of guardrail integrations to apply to the input prompt.
- `output_guardrails`: An array of FQNs of guardrail integrations to apply to the LLM response.
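As a sketch, a `when` block combining all three matchers could look like the following (the model identifier and metadata values are hypothetical):

```yaml
when:
  # Match requests originated by any of these subjects
  subjects:
    - "user:email@example.com"
    - "team:team-name"
    - "virtual-account:account-name"
  # Match against the request's `model` field (hypothetical model name)
  models:
    - "openai-main/gpt-4o"
  # Match only requests sent with header X-TFY-METADATA: environment=production
  metadata:
    environment: "production"
```

A request must satisfy all of the matchers present in the `when` block for the rule to apply.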
How to get the FQN of guardrail integrations
You can get the FQN of a guardrail integration by navigating to the Guardrail tab on the AI Gateway and clicking the “Copy FQN” button next to the guardrail integration.