Budget Limiting
Cost budgeting is a critical feature for effectively managing LLM workloads. It allows organizations to set and enforce cost boundaries across teams, users, and accounts, which helps ensure operational efficiency and financial control.
- Prevent Runaway Costs: Protect against unexpected cost spikes due to code bugs, infinite loops, or high-volume usage. This safety mechanism prevents a single error from resulting in significant unplanned expenses.
- Enforce Budget Allocations: Set specific spending limits for teams, users, or virtual accounts. This ensures each group stays within their allocated budget and provides visibility into consumption patterns.
- Financial Planning: Make LLM costs predictable and manageable for business planning. With defined budgets, finance teams can forecast expenses with greater accuracy and avoid end-of-month surprises.
Configure Budget Limiting in TrueFoundry AI Gateway
Using the budget limiting feature, you can cap the cost of a specific set of requests at a given amount per day or per month. The budget limiting configuration is defined as a YAML file with the following fields:
- name: The name of the budget limiting configuration. It can be anything and is only used for reference in logs.
- type: This should be gateway-budget-config. It helps TrueFoundry identify that this is a budget limiting configuration file.
- rules: An array of rules.
Every request is evaluated against the full set of rules, and all matching rules are applied. If any matching rule has exceeded its limit, the request is rejected with an error.
For each rule, we have four sections:
- id: A unique identifier for the rule. Only used for reference in logs and metrics.
  - You can use dynamic values like {user} and {model}, which are replaced by the actual user or model in the request.
  - If you set the ID as {user}-daily-limit, the system creates a separate rule for each user (for example, alice-daily-limit, bob-daily-limit) and applies the limit individually to each one.
  - If you set the ID as just daily-limit (without placeholders), the rule applies collectively to the combined cost of requests from all users included in the when block.
- when (defines the subset of requests to which the rule applies): TrueFoundry AI Gateway provides a flexible configuration to define the exact subset of requests to which the rule applies. A rule can match on the user calling the model, the model name, or any custom metadata key present in the X-TFY-METADATA request header. The subjects, models, and metadata fields are combined with AND, meaning that the rule matches only if all the conditions are met. If an incoming request doesn't match the when block in one rule, the next rule is evaluated.
  - subjects: Filters based on the list of users, teams, or virtual accounts calling the model. A subject can be specified as user:john-doe, team:engineering-team, or virtual-account:acct_1234567890.
  - models: The rule matches if the model name in the request matches any of the models in the list.
  - metadata: The rule matches if the metadata in the request matches the metadata in the rule. For example, if we specify metadata: {environment: "production"}, the rule matches only if the X-TFY-METADATA request header contains the key environment with the value production.
- limit_to: An integer which, together with unit, specifies the limit (for example, 1000 dollars a month).
- unit: Possible values are cost_per_day and cost_per_month.
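As a rough illustration of how the four sections fit together, a single entry in the rules array might look like the sketch below. The user names, the model name, and the limit are placeholders chosen for the example, not values from a real deployment.

```yaml
# One entry in the rules array (illustrative values only).
- id: "{user}-daily-limit"      # quoted so YAML treats the braces as text;
                                # expands per user, e.g. alice-daily-limit
  when:
    # subjects, models, and metadata are combined with AND.
    subjects:
      - user:alice
      - user:bob
    models:
      - openai-main/gpt-4o      # hypothetical model name
    metadata:
      environment: production   # compared against the X-TFY-METADATA header
  limit_to: 50                  # 50 dollars ...
  unit: cost_per_day            # ... per day, tracked separately for each user
```

Because the id contains the {user} placeholder, alice and bob each get their own 50 dollar daily budget. With a plain id such as daily-limit, the two users would share a single budget.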
Sample Configuration
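The following is a minimal sketch of a complete configuration file, built only from the fields described above. The configuration name, rule ids, model name, and dollar amounts are illustrative. The sketch also assumes that a field left out of a when block places no restriction on the request; verify that behavior against your gateway version.

```yaml
name: budget-limiting-config        # free-form, used for reference in logs
type: gateway-budget-config
rules:
  # Cap one user's spend on one model at 1000 dollars per month.
  - id: john-doe-gpt-monthly
    when:
      subjects:
        - user:john-doe
      models:
        - openai-main/gpt-4o        # hypothetical model name
    limit_to: 1000
    unit: cost_per_month

  # One shared monthly budget for a whole team (no placeholder in the id).
  - id: engineering-team-monthly
    when:
      subjects:
        - team:engineering-team
    limit_to: 5000
    unit: cost_per_month

  # Daily cap on production traffic from a virtual account.
  - id: prod-virtual-account-daily
    when:
      subjects:
        - virtual-account:acct_1234567890
      metadata:
        environment: production     # read from the X-TFY-METADATA header
    limit_to: 200
    unit: cost_per_day
```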
How Budget Evaluation Works
- Each incoming request is evaluated against all rules.
- Rules are matched based on the when block. If multiple rules match, all applicable rules are enforced.
- If any matching rule has exceeded its limit, the request is rejected with an error.
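To make the evaluation concrete, consider this hypothetical fragment with two overlapping rules. The rule ids, user name, model name, and limits are invented for the example, and the fragment assumes that a field omitted from a when block does not restrict matching.

```yaml
rules:
  # Matches every request made by alice, whatever the model.
  - id: alice-monthly
    when:
      subjects:
        - user:alice
    limit_to: 500
    unit: cost_per_month

  # Matches every request to this model, whoever the caller is.
  - id: gpt-4o-daily
    when:
      models:
        - openai-main/gpt-4o    # hypothetical model name
    limit_to: 50
    unit: cost_per_day
```

A request from alice to openai-main/gpt-4o matches both rules, so both budgets are enforced: the request is rejected if either the 500 dollar monthly budget or the 50 dollar daily budget has been exceeded. A request from another user to the same model matches only gpt-4o-daily.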