Budget limiting helps you control spending on LLM workloads by setting cost boundaries and automatically blocking requests when limits are exceeded.
Why use budget limiting?
  • Prevent runaway costs from bugs, infinite loops, or unexpected high-volume usage
  • Enforce budget allocations for teams, users, or projects
  • Get alerts when usage approaches limits
  • Track spending with real-time usage monitoring

Configuration Structure

name: budget-limiting-config
type: gateway-budget-config
rules:
  - id: 'rule-1'
    when: { ... }
    limit_to: 100
    unit: cost_per_day
    budget_applies_per: ['user'] # Optional
    alerts: { ... } # Optional

How It Works

  1. Each request is evaluated against the first matching rule based on the when conditions
  2. If usage for the matching rule has exceeded its limit → Request is REJECTED
  3. If usage for the matching rule is within its limit → Request is ALLOWED
Budget periods are aligned to UTC. Usage resets to zero at UTC midnight for daily limits, at UTC midnight on Monday for weekly limits, and at UTC midnight on the 1st of the month for monthly limits.
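
The evaluation steps above can be sketched in a few lines of Python. This is an illustrative model only, not the gateway's implementation: the Rule class, the decide function, and the usage dictionary are hypothetical names. It also assumes that a request matching no rule is allowed, and that spend exactly equal to the limit still counts as within limits.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Rule:
    id: str
    matches: Callable[[dict], bool]  # stands in for the rule's `when` block
    limit_to: float                  # budget for the current period


def decide(rules: List[Rule], request: dict,
           usage: Dict[str, float]) -> Tuple[bool, Optional[str]]:
    """Return (allowed, rule_id) using first-match semantics."""
    for rule in rules:
        if rule.matches(request):
            spent = usage.get(rule.id, 0.0)
            # Only the first matching rule is consulted; later rules are ignored.
            return (spent <= rule.limit_to, rule.id)
    # Assumption: with no matching rule, no budget applies to the request.
    return (True, None)
```

Because only the first match is consulted, rule order in the configuration matters: more specific rules would normally be listed before broader ones.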

Rule Components

1. Rule ID (id)

Unique identifier for the rule. Used in logs, metrics, and API responses.

2. Rule Conditions (when)

Defines which requests the rule applies to. All conditions use AND logic. Available Filters:
  • subjects: Filter by users, teams, or virtual accounts
    subjects:
      - 'user:alice@example.com'
      - 'team:engineering'
      - 'virtualaccount:acct_1234567890'
    
  • models: Filter by model names
    models:
      - 'openai-main/gpt-4'
      - 'anthropic-main/claude-3'
    
  • metadata: Filter by custom metadata from X-TFY-METADATA header
    metadata:
      environment: 'production'
      project_id: 'proj-123'
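
To make the AND logic concrete, here is a minimal Python sketch of how a when block might be checked against a request. The function name and the request shape (subjects, model, metadata keys) are hypothetical. It assumes that within a single filter any listed value may match, and that an empty when block matches every request, which is how the examples later on this page use when: {}.

```python
def matches_when(when: dict, request: dict) -> bool:
    """Return True only if every filter present in `when` matches the request."""
    subjects = when.get("subjects")
    if subjects is not None:
        # Assumption: any one of the listed subjects is enough to match.
        if not set(subjects) & set(request.get("subjects", [])):
            return False

    models = when.get("models")
    if models is not None and request.get("model") not in models:
        return False

    # Every listed metadata key must be present with the expected value.
    for key, expected in when.get("metadata", {}).items():
        if request.get("metadata", {}).get(key) != expected:
            return False

    return True
```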
    

3. Budget Limit (limit_to + unit)

Defines the spending limit and time period. Supported Units:
  • cost_per_day - Daily budget limit
  • cost_per_week - Weekly budget limit
  • cost_per_month - Monthly budget limit
Budget Periods:
  • Daily: Resets at UTC midnight
  • Weekly: Resets on Monday at UTC midnight
  • Monthly: Resets on the 1st of each month at UTC midnight
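
If you need to work out client-side when the current budget period began, the reset rules above translate into a short calculation. The sketch below is an illustration; the function name period_start is hypothetical, and it expects a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def period_start(now: datetime, unit: str) -> datetime:
    """Return the UTC start of the budget period that contains `now`."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "cost_per_day":
        return midnight
    if unit == "cost_per_week":
        # datetime.weekday() returns 0 for Monday, so this steps back to Monday.
        return midnight - timedelta(days=midnight.weekday())
    if unit == "cost_per_month":
        return midnight.replace(day=1)
    raise ValueError(f"unsupported unit: {unit}")
```

For example, a request at 13:30 UTC on Wednesday 15 May 2024 falls in the daily period starting 15 May, the weekly period starting Monday 13 May, and the monthly period starting 1 May.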

4. Budget Applies Per (budget_applies_per) - Optional

Creates separate budget instances for each unique value of the specified entity. This allows you to set individual budget limits for each user, model, or other entity without creating separate rules. How it works:
  • Without budget_applies_per: One budget limit applies to all matching requests
    • Example: All users share a single $1000 daily budget
  • With budget_applies_per: ['user']: A separate budget is created for each user
    • Example: User Alice has a $500 daily budget, and User Bob has a separate $500 daily budget
  • With budget_applies_per: ['model']: A separate budget is created for each model
  • With budget_applies_per: ['metadata.project_id']: A separate budget is created for each project ID value
Allowed Values:
  • user - One budget per user
  • virtualaccount - One budget per virtual account
  • model - One budget per model
  • metadata.* - One budget per custom metadata value (replace * with your metadata key)
Example: If you set limit_to: 500 and budget_applies_per: ['user']:
  • User Alice has a $500 daily budget
  • User Bob has a separate $500 daily budget
  • Each user’s usage is tracked independently
budget_applies_per accepts at most one value per rule; multiple entities cannot be combined.
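
One way to picture budget_applies_per is as part of the key under which usage is accumulated. The Python sketch below is a conceptual model, not the gateway's storage scheme: budget_key and the request fields are hypothetical names chosen for illustration.

```python
def budget_key(rule_id: str, budget_applies_per: list, request: dict) -> tuple:
    """Return the bucket under which this request's cost would be tracked."""
    if not budget_applies_per:
        # No per-entity split: all matching requests share one budget.
        return (rule_id,)
    if len(budget_applies_per) > 1:
        raise ValueError("budget_applies_per accepts at most one value")

    entity = budget_applies_per[0]
    if entity.startswith("metadata."):
        # 'metadata.project_id' reads the 'project_id' metadata key.
        value = request.get("metadata", {}).get(entity[len("metadata."):])
    else:
        # 'user', 'virtualaccount', or 'model'
        value = request.get(entity)
    return (rule_id, entity, value)
```

With budget_applies_per: ['user'], requests from Alice and Bob produce different keys, so each accumulates spend against its own copy of limit_to.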

Budget Alerts

Budget alerts notify you when usage crosses specified thresholds, helping you stay informed and take action before limits are exceeded.

Alert Configuration

alerts:
  thresholds: [75, 90, 95, 100] # Select from available thresholds
  notification_target:
    - type: email
      notification_channel: 'my-email-channel'
      to_emails: ['admin@example.com']

Alert Thresholds

Select from the available percentage thresholds at which alerts should be triggered.
  • Available thresholds: 75, 90, 95, 100
  • One-time per period: Each threshold triggers once per budget period
[Screenshots: Add Budget Rule in AI Gateway; Configuring Alerts for Budget Rule in AI Gateway]
Threshold selection examples:
  • [75, 90, 100] - Early warning, critical, and limit reached
  • [90, 95, 100] - Focus on critical alerts only
  • [100] - Only alert when limit is reached

Notification Channels

Choose how you want to receive budget alerts:
Email Notifications:
notification_target:
  - type: email
    notification_channel: 'email-channel-name'
    to_emails: ['admin@example.com', 'finance@example.com']
Slack Webhook:
notification_target:
  - type: slack-webhook
    notification_channel: 'slack-webhook-channel-name'
Slack Bot:
notification_target:
  - type: slack-bot
    notification_channel: 'slack-bot-channel-name'
    channels: ['#engineering-alerts', '#finance-alerts']
Each rule supports only one notification channel. To send alerts to multiple channels, create separate rules.

Alert Behavior

  • Automatic checking: Alerts are checked every 20 minutes
  • Period-based: Each threshold alert is sent once per budget period
  • Reset on new period: When a new period starts (day/week/month), alerts reset and can be sent again
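
The once-per-period behavior can be modeled as a set of thresholds already sent in the current period. The following Python sketch is an illustration under that assumption; due_alerts and already_sent are hypothetical names, and the real checker may differ in detail.

```python
def due_alerts(usage: float, limit: float, thresholds: list, already_sent: set) -> list:
    """Return thresholds (in percent) crossed by current usage and not yet alerted."""
    percent = 100.0 * usage / limit
    return [t for t in sorted(thresholds) if percent >= t and t not in already_sent]


# Simulate periodic checks within one budget period.
sent = set()
first = due_alerts(80, 100, [75, 90, 100], sent)   # 80% used: only 75 is due
sent.update(first)
second = due_alerts(80, 100, [75, 90, 100], sent)  # nothing new since last check
third = due_alerts(100, 100, [75, 90, 100], sent)  # 90 and 100 are now due
```

When a new period starts, the set is cleared, so the same thresholds can fire again.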

Viewing Budget Usage

You can monitor budget usage directly on the budget configuration page. Each rule card displays:
  • Current usage amount and percentage
  • Budget limit and remaining budget
  • Period start time (when the current budget period began)
For rules with budget_applies_per, you can see a usage breakdown for each rule.
[Screenshot: Budget Usage Per Rule in AI Gateway]

Examples

Simple budget rules for specific users, teams, or virtual accounts.
name: basic-budget-config
type: gateway-budget-config
rules:
  # Daily limit for a specific user on a specific model
  - id: 'bob-gpt4-daily'
    when:
      subjects: ['user:bob@example.com']
      models: ['openai-main/gpt-4']
    limit_to: 50
    unit: cost_per_day

  # Monthly limit for a team across all models
  - id: 'backend-monthly'
    when:
      subjects: ['team:backend']
    limit_to: 2000
    unit: cost_per_month

  # Weekly limit for a virtual account
  - id: 'va1-weekly'
    when:
      subjects: ['virtualaccount:acct_1234567890']
    limit_to: 1000
    unit: cost_per_week

Automatically create separate budgets for each user, model, or metadata value.
name: per-entity-budgets
type: gateway-budget-config
rules:
  # Per-user daily budgets (automatically created for each user)
  - id: 'user-daily-budget'
    when: {}
    limit_to: 500
    unit: cost_per_day
    budget_applies_per: ['user']

  # Per-model weekly budgets (automatically created for each model)
  - id: 'model-weekly-budget'
    when: {}
    limit_to: 2000
    unit: cost_per_week
    budget_applies_per: ['model']

  # Per-virtual account weekly budgets (automatically created for each virtual account)
  - id: 'va-weekly-budget'
    when: {}
    limit_to: 1000
    unit: cost_per_week
    budget_applies_per: ['virtualaccount']

  # Per-project budgets using metadata
  - id: 'project-daily-budget'
    when: {}
    limit_to: 100
    unit: cost_per_day
    budget_applies_per: ['metadata.project_id']

When using budget_applies_per: ['metadata.project_id'], include the header X-TFY-METADATA: {"project_id": "proj-123"} in requests.
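
On the client side, the header value is a JSON object serialized to a string. A minimal Python sketch of building it follows; the helper name metadata_header is hypothetical, and how you attach the resulting headers depends on the HTTP client or SDK you use.

```python
import json


def metadata_header(metadata: dict) -> dict:
    """Build the X-TFY-METADATA header from a dict of custom metadata."""
    return {"X-TFY-METADATA": json.dumps(metadata)}


headers = metadata_header({"project_id": "proj-123"})
# headers["X-TFY-METADATA"] is the string '{"project_id": "proj-123"}'
```

Merge this dictionary into the headers of each gateway request so that usage can be attributed to the right project budget.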

Configure notifications when budget usage crosses thresholds.
name: budget-with-alerts
type: gateway-budget-config
rules:
  # Team budget with email alerts
  - id: 'team-monthly-budget'
    when:
      subjects: ['team:engineering']
    limit_to: 5000
    unit: cost_per_month
    alerts:
      thresholds: [75, 90, 100]
      notification_target:
        - type: email
          notification_channel: 'team-alerts-channel'
          to_emails: ['team-lead@example.com']

  # Per-user budget with Slack alerts
  - id: 'user-daily-budget'
    when: {}
    limit_to: 100
    unit: cost_per_day
    budget_applies_per: ['user']
    alerts:
      thresholds: [90, 95, 100]
      notification_target:
        - type: slack-bot
          notification_channel: 'budget-alerts-channel'
          channels: ['#engineering-alerts']

Advanced configurations combining multiple filters and features.
name: comprehensive-budget-config
type: gateway-budget-config
rules:
  # Specific user + model combination
  - id: 'bob-gpt4-daily'
    when:
      subjects: ['user:bob@example.com']
      models: ['openai-main/gpt-4']
    limit_to: 50
    unit: cost_per_day

  # Team-wide monthly limit with alerts
  - id: 'backend-team-monthly'
    when:
      subjects: ['team:backend']
    limit_to: 2000
    unit: cost_per_month
    alerts:
      thresholds: [75, 90, 100]
      notification_target:
        - type: email
          notification_channel: 'team-alerts'
          to_emails: ['backend-lead@example.com']

  # Per-user budgets (all models)
  - id: 'per-user-daily'
    when: {}
    limit_to: 500
    unit: cost_per_day
    budget_applies_per: ['user']

  # Per-model budgets (all users)
  - id: 'per-model-weekly'
    when: {}
    limit_to: 1000
    unit: cost_per_week
    budget_applies_per: ['model']

  # Project-based budgets with environment filter
  - id: 'project-daily'
    when:
      metadata:
        environment: 'production'
    limit_to: 200
    unit: cost_per_day
    budget_applies_per: ['metadata.project_id']
    alerts:
      thresholds: [90, 100]
      notification_target:
        - type: slack-webhook
          notification_channel: 'prod-alerts-channel'