Canary Testing
You can use TrueFoundry’s AI Gateway to canary test new models or prompts across different environments. This approach leverages load balancing techniques to achieve a different goal: testing new versions or configurations with a small portion of traffic before full deployment.
Example: Test aws/bedrock-llama3 on 10% of the traffic
Suppose you want to introduce aws/bedrock-llama3 into your system but are unsure of its impact. You can create a configuration specifically for testing Llama3 in the staging environment before deploying it to production.
The configuration specification would look like this:
```yaml
name: canary-test-config
type: gateway-load-balancing-config
# The rules are evaluated in order; once a request matches a rule,
# subsequent rules are not checked.
rules:
  # In the staging environment, send 90% of traffic to openai/gpt4
  # and 10% to aws/bedrock-llama3.
  - id: "canary-llama3-staging"
    when:
      metadata:
        env: staging
    load_balance_targets:
      - target: "openai/gpt4"
        weight: 90
      - target: "aws/bedrock-llama3"
        weight: 10
```
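Conceptually, the weights define a probability distribution over targets. The sketch below is a simplified illustration of weighted selection, not TrueFoundry's actual routing implementation:

```python
import random

def pick_target(targets: list[dict]) -> str:
    """Pick a load-balancing target in proportion to its weight.

    A simplified illustration; the gateway's real routing logic
    is internal to TrueFoundry.
    """
    names = [t["target"] for t in targets]
    weights = [t["weight"] for t in targets]
    return random.choices(names, weights=weights, k=1)[0]

targets = [
    {"target": "openai/gpt4", "weight": 90},
    {"target": "aws/bedrock-llama3", "weight": 10},
]
# Over many requests, ~90% resolve to openai/gpt4
# and ~10% to aws/bedrock-llama3.
print(pick_target(targets))
```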
In this configuration, we instruct the gateway to route 10% of staging traffic to the aws/bedrock-llama3 model. TrueFoundry handles all necessary request transformations, so you don't need to modify your existing code.
You can set this up as a rule in your load-balancing configuration, as shown above.
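From the client's perspective, nothing changes: you keep calling the gateway's OpenAI-compatible endpoint and tag requests with the metadata the rule matches on. The snippet below is a sketch; the gateway base URL is a placeholder, and the `X-TFY-METADATA` header name is an assumption, so check your TrueFoundry deployment's docs for the exact mechanism it uses to attach metadata.

```python
from openai import OpenAI

# Point the standard OpenAI client at the gateway.
# Base URL and API key are placeholders for your deployment.
client = OpenAI(
    base_url="https://<your-gateway-host>/api/llm",
    api_key="<truefoundry-api-key>",
)

response = client.chat.completions.create(
    model="openai/gpt4",  # the gateway may reroute this per the canary rule
    messages=[{"role": "user", "content": "Hello!"}],
    # Assumed metadata header: this is how the `env: staging` rule
    # in the config above would match the request.
    extra_headers={"X-TFY-METADATA": '{"env": "staging"}'},
)
print(response.choices[0].message.content)
```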
Once the data begins flowing, you can use TrueFoundry’s analytics dashboards to monitor the impact of the new model on metrics such as cost, latency, errors, and feedback.