Cross-Region Inference

AWS Bedrock Cross-Region Inference lets you manage traffic during on-demand inference by using compute across multiple regions. When setting the model ID in TrueFoundry, use the Inference Profile ID instead of the base model ID.

You can find more information about cross-region inference here.

Use the Inference Profile ID as the model ID when adding a model to the TFY AI Gateway
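As a rough illustration of the difference between a base model ID and an Inference Profile ID, the sketch below assumes that a system-defined profile ID is a geography prefix followed by the base model ID, as in us.anthropic.claude-sonnet-4-20250514-v1:0. The prefix list and both helper names are hypothetical and only for illustration; check the AWS Bedrock documentation for the profiles that actually exist.

```python
# Geography prefixes assumed for illustration; this list is not exhaustive
# and should be checked against the AWS Bedrock documentation.
GEO_PREFIXES = ("us", "eu", "apac")


def is_inference_profile_id(model_id: str) -> bool:
    """Return True if model_id looks like '<geo>.<provider>.<model>'."""
    prefix, _, rest = model_id.partition(".")
    return prefix in GEO_PREFIXES and "." in rest


def to_inference_profile_id(geo: str, base_model_id: str) -> str:
    """Prepend a geography prefix to a base model ID (hypothetical helper)."""
    if geo not in GEO_PREFIXES:
        raise ValueError(f"unknown geography prefix: {geo!r}")
    return f"{geo}.{base_model_id}"


base_id = "anthropic.claude-sonnet-4-20250514-v1:0"
profile_id = to_inference_profile_id("us", base_id)
print(profile_id)
print(is_inference_profile_id(base_id), is_inference_profile_id(profile_id))
```

A check like this can catch the common mistake of pasting the base model ID where the Inference Profile ID is expected.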

How Cross-Region Inference Works

AWS Bedrock supports cross-Region inference using system-defined inference profiles, which help manage unexpected traffic spikes by distributing inference workloads across multiple AWS Regions. This lets you maintain availability and performance by leveraging compute resources beyond the region where the request originated.

For example

Cross-Region Inference

As the example above shows, the model us.anthropic.claude-sonnet-4-20250514-v1:0 is available in the us-west-2, us-east-1, and us-east-2 regions. When you make a request to this model from any of these regions, AWS Bedrock can automatically reroute the request to any of the other destination regions based on factors like:

Current load and capacity in each region
Network latency and performance
Regional availability and health

This automatic routing helps ensure optimal performance and reliability by distributing traffic across multiple regions, even if the original request region is experiencing high load or other issues.
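The relationship between a source region and its possible destinations can be sketched as a simple lookup. The mapping below contains only the regions named in the example above. The function name is hypothetical, and the ordering it returns is arbitrary: the actual routing decision is made inside AWS Bedrock and is not reproduced here.

```python
# Destination regions for the example profile, taken from the text above.
PROFILE_REGIONS = {
    "us.anthropic.claude-sonnet-4-20250514-v1:0": (
        "us-west-2",
        "us-east-1",
        "us-east-2",
    ),
}


def candidate_regions(profile_id: str, source_region: str) -> list[str]:
    """List the regions that may serve a request sent from source_region.

    The source region comes first, followed by the other destinations.
    This ordering is illustrative only; Bedrock picks the region itself.
    """
    regions = PROFILE_REGIONS[profile_id]
    if source_region not in regions:
        raise ValueError(f"{source_region} is not a source region for {profile_id}")
    return [source_region] + [r for r in regions if r != source_region]


print(candidate_regions("us.anthropic.claude-sonnet-4-20250514-v1:0", "us-east-1"))
```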

To learn more about the regions AWS Bedrock supports for each model, you can check here.

Access Control

When using system-defined inference profiles for cross-Region routing, you must ensure proper access control configuration:

Grant model access permissions in all destination Regions where traffic may be routed for inference
Access permissions in just the source Region are not sufficient

This is important because AWS Bedrock may dynamically route requests to any of the configured destination Regions based on factors like load and availability. Without proper permissions in all potential destination Regions, some inference requests may fail even if the source Region permissions are correctly set up.
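The requirement above amounts to a set comparison: every destination region must appear in the list of regions where model access has been granted. The sketch below assumes you already have both lists, for example from the console. It does not call any AWS API, and the function name is hypothetical.

```python
def regions_missing_access(destination_regions, regions_with_access):
    """Return destination regions where model access has not been granted."""
    granted = set(regions_with_access)
    return [region for region in destination_regions if region not in granted]


destinations = ["us-west-2", "us-east-1", "us-east-2"]

# Access granted only in the source region: not sufficient.
missing = regions_missing_access(destinations, ["us-east-1"])
print(missing)

# Access granted in every destination region: nothing is missing.
print(regions_missing_access(destinations, destinations))
```

An empty result means requests routed to any destination region should not fail because of missing model access.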

To grant access to the model in a region, you need to add the model in that region. For example, the model access page for the eu-central-1 region is at https://eu-central-1.console.aws.amazon.com/bedrock/home?region=eu-central-1#/modelaccess
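If you need to visit the model access page in several destination regions, the URL can be built from the region name. This sketch assumes the URL pattern shown for eu-central-1 holds for other regions as well; the function name is hypothetical.

```python
def model_access_url(region: str) -> str:
    """Build the Bedrock model access console URL for a region.

    Assumes the pattern seen for eu-central-1 applies to other regions too.
    """
    return (
        f"https://{region}.console.aws.amazon.com/bedrock/home"
        f"?region={region}#/modelaccess"
    )


for region in ("us-west-2", "us-east-1", "us-east-2"):
    print(model_access_url(region))
```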
