Introduction
Current LLMs can reason, code, and generate with incredible fluency in isolated scenarios. However, they don't know your customers, your inventory, your workflows, or your data. To realize the full potential of AI agents, they need access to relevant contextual data from diverse sources like personal files, knowledge repositories, and external tools. Furthermore, these agents must be capable of acting on that context, such as modifying documents or sending emails.
Previously, integrating AI models with these varied external resources was a complex and inefficient process. Developers often relied on bespoke code or specialized plugins tailored to each specific data source or API, resulting in fragile and unscalable integrations.
Anthropic introduced the Model Context Protocol (MCP), an open standard intended to connect AI assistants to a broader ecosystem of data and tools. Since its announcement in November 2024, MCP has gained significant traction, and many software vendors now publish their own MCP servers.
What is MCP and how does it work?
MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools. (From the Anthropic docs)
MCP Servers are programs that expose data and capabilities to LLMs via the MCP protocol.
For example, a Slack MCP server might expose capabilities such as: send a message to a channel, search for messages in a channel, get the list of channels, get the list of users, and more.
A GitHub MCP server might expose capabilities such as: get the list of repositories, get the list of issues, get the list of pull requests, create a pull request on a repository, and more.
If an LLM is given access to both the Slack and GitHub MCP servers, we can create agents very easily just by prompting the LLM: "Get open pull requests on my repository test-repo and send me a Slack message with the list of pull requests."
If we provide the GitHub and Slack MCP servers to the MCP client, it will automatically call the model, collect the tool calls, and execute them to finally achieve the goal of sending me the Slack message with the list of pull requests.
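The loop described above can be sketched in plain Python. This is a simplified, stdlib-only illustration of the agentic loop an MCP client runs, not Truefoundry's implementation: the function names, message shapes, and the scripted stand-in for the LLM are all hypothetical.

```python
# Hedged sketch of the agentic loop an MCP client runs (all names hypothetical):
# 1. send the prompt plus the tool schemas exposed by the MCP servers,
# 2. execute each tool call the model returns,
# 3. feed results back until the model answers without requesting tools.

def run_agent(prompt, call_model, tools):
    """call_model(messages, tools) -> {"tool_calls": [...]} or {"content": str}."""
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = call_model(messages, tools)
        if not reply.get("tool_calls"):
            return reply["content"]
        messages.append({"role": "assistant", **reply})
        for call in reply["tool_calls"]:
            result = tools[call["name"]](**call["arguments"])
            messages.append({"role": "tool", "name": call["name"], "content": str(result)})

# Toy stand-ins for tools a GitHub and a Slack MCP server might expose.
tools = {
    "list_open_prs": lambda repo: ["#12 fix login", "#15 add tests"],
    "send_slack_message": lambda channel, text: f"sent to {channel}: {text}",
}

def fake_model(messages, tools):
    # A real LLM decides this; here we script two tool calls, then a final answer.
    if len(messages) == 1:
        return {"tool_calls": [{"name": "list_open_prs", "arguments": {"repo": "test-repo"}}]}
    if len(messages) == 3:
        prs = messages[-1]["content"]
        return {"tool_calls": [{"name": "send_slack_message",
                                "arguments": {"channel": "#me", "text": prs}}]}
    return {"content": "Done: PR list sent to Slack."}

print(run_agent("Get open PRs on test-repo and Slack me the list", fake_model, tools))
```

The key design point is that the client, not the model, executes the tools: the model only emits structured tool-call requests, and the loop terminates when a reply contains no tool calls.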
You can also create your own MCP servers to expose your data and internal APIs as tools to LLMs.
Democratizing MCP Access and AI agentic workflows
The above concept of MCP servers enables a powerful system in which everyone inside an organization can build agents quickly and explore a wide range of automations. However, to enable a system like this, we need to solve the following problems:
Central Registry of MCP servers
We need a registry where all the MCP servers in an organization are registered and can be discovered. This requires us either to deploy the MCP servers ourselves or to refer to an already-hosted MCP server, like the one by Atlassian.
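A minimal sketch of what such a registry might hold, purely illustrative: the entry fields, class names, and endpoint URLs below are assumptions, not Truefoundry's data model.

```python
# Hedged sketch of an organization-wide MCP server registry (all names illustrative).
from dataclasses import dataclass, field

@dataclass
class MCPServerEntry:
    name: str
    endpoint: str           # URL of a self-deployed or externally hosted server
    auth: str = "oauth2"    # how clients authenticate to it
    tags: list = field(default_factory=list)

class MCPRegistry:
    def __init__(self):
        self._servers = {}

    def register(self, entry: MCPServerEntry):
        self._servers[entry.name] = entry

    def discover(self, tag=None):
        # List all servers, optionally filtered by tag.
        return [s for s in self._servers.values() if tag is None or tag in s.tags]

registry = MCPRegistry()
registry.register(MCPServerEntry("slack", "https://mcp.example.com/slack", tags=["chat"]))
registry.register(MCPServerEntry("jira", "https://mcp.example.com/jira", tags=["tickets"]))
print([s.name for s in registry.discover(tag="chat")])  # → ['slack']
```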
Access Control
We should be able to define which teams, users, or applications can access which MCP servers or tools. This is crucial since some MCP servers sit on sensitive data and should be accessible to only a small set of applications.
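As a sketch under assumed structure, an access-control check can be as simple as an allow-list of teams per MCP server; a real gateway would resolve a user's teams from an identity provider. The server names and policy shape here are hypothetical.

```python
# Hedged sketch of per-server access control: an allow-list of teams per MCP server.
ACL = {
    "slack-mcp": {"engineering", "support"},
    "payments-mcp": {"payments-team"},   # sensitive data: deliberately narrow access
}

def can_access(user_teams: set, server: str) -> bool:
    # Grant access if the user belongs to any team on the server's allow-list.
    allowed = ACL.get(server, set())
    return bool(user_teams & allowed)

assert can_access({"engineering"}, "slack-mcp")
assert not can_access({"engineering"}, "payments-mcp")
```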
Authentication / Authorization

Users and applications need a standard way to authenticate to each MCP server, for example via user-specific OAuth2 flows with securely stored and refreshed access tokens, so that every request carries the identity and permissions of the caller.
Agent Registry
With the MCP server registry in place, it becomes easy for anyone to build agents just by entering a prompt and selecting a set of tools. It's important for developers to be able to register their agents so that others can discover and use them.
Guardrails
It's important to put guardrails in place, like requiring user approval before executing certain tools in MCP servers. For example, any delete tool should require explicit user approval before executing. Some MCP servers shouldn't get access to sensitive data, so the data should pass through a PII filter before it reaches the MCP server.
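Both guardrails can be sketched as a wrapper around tool execution. This is an illustrative, stdlib-only sketch: the tool names, approval callback, and the regex-based PII filter (emails only) are assumptions, and a production filter would cover far more PII types.

```python
import re

# Hedged sketch of two guardrails: destructive tools require explicit user
# approval, and string arguments pass a PII filter before reaching the server.
REQUIRES_APPROVAL = {"delete_channel", "delete_repo"}
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_pii(text: str) -> str:
    # Toy PII filter: redact anything that looks like an email address.
    return EMAIL.sub("[REDACTED_EMAIL]", text)

def guarded_call(tool_name, args, execute, ask_user_approval):
    if tool_name in REQUIRES_APPROVAL and not ask_user_approval(tool_name):
        return {"status": "blocked", "reason": "user approval denied"}
    safe_args = {k: redact_pii(v) if isinstance(v, str) else v for k, v in args.items()}
    return {"status": "ok", "result": execute(tool_name, safe_args)}

result = guarded_call("send_message", {"text": "contact bob@corp.com"},
                      execute=lambda t, a: a["text"], ask_user_approval=lambda t: False)
print(result)  # the text reaches the tool with the email redacted
```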
Observability
Once we have the system in place, we will need detailed analytics on how the models and MCP servers are being used, the latency of each, and the cost incurred.
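One common way to collect such analytics is to wrap every model or tool call in an instrumentation decorator that records per-call metrics. The field names and cost figures below are illustrative assumptions, not Truefoundry's schema.

```python
import time

# Hedged sketch of per-call metrics a gateway could record:
# which model/MCP server was called, its latency, and the incurred cost.
metrics = []

def instrumented(target, fn, cost_per_call=0.0):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        metrics.append({
            "target": target,
            "latency_ms": (time.perf_counter() - start) * 1000,
            "cost_usd": cost_per_call,
        })
        return result
    return wrapper

# Wrap a toy websearch tool; every call now leaves a metrics record behind.
search = instrumented("websearch-mcp", lambda q: f"results for {q}", cost_per_call=0.002)
search("mcp registry")
print(metrics[0]["target"], metrics[0]["cost_usd"])
```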
Rate Limiting / Caching
Sometimes LLMs can enter loops and end up calling certain tools very frequently. Also, tools like web search can benefit a lot from caching. So it's important to have rate limiting and caching of tool responses in place.
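These two mechanisms can be sketched together: a sliding-window call budget per tool plus a TTL cache keyed on the tool name and arguments. The thresholds and class names are illustrative; a real gateway would enforce limits per user or per agent.

```python
import time

# Hedged sketch of a per-tool rate limit plus a TTL cache for tool responses:
# repeated identical calls hit the cache, and a runaway loop trips the budget.
class ToolGateway:
    def __init__(self, max_calls=5, window_s=60, cache_ttl_s=300):
        self.max_calls, self.window_s, self.cache_ttl_s = max_calls, window_s, cache_ttl_s
        self.calls = []     # timestamps of recent (non-cached) calls
        self.cache = {}     # (tool, args) -> (expiry, result)

    def call(self, tool_name, args, execute):
        key = (tool_name, tuple(sorted(args.items())))
        now = time.monotonic()
        if key in self.cache and self.cache[key][0] > now:
            return self.cache[key][1]          # cache hit: costs no rate-limit budget
        self.calls = [t for t in self.calls if now - t < self.window_s]
        if len(self.calls) >= self.max_calls:
            raise RuntimeError("rate limit exceeded")
        self.calls.append(now)
        result = execute(tool_name, args)
        self.cache[key] = (now + self.cache_ttl_s, result)
        return result

gw = ToolGateway(max_calls=2)
hits = [gw.call("websearch", {"q": "mcp"}, lambda t, a: "answer") for _ in range(10)]
print(hits[-1])  # all but the first call are served from the cache
```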
Introducing the Truefoundry MCP Registry in AI Gateway
The Truefoundry AI Gateway comes with an MCP registry, centralized authentication, and an MCP client built into the gateway that can orchestrate the agentic loop between the LLM and the MCP servers.
The key functionalities available are:
- Centralized MCP Registry: You can add public as well as self-hosted MCP servers, which are registered in the Truefoundry Control Plane. The Control Plane maintains the centralized registry of all the MCP servers and their authentication mechanisms. It handles user-specific OAuth2 flows, securely storing and refreshing access tokens and ensuring users can only access resources they are authorized for.
- Access Control: While registering an MCP server, you can specify the list of users/teams that have access to it.
- Unified Key to access all MCP servers: Any user can generate a single Personal Access Token (PAT) with which they can access all the models and MCP servers they are authorized for. You can also generate a Virtual Account Token (VAT) to give an application access to a specific set of MCP servers.
- Agent Playground: The Truefoundry AI Gateway provides a playground where users can experiment with prompts and different MCP server tools to build agents. Truefoundry ships with commonly used tools like web search, web scraping, document extraction, and code execution. The gateway includes an MCP client that orchestrates execution of the tools chosen by the LLM. The Gateway also streams the progress of the request back to the UI so that the user can see the LLM responses, tool calls, and tool responses.
- Use MCP Servers in Code: The Gateway also shows code snippets with which you can start using the MCP servers in your own code.