v0.60.5
2025-06-09

Enhanced views and filters in AI Gateway Metrics

You can now group metrics by Model, User, Team, or any custom metadata field using the View By option.

More updates:

  • New: Added support for a Google ADK code snippet in the AI Gateway Playground.
  • Support for JWT auth for non-exposed services as well.
  • You can now deploy workloads in nodepools with additional taints; our system automatically identifies the required tolerations and adds them to the workload. This unblocks custom use cases where you want to deploy a workload on a specific OS, such as Windows.
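As an illustration of what the platform now handles for you: when a nodepool carries a custom taint, the workload's pod spec needs a matching toleration. The key and value below are hypothetical; the platform derives the real ones from the nodepool.

```yaml
# Hypothetical nodepool taint: key=os, value=windows, effect=NoSchedule
# The platform auto-injects the matching toleration into the workload:
tolerations:
  - key: "os"
    operator: "Equal"
    value: "windows"
    effect: "NoSchedule"
```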
v0.59.0
2025-06-06

Revamp AI Gateway Models access & management

  • You can now add and manage models in AI Gateway > Models tab

  • Older models present in Integrations will continue to work, but will be migrated to the new tab.

  • You can now select the following permissions for users at the Provider Account level:

    • Manage Models - permission to add or edit models in the account
    • Use Models - permission to use the models in the account
  • For integrations like OpenAI, AWS Bedrock, Vertex, Azure OpenAI, etc., you only need to configure the API key at the account level, while for Azure Foundry you can still configure the auth and endpoint per model within the same account.

  • In Virtual Accounts, you can now grant access to a complete provider account as well as to individual models.

More updates:

  • New: Cost for AI Gateway models: we automatically populate publicly available costs for most models, while keeping it flexible for you to add your own custom values.
  • New: You can now switch between the Assistant and User roles while adding a message in the AI Gateway Playground. This will help you experiment with and save useful prompts.
  • New: We have introduced an Alerts tab for all applications, allowing you to set alerts and notification channels for your application. You can choose our recommended alert rules or create your own custom alert rules using Prometheus queries. We support Slack and Email as notification channels, with PagerDuty coming soon. Read more
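For example, a custom alert rule is driven by a Prometheus query that fires when it evaluates to true. The metric and label names below are placeholders, not actual platform metrics:

```promql
# Fire when a service's 5xx error rate exceeds 5% over 5 minutes
# (metric and label names are illustrative; use your own)
sum(rate(http_requests_total{app="my-service", status=~"5.."}[5m]))
  / sum(rate(http_requests_total{app="my-service"}[5m])) > 0.05
```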
v0.57.1
2025-05-28

Revamp sidebar

Revamped the navigation sidebar and replaced a few tabs. ML Repo is now referred to as Repository; Cluster and Integrations are part of Platform along with Git, Policy & Environment; and Virtual Accounts, Users, Teams, & PATs are part of Access along with Audit Logs.

More updates:

  • Improvements: Faster image pull times for model server images by parallelizing the pull.
  • Improvements: Redeployments are now idempotent; deploying a paused service will not resume it.
  • Deprecating: We are deprecating trigger_on_deploy in the Job manifest; it will be removed in a future release. To continue triggering a job while deploying, pass the --trigger-on-deploy flag to the tfy deploy command.
  • Insights & Recommendations: Storage cost savings with volume recommendations for Notebooks/SSH servers.
v0.56.0
2025-05-16

Enhanced Experience: Better UX for model selection and navigation in AI Gateway.

You can now list all the models available in your organization and manage them from a single place.

More updates:

  • New: Support for multimodal inputs added in AI Gateway. You can now upload images and videos to the AI Gateway and use them in your prompts. Read more
  • Bug fixes & Improvements: Fixed error handling for chat requests to return user-friendly error messages.
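Since the gateway follows OpenAI API conventions, a multimodal prompt can be sketched as an OpenAI-style chat payload. This is a minimal illustration, not the platform's official snippet; the model name, gateway path, and image URL are placeholders:

```python
import json

def build_multimodal_payload(model: str, prompt: str, image_url: str) -> dict:
    """Build an OpenAI-style chat payload mixing text and an image input."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

# Placeholder values; substitute your own model and image.
payload = build_multimodal_payload(
    model="openai-main/gpt-4o",
    prompt="What is in this image?",
    image_url="https://example.com/cat.png",
)
body = json.dumps(payload)  # POST this to <gateway-url>/v1/chat/completions
```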
v0.55.3
2025-05-13

New Feature: Easier setup of standard policies using pre-defined templates

We’ve introduced pre-defined templates for standard policies, making it easier to set up policies for common use cases. Read more

More updates:

  • New: Access AI Gateway using Langchain class. Syntax available in code snippet.
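A minimal sketch of what that looks like, assuming the langchain-openai package and an OpenAI-compatible gateway endpoint; the URL, key, and model name below are placeholders, and the exact snippet is available in the product's code-snippet panel:

```python
def make_gateway_llm(base_url: str, api_key: str, model: str):
    """Return a LangChain chat model pointed at the gateway's
    OpenAI-compatible endpoint. Requires `pip install langchain-openai`."""
    # Imported lazily so the sketch loads without the package installed.
    from langchain_openai import ChatOpenAI
    return ChatOpenAI(base_url=base_url, api_key=api_key, model=model)

# Usage (all values are placeholders; the path depends on your installation):
# llm = make_gateway_llm(
#     base_url="https://<gateway-host>/v1",
#     api_key="tfy-xxxx",
#     model="openai-main/gpt-4o",
# )
# llm.invoke("Hello!")
```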
v0.53.0
2025-05-06

New Feature: Support for Nvidia NIM LLM models and Nemo retriever

We’ve introduced support for deployment of Nvidia NIM LLM models and Nemo retriever.

More updates:

  • New: Access AI Gateway using Langchain class. Syntax available in code snippet.
v0.52.1
2025-05-02

New Feature: Latency Based Routing in AI Gateway

We’ve introduced a new feature in the AI Gateway that allows you to route requests based on latency. This is a great way to ensure that your requests are routed to the fastest model available, which can help improve the performance of your applications.

More updates:

  • New: Added support for Workflows and added username and previous manifest in the policy context. Read more
  • Deprecating: Changes to the Kustomize patch in the deployment spec. Read more
v0.51.0
2025-04-30

New Feature: Export LLM Playground Metrics to CSV

Good news! You can now export your LLM Playground metrics as a handy CSV file. It’s perfect for bringing into your favorite analytics tools and diving deeper into your data. Happy analyzing!

More updates:

  • Fixed bug: The temperature parameter now supports decimal values in the prompt schema
v0.50.1
2025-04-22

New Deployment Deletion Status

We’ve introduced a new status indicator to show when a deployment is in the process of being deleted. Once the deletion is complete, you’ll receive a notification confirming its success. If the deletion fails, the status will update to deletion_failed, along with a message explaining the reason for the failure.

More updates:

  • Support for non-OpenAI models from Azure on a per-API basis.
  • Removed the mandate to pass a “valid” model name in the request. You can now do fully header-based routing (you just need to comply with OpenAI standards).
  • Support for the Responses API [Beta]. You can now use codium in your terminal via the TrueFoundry LLM Gateway.
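To illustrate header-based routing, here is a stdlib-only sketch of an OpenAI-standard chat request to the gateway. The URL, key, and the extra header name are purely illustrative; consult the gateway docs for the actual routing headers:

```python
import json
import urllib.request
from typing import Optional

GATEWAY_URL = "https://<gateway-host>/v1/chat/completions"  # placeholder

def chat_request(prompt: str, api_key: str,
                 extra_headers: Optional[dict] = None) -> urllib.request.Request:
    """Build an OpenAI-standard chat request; routing hints ride in headers."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        **(extra_headers or {}),
    }
    body = json.dumps({
        # The model name no longer needs to be "valid" when routing via headers.
        "model": "routed-by-headers",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(GATEWAY_URL, data=body,
                                  headers=headers, method="POST")

req = chat_request("Hello!", api_key="tfy-xxxx",
                   extra_headers={"X-Example-Route": "my-provider/my-model"})
# urllib.request.urlopen(req) would send it; omitted here.
```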
v0.48.0
2025-04-17

Gateway Metrics Just Got a Big Upgrade! 🎉

We’ve given Gateway metrics a full makeover — and it’s better than ever! In addition to the metrics you already know and love, you can now explore brand new insights like Request Latency, Requests per Second (RPS), Time to First Token, and Inter-Token Latency. More visibility = more power!

But wait, there’s more! You now have dedicated views for User Activity and Model Usage, making it super easy to track input/output tokens, total requests, and total cost per user and model.

And yes — we’ve got your config metrics covered too! TrueFoundry Gateway continues to support settings like rate limits, fallbacks, and load balancing. Now, you can actually see which configs were applied to requests over time, complete with clear visuals and data.

In short: more clarity, better control, and a smoother experience! 🚀

v0.47.0
2025-04-10

Finetuning Just Got Smarter – Say Hello to DoRA!

We’re excited to share that finetuning now supports DoRA by default! DoRA is a big step up from LoRA — offering improved performance that gets you much closer to full finetuning, without the extra complexity.

It’s faster, smarter, and more efficient — what’s not to love?

👉 Learn more about DoRA and how to get started here.

LLM Deployment Just Got a Boost with TRT-LLM + NVIDIA Triton!

Great news — you can now deploy LLM models using the TRT-LLM backend with the NVIDIA Triton Inference Server!

We’ve got you covered with prebuilt engines for Llama models, so you can get started right away. Have another model in mind? No problem — you can also request custom engines for any model of your choice!

Faster inference, smoother deployment — let’s go! 💥

Improved Retry Information in Workflows

We’ve given the info panel a fresh new look — for both map and non-map tasks! Now, whenever a task gets retried, you can simply expand the row to dive into all the details. See the retry history, track the duration (for non-map tasks), and easily visualize logs, events, and metrics for each retry attempt.

It’s all right there, just a click away—cleaner, clearer, and way more insightful!

v0.41.0
2025-03-21

Auto-pilot insights on cost savings and reliability fixes

Exciting update! 🎉 We’ve added two new metrics to help you easily spot the amazing impact of autopilot! Head over to the audit logs page, switch to the autopilot filter, and you’ll be able to see the total cost saved and the number of reliability issues fixed by autopilot in your chosen time range. How cool is that?

Want to enable autopilot too? It’s super simple! Just go to your cluster details view to enable autopilot for all deployments within the cluster, or visit the specific deployment page to turn it on for individual deployments. Enjoy the benefits! 😊🚀

And here’s a fun tidbit—on our internal dev setup, autopilot saved us an amazing $2590 and fixed 470 reliability issues!

Other Features, Fixes & Improvements:

  • Search functionality added to all integration select dropdowns
  • Moved the README tab to the beginning of the SSH server details page for improved discoverability
  • Minor Bug fixes and UI/UX improvements
v0.40.0
2025-03-17

Setup Custom OAuth for Service Deployment

Great news! 🎉 TrueFoundry now makes it super easy to integrate custom JWT authentication, giving you the ability to set up JWT-based authentication using JWKS. Once you’ve added your custom integration, you’ll be able to use it right in your deployment settings for endpoint login and authentication. Plus, it supports a variety of providers like Google, Okta, AzureAD, Cognito, and many more — including TrueFoundry! 😊

Filter deployments by Auto-pilot Recommendations

This was a highly requested feature. You can now filter all deployments based on whether they have any auto-pilot recommendations.

Support for fractional GPUs in On-Prem clusters

This one’s for our enterprise clients: if you set up on-prem clusters, you can now enjoy the benefits of fractional GPUs!

Other Features, Fixes & Improvements:

  • Added new menu items to simplify creating and updating model versions
  • Minor Bug fixes and UI/UX improvements
v0.38.0
2025-03-10

Audit Logs

As an admin, you can now monitor all user and virtual account activity across the platform. You’ll be able to view the specific actions performed, such as “Create Deployment” or “Start Notebook,” along with detailed metadata, including the associated resource and the user who triggered each action.

Other Features, Fixes & Improvements:

  • Workflow node enhancement for map nodes – You can now view the status of underlying map tasks directly within the map node, without needing to open the detailed view.

  • On LLM Gateway v0.27.0, the rate limit config now supports virtualaccount instead of serviceaccount under subjects. To reduce risk during the upgrade:

    • Before the upgrade, add new subjects with virtualaccount wherever serviceaccount is used.
    • Complete the upgrade.
    • After the upgrade, remove the serviceaccount subjects.
  • Minor Bug fixes and UI/UX improvements
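A sketch of what such a config looks like during the upgrade window; the rule structure and subject names are illustrative, and only the serviceaccount-to-virtualaccount swap is the point:

```yaml
# Illustrative rate-limit rule shape; match your existing config.
rules:
  - id: "llm-rate-limit"
    when:
      subjects:
        - "serviceaccount:ci-bot"   # old form; remove after the upgrade
        - "virtualaccount:ci-bot"   # new form, supported from v0.27.0
```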

v0.35.0
2025-02-24

Automated Upgrades for Infrastructure Add-ons

Staying up to date just got a whole lot easier. We’re rolling out automated upgrades for infrastructure add-on components like tfy-agent, Argo, and Istio, ensuring your stack is always running the latest compatible versions — without the manual hassle. Whenever a new stable release drops, we’ll handle the upgrade seamlessly, so you can focus on shipping instead of maintaining.

Of course, we’re not just throwing updates over the fence. Our system checks for compatibility before rolling anything out, minimising disruptions and keeping things smooth. You get the latest features and security patches without lifting a finger. Set it, forget it, and let your infra take care of itself.

To enable this for your cluster, simply visit the cluster drawer and enable “Auto-pilot” for add-ons.

Other Features, Fixes & Improvements:

  • Enhancements to autopilot: Added recommendations for CPU throttling, capacity type selection based on environment, and a volume size fix for Prometheus/Loki.
  • LLM Gateway: Introduced support for real-time models.
  • Minor Bug fixes and UI/UX improvements
v0.32.0
2025-02-14

New Features & Improvements:

  • Added functionality to edit workflow configuration from the UI
  • Support for reranker models in the playground
  • Pagination support in Teams
  • Added a version field to the artifact manifest

Bug fixes:

  • Minor Bug fixes and UI/UX improvements