v0.60.5
2025-06-09

Enhanced views and filters in AI Gateway Metrics

You can now group metrics by Model, User, Team, or any custom metadata field using the View By option.

More updates:

  • New: Added support for a Google ADK code snippet in the AI Gateway Playground.
  • Support for JWT auth for non-exposed services as well.
  • You can now deploy workloads in nodepools with additional taints; our system automatically identifies the required tolerations and adds them to the workload. This unblocks custom use cases where you want to deploy a workload on a specific OS, such as Windows.
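As an illustration of what the platform now handles for you: when a nodepool carries a custom taint, the workload's pod spec needs a matching toleration. The key and value below are hypothetical; the platform derives the real ones from the nodepool.

```yaml
# Hypothetical nodepool taint: key=os, value=windows, effect=NoSchedule
# The platform auto-injects the matching toleration into the workload:
tolerations:
  - key: "os"
    operator: "Equal"
    value: "windows"
    effect: "NoSchedule"
```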
v0.59.0
2025-06-06

Revamp AI Gateway Models access & management

  • You can now add and manage models in AI Gateway > Models tab

  • Older models present in Integrations will continue to work, but will be migrated to the new tab.

  • You can now select the following permissions for users at the Provider Account level:

    • Manage Models - permission to add or edit models in the account
    • Use Models - permission to use the models in the account
  • For integrations like OpenAI, AWS Bedrock, Vertex, Azure OpenAI, etc., you only need to configure the API key at the account level, while for Azure Foundry you can still configure the auth and endpoint per model within the same account.

  • In Virtual Accounts, you can now grant access to a complete provider account as well as to individual models.

More updates:

  • New: Cost for AI Gateway models: we automatically populate publicly available costs for most models, while keeping it flexible for you to add your own custom values.
  • New: You can now switch between the Assistant and User roles while adding a message in the AI Gateway Playground. This will help you experiment with and save useful prompts.
  • New: We have introduced an Alerts tab for all applications, allowing you to set alerts and notification channels for your application. You can choose our recommended alert rules or create your own custom alert rules using Prometheus queries. We support Slack and Email as notification channels, with PagerDuty coming soon. Read more
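For example, a custom alert rule is driven by a Prometheus query that fires when it evaluates to true. The metric and label names below are placeholders, not actual platform metrics:

```promql
# Fire when a service's 5xx error rate exceeds 5% over 5 minutes
# (metric and label names are illustrative; use your own)
sum(rate(http_requests_total{app="my-service", status=~"5.."}[5m]))
  / sum(rate(http_requests_total{app="my-service"}[5m])) > 0.05
```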
v0.57.1
2025-05-28

Revamp sidebar

Revamped the navigation sidebar and replaced a few tabs. ML Repo is now referred to as Repository; Cluster and Integrations are part of Platform along with Git, Policy & Environment; and Virtual Accounts, Users, Teams, & PATs are part of Access along with Audit Logs.

More updates:

  • Improvements: Faster image pull times for model server images by parallelizing the pull.
  • Improvements: Redeployments are now idempotent; deploying a paused service will not resume it.
  • Deprecating: We are deprecating trigger_on_deploy in the Job manifest; it will be removed in a future release. To continue triggering a job while deploying, pass the --trigger-on-deploy flag to the tfy deploy command.
  • Insights & Recommendations: Storage cost savings with volume recommendations for Notebooks/SSH servers.
v0.56.0
2025-05-16

Enhanced Experience: Better UX for model selection and navigation in AI Gateway.

You can now list all the models available in your organization and manage them from a single place.

More updates:

  • New: Support for multimodal inputs added in AI Gateway. You can now upload images and videos to the AI Gateway and use them in your prompts. Read more
  • Bug fixes & Improvements: Fixed error handling for chat requests to return user-friendly error messages.
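Since the gateway follows OpenAI API conventions, a multimodal prompt can be sketched as an OpenAI-style chat payload. This is a minimal illustration, not the platform's official snippet; the model name, gateway path, and image URL are placeholders:

```python
import json

def build_multimodal_payload(model: str, prompt: str, image_url: str) -> dict:
    """Build an OpenAI-style chat payload mixing text and an image input."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

# Placeholder values; substitute your own model and image.
payload = build_multimodal_payload(
    model="openai-main/gpt-4o",
    prompt="What is in this image?",
    image_url="https://example.com/cat.png",
)
body = json.dumps(payload)  # POST this to <gateway-url>/v1/chat/completions
```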
v0.55.3
2025-05-13

New Feature: Easier setup of standard policies using pre-defined templates

We’ve introduced pre-defined templates for standard policies, making it easier to set up policies for common use cases. Read more

More updates:

  • New: Access AI Gateway using Langchain class. Syntax available in code snippet.
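A minimal sketch of what that looks like, assuming the langchain-openai package and an OpenAI-compatible gateway endpoint; the URL, key, and model name below are placeholders, and the exact snippet is available in the product's code-snippet panel:

```python
def make_gateway_llm(base_url: str, api_key: str, model: str):
    """Return a LangChain chat model pointed at the gateway's
    OpenAI-compatible endpoint. Requires `pip install langchain-openai`."""
    # Imported lazily so the sketch loads without the package installed.
    from langchain_openai import ChatOpenAI
    return ChatOpenAI(base_url=base_url, api_key=api_key, model=model)

# Usage (all values are placeholders; the path depends on your installation):
# llm = make_gateway_llm(
#     base_url="https://<gateway-host>/v1",
#     api_key="tfy-xxxx",
#     model="openai-main/gpt-4o",
# )
# llm.invoke("Hello!")
```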
v0.53.0
2025-05-06

New Feature: Support for Nvidia NIM LLM models and Nemo retriever

We’ve introduced support for deployment of Nvidia NIM LLM models and Nemo retriever.

More updates:

  • New: Access AI Gateway using Langchain class. Syntax available in code snippet.
v0.52.1
2025-05-02

New Feature: Latency Based Routing in AI Gateway

We’ve introduced a new feature in the AI Gateway that allows you to route requests based on latency. This is a great way to ensure that your requests are routed to the fastest model available, which can help improve the performance of your applications.

More updates:

  • New: Added support for Workflows and added username and previous manifest in the policy context. Read more
  • Deprecating: Changes to the Kustomize patch in the deployment spec. Read more
v0.51.0
2025-04-30

New Feature: Export LLM Playground Metrics to CSV

Good news! You can now export your LLM Playground metrics as a handy CSV file. It’s perfect for bringing into your favorite analytics tools and diving deeper into your data. Happy analyzing!

More updates:

  • Fixed bug: The temperature parameter now supports decimal values in the prompt schema
v0.50.1
2025-04-22

New Deployment Deletion Status

We’ve introduced a new status indicator to show when a deployment is in the process of being deleted. Once the deletion is complete, you’ll receive a notification confirming its success. If the deletion fails, the status will update to deletion_failed, along with a message explaining the reason for the failure.

More updates:

  • Support for non-OpenAI models from Azure on a per-API basis.
  • Removed the mandate to pass a “valid” model name in the request. You can now do fully header-based routing (you just need to comply with OpenAI standards).
  • Support for the Responses API [Beta]. You can now use codium in your terminal via the TrueFoundry LLM Gateway.
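To illustrate header-based routing, here is a stdlib-only sketch of an OpenAI-standard chat request to the gateway. The URL, key, and the extra header name are purely illustrative; consult the gateway docs for the actual routing headers:

```python
import json
import urllib.request
from typing import Optional

GATEWAY_URL = "https://<gateway-host>/v1/chat/completions"  # placeholder

def chat_request(prompt: str, api_key: str,
                 extra_headers: Optional[dict] = None) -> urllib.request.Request:
    """Build an OpenAI-standard chat request; routing hints ride in headers."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        **(extra_headers or {}),
    }
    body = json.dumps({
        # The model name no longer needs to be "valid" when routing via headers.
        "model": "routed-by-headers",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(GATEWAY_URL, data=body,
                                  headers=headers, method="POST")

req = chat_request("Hello!", api_key="tfy-xxxx",
                   extra_headers={"X-Example-Route": "my-provider/my-model"})
# urllib.request.urlopen(req) would send it; omitted here.
```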
v0.48.0
2025-04-17

Gateway Metrics Just Got a Big Upgrade! 🎉

We’ve given Gateway metrics a full makeover — and it’s better than ever! In addition to the metrics you already know and love, you can now explore brand new insights like Request Latency, Requests per Second (RPS), Time to First Token, and Inter-Token Latency. More visibility = more power!

But wait, there’s more! You now have dedicated views for User Activity and Model Usage, making it super easy to track input/output tokens, total requests, and total cost per user and model.

And yes — we’ve got your config metrics covered too! TrueFoundry Gateway continues to support settings like rate limits, fallbacks, and load balancing. Now, you can actually see which configs were applied to requests over time, complete with clear visuals and data.

In short: more clarity, better control, and a smoother experience! 🚀

v0.47.0
2025-04-10

Finetuning Just Got Smarter – Say Hello to DoRA!

We’re excited to share that finetuning now supports DoRA by default! DoRA is a big step up from LoRA — offering improved performance that gets you much closer to full finetuning, without the extra complexity.

It’s faster, smarter, and more efficient — what’s not to love?

👉 Learn more about DoRA and how to get started here.

LLM Deployment Just Got a Boost with TRT-LLM + NVIDIA Triton!

Great news — you can now deploy LLM models using the TRT-LLM backend with the NVIDIA Triton Inference Server!

We’ve got you covered with prebuilt engines for Llama models, so you can get started right away. Have another model in mind? No problem — you can also request custom engines for any model of your choice!

Faster inference, smoother deployment — let’s go! 💥

Improved Retry Information in Workflows

We’ve given the info panel a fresh new look — for both map and non-map tasks! Now, whenever a task gets retried, you can simply expand the row to dive into all the details. See the retry history, track the duration (for non-map tasks), and easily visualize logs, events, and metrics for each retry attempt.

It’s all right there, just a click away—cleaner, clearer, and way more insightful!

v0.41.0
2025-03-21

Auto-pilot insights on cost savings and reliability fixes

Exciting update! 🎉 We’ve added two new metrics to help you easily spot the amazing impact of autopilot! Head over to the audit logs page, switch to the autopilot filter, and you’ll be able to see the total cost saved and the number of reliability issues fixed by autopilot in your chosen time range. How cool is that?

Want to enable autopilot too? It’s super simple! Just go to your cluster details view to enable autopilot for all deployments within the cluster, or visit the specific deployment page to turn it on for individual deployments. Enjoy the benefits! 😊🚀

And here’s a fun tidbit—on our internal dev setup, autopilot saved us an amazing $2590 and fixed 470 reliability issues!

Other Features, Fixes & Improvements:

  • Search functionality added to all integration select dropdowns
  • Moved the README tab to the beginning of the SSH server details page for improved discoverability
  • Minor Bug fixes and UI/UX improvements
v0.40.0
2025-03-17

Setup Custom OAuth for Service Deployment

Great news! 🎉 TrueFoundry now makes it super easy to integrate custom JWT authentication, giving you the ability to set up JWT-based authentication using JWKS. Once you’ve added your custom integration, you’ll be able to use it right in your deployment settings for endpoint login and authentication. Plus, it supports a variety of providers like Google, Okta, AzureAD, Cognito, and many more — including TrueFoundry! 😊

Filter deployments by Auto-pilot Recommendations

This was a highly requested feature. You can now filter all deployments based on whether they have any auto-pilot recommendations.

Support for fractional GPUs in On-Prem clusters

This one’s for our enterprise clients: if you set up on-prem clusters, you can now enjoy the benefits of fractional GPUs!

Other Features, Fixes & Improvements:

  • Added new menu items to simplify creating and updating model versions
  • Minor Bug fixes and UI/UX improvements
v0.38.0
2025-03-10

Audit Logs

As an admin, you can now monitor all user and virtual account activity across the platform. You’ll be able to view the specific actions performed, such as “Create Deployment” or “Start Notebook,” along with detailed metadata, including the associated resource and the user who triggered each action.

Other Features, Fixes & Improvements:

  • Workflow node enhancement for map nodes – You can now view the status of underlying map tasks directly within the map node, without needing to open the detailed view.

  • On LLM Gateway v0.27.0, the rate limit config now supports virtualaccount instead of serviceaccount under subjects. To reduce risk during the upgrade:

    • Before the upgrade, add new subjects with virtualaccount wherever serviceaccount is used.
    • Complete the upgrade.
    • After the upgrade, remove the serviceaccount subjects.
  • Minor Bug fixes and UI/UX improvements
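A sketch of what such a config looks like during the upgrade window; the rule structure and subject names are illustrative, and only the serviceaccount-to-virtualaccount swap is the point:

```yaml
# Illustrative rate-limit rule shape; match your existing config.
rules:
  - id: "llm-rate-limit"
    when:
      subjects:
        - "serviceaccount:ci-bot"   # old form; remove after the upgrade
        - "virtualaccount:ci-bot"   # new form, supported from v0.27.0
```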

v0.35.0
2025-02-24

Automated Upgrades for Infrastructure Add-ons

Staying up to date just got a whole lot easier. We’re rolling out automated upgrades for infrastructure add-on components like tfy-agent, Argo, and Istio, ensuring your stack is always running the latest compatible versions — without the manual hassle. Whenever a new stable release drops, we’ll handle the upgrade seamlessly, so you can focus on shipping instead of maintaining.

Of course, we’re not just throwing updates over the fence. Our system checks for compatibility before rolling anything out, minimising disruptions and keeping things smooth. You get the latest features and security patches without lifting a finger. Set it, forget it, and let your infra take care of itself.

To enable this for your cluster, simply visit the cluster drawer and enable “Auto-pilot” for add-ons.

Other Features, Fixes & Improvements:

  • Enhancements to autopilot: Added recommendations for CPU throttling, capacity type selection based on environment, and a volume size fix for Prometheus/Loki.
  • LLM Gateway: Introduced support for real-time models.
  • Minor Bug fixes and UI/UX improvements
v0.32.0
2025-02-14

New Features & Improvements:

  • Added functionality to edit workflow configuration from the UI
  • Support for reranker models in the playground
  • Pagination support in Teams
  • Added a version field to the artifact manifest

Bug fixes:

  • Minor Bug fixes and UI/UX improvements