LlamaIndex is an open-source data framework for building applications with large language models (LLMs). TrueFoundry integrates seamlessly with LlamaIndex, allowing you to route all LLM requests through the TrueFoundry Gateway for enhanced security, load balancing, cost management, and more. This guide walks you through connecting LlamaIndex to TrueFoundry.
Use only the OpenAILike and OpenAIEmbedding classes from LlamaIndex. These classes are designed for custom OpenAI-compatible endpoints, so your TrueFoundry-specific model names work directly. The standard OpenAI class validates model names against OpenAI's public list and will raise errors for gateway model names.

Prerequisites

Before you begin, ensure you have the following:
  1. Authentication Token: A TrueFoundry API key. Follow the instructions in Generating Tokens to create one.
  2. Gateway Base URL: Your TrueFoundry Gateway URL, which looks like https://<control_plane_url>/api/llm/api/inference/openai.
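
Rather than hard-coding credentials, you can read them from the environment and assemble the base URL once. A small sketch — the environment variable names TFY_API_KEY and TFY_CONTROL_PLANE_URL are illustrative placeholders, not names mandated by TrueFoundry:

```python
import os

# Illustrative variable names -- pick whatever fits your deployment
control_plane_url = os.environ.get("TFY_CONTROL_PLANE_URL", "my-org.truefoundry.cloud")
api_key = os.environ.get("TFY_API_KEY", "")

# The gateway serves an OpenAI-compatible API under this fixed path
base_url = f"https://{control_plane_url}/api/llm/api/inference/openai"
print(base_url)
```

Both code examples below can then pass `api_key=api_key` and `api_base=base_url` instead of literal strings.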

Code Examples

The following examples demonstrate how to use LlamaIndex with TrueFoundry.

Chat Completion

This example shows how to perform a chat completion.

```python
from llama_index.core import Settings
from llama_index.llms.openai_like import OpenAILike

# Configure the LLM
llm = OpenAILike(
    model="tfy_model_name",
    api_key="tfy_api_key",
    api_base="https://<control_plane_url>/api/llm/api/inference/openai",
    is_chat_model=True,
)

# Set the LLM for LlamaIndex to use globally
Settings.llm = llm

# Perform a chat completion
response = llm.complete("What is the capital of France?")
print(response.text)
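
For multi-turn conversations, OpenAILike also exposes a chat interface that takes a list of ChatMessage objects. A minimal sketch, using the same placeholder model name, key, and URL as above (running it requires a live gateway and valid credentials):

```python
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="tfy_model_name",
    api_key="tfy_api_key",
    api_base="https://<control_plane_url>/api/llm/api/inference/openai",
    is_chat_model=True,
)

# Build a multi-turn conversation as a list of role-tagged messages
messages = [
    ChatMessage(role="system", content="You are a concise assistant."),
    ChatMessage(role="user", content="What is the capital of France?"),
]

response = llm.chat(messages)
print(response.message.content)
```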

Text Embedding

This example shows how to generate text embeddings.

```python
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure the embedding model
embedding_model = OpenAIEmbedding(
    model_name="tfy_model_name",
    api_key="tfy_api_key",
    api_base="https://<control_plane_url>/api/llm/api/inference/openai",
)

# Set the embedding model for LlamaIndex to use globally
Settings.embed_model = embedding_model

# Generate a text embedding
embedding = embedding_model.get_text_embedding("This is a sample text.")
print(f"Embedding length: {len(embedding)}")
```
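
With Settings.llm and Settings.embed_model both pointing at the gateway, the rest of LlamaIndex works unchanged. A minimal end-to-end sketch, assuming the two configuration snippets above have already run (both the embedding calls at index time and the LLM call at query time are routed through the TrueFoundry Gateway):

```python
from llama_index.core import Document, VectorStoreIndex

# Index a document -- embeddings are generated via the gateway
documents = [Document(text="Paris is the capital of France.")]
index = VectorStoreIndex.from_documents(documents)

# Query the index -- retrieval uses the embeddings, synthesis uses the LLM
query_engine = index.as_query_engine()
response = query_engine.query("What is the capital of France?")
print(response)
```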

FAQs