An embedding is a sequence of numbers (a vector) that represents the semantic meaning of content such as natural language, code, or other structured data. These embeddings allow machine learning models to understand the relationships and similarity between different pieces of content.

They are widely used in:

  • Clustering
  • Semantic search and retrieval
  • Recommendation engines
  • Retrieval-Augmented Generation (RAG)
  • Knowledge retrieval systems in ChatGPT and the Assistants API
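The "similarity" between two pieces of content is most often measured as the cosine similarity of their embedding vectors. A minimal, self-contained sketch — the toy 3-dimensional vectors below are illustrative only; real models return hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the two L2 norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": nearly parallel vectors score close to 1.0,
# orthogonal vectors score 0.0.
v1 = [0.1, 0.2, 0.3]
v2 = [0.1, 0.2, 0.25]
print(cosine_similarity(v1, v2))
```

Scores near 1.0 indicate semantically similar content; scores near 0.0 indicate unrelated content.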

Code Snippet

from openai import OpenAI

BASE_URL = "https://{controlPlaneUrl}/api/llm"  # replace {controlPlaneUrl} with your control plane URL
API_KEY = "your-truefoundry-api-key"

# Configure OpenAI client with TrueFoundry settings
client = OpenAI(
    api_key=API_KEY,
    base_url=BASE_URL,
)

response = client.embeddings.create(
    model="openai-main/text-embedding-3-small",
    input="TrueFoundry is amazing!"
)

print(response.data[0].embedding)

Expected Output

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928,
        -4.547132266452536e-05,
        -0.024047505110502243
      ]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}
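Once embeddings are stored, semantic search reduces to a nearest-neighbor lookup: embed the query, then rank stored document vectors by cosine similarity. A minimal sketch with hypothetical toy vectors (in practice you would produce each vector with the API call above and store them in a vector database):

```python
import math

def cos(a, b):
    # Cosine similarity: dot product over the product of L2 norms.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy document "embeddings" (illustrative only; real vectors come from the API).
docs = {
    "doc_ml":   [0.90, 0.10, 0.00],
    "doc_cook": [0.00, 0.80, 0.60],
    "doc_ai":   [0.85, 0.20, 0.10],
}
query = [0.88, 0.15, 0.05]  # embedding of the user's search query

# Rank documents by similarity to the query, most similar first.
ranked = sorted(docs, key=lambda d: cos(query, docs[d]), reverse=True)
print(ranked)
```

The two AI-related vectors score far above the unrelated one, which is exactly the property retrieval and RAG pipelines rely on.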

Notes for Cohere Models

When using Cohere models via the embeddings API, you must include an additional field called input_type in the request. This field indicates the purpose of the embedding and must be one of the following:

  • search_query
  • search_document
  • classification
  • clustering
Because input_type is not part of the OpenAI Python SDK's method signature, pass it through extra_body so it is included in the request payload:

response = client.embeddings.create(
    model="cohere-main/embed-english-v3.0",
    input="Find similar documents about AI.",
    extra_body={"input_type": "search_query"},
)