Supported Providers

  • OpenAI
  • Azure OpenAI

The speech API takes the text you want to convert and returns audio generated by a text-to-speech model. All models accept the same text input, up to a maximum of 4,096 characters. For the output, you can choose an audio format (mp3, opus, aac, flac, wav, pcm) and a voice (alloy, echo, fable, onyx, nova, shimmer).
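
The limits above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the OpenAI SDK or the gateway: `build_speech_request` and the three constants are hypothetical names that simply encode the values listed in this section.

```python
# Values copied from the description above; adjust them if your provider differs.
MAX_INPUT_CHARS = 4096
AUDIO_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}


def build_speech_request(model, text, voice="alloy", response_format="mp3"):
    """Validate the inputs and return keyword arguments for a speech request.

    Hypothetical helper: raises ValueError instead of letting the API reject
    the request.
    """
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(
            f"input is {len(text)} characters; the limit is {MAX_INPUT_CHARS}"
        )
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if response_format not in AUDIO_FORMATS:
        raise ValueError(f"unknown audio format: {response_format!r}")
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
    }
```

The returned dictionary can be unpacked into the SDK call, for example `client.audio.speech.with_streaming_response.create(**kwargs)`.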

Code Snippet

from pathlib import Path
from openai import OpenAI

# Replace {controlPlaneUrl} with the URL of your control plane.
BASE_URL = "https://{controlPlaneUrl}/api/llm"
API_KEY = "your-truefoundry-api-key"

client = OpenAI(
    api_key=API_KEY,
    base_url=BASE_URL,
)

# Write the audio next to this script.
speech_file_path = Path(__file__).parent / "generated-audio.mp3"

# Stream the response to disk instead of buffering the whole file in memory.
with client.audio.speech.with_streaming_response.create(
    model="openai-main/gpt-4o-mini-tts",
    voice="alloy",
    input="hello how are you?",
) as response:
    response.stream_to_file(speech_file_path)

The API returns audio in the requested format (MP3 by default), spoken in the selected voice.
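
To request a format other than the default, the OpenAI Python SDK accepts a `response_format` argument on the same call. The sketch below assumes the gateway forwards that parameter unchanged; `output_path_for` and `save_speech` are hypothetical helper names introduced only for illustration, and `client` is an `OpenAI` client configured as in the snippet above.

```python
from pathlib import Path


def output_path_for(directory, stem, response_format):
    """Build an output path whose extension matches the requested format."""
    return Path(directory) / f"{stem}.{response_format}"


def save_speech(client, model, text, voice="alloy", response_format="wav",
                directory=".", stem="generated-audio"):
    """Request speech in `response_format` and stream it to a matching file.

    Hypothetical helper; `client` is expected to be an OpenAI client.
    """
    path = output_path_for(directory, stem, response_format)
    with client.audio.speech.with_streaming_response.create(
        model=model,
        voice=voice,
        input=text,
        response_format=response_format,  # one of: mp3, opus, aac, flac, wav, pcm
    ) as response:
        response.stream_to_file(path)
    return path
```

For example, `save_speech(client, "openai-main/gpt-4o-mini-tts", "hello how are you?", response_format="flac")` would write `generated-audio.flac` to the current directory.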