Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
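As a minimal sketch of the header format described above, the helper below builds the header from a token. The function name build_auth_headers is hypothetical, and the sketch assumes the header is sent under the standard HTTP Authorization name.

```python
def build_auth_headers(token: str) -> dict:
    """Return HTTP headers carrying a bearer token.

    Assumption: the header name is the standard "Authorization";
    the source only specifies the value format "Bearer <token>".
    """
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    return {"Authorization": f"Bearer {token}"}


# Example with a placeholder token (not a real credential).
headers = build_auth_headers("abc123")
```

In practice the token would usually be read from an environment variable or secret store rather than written into source code.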
Headers
Optional metadata for the request
Body
Audio file to transcribe and transcription options
Identifier of the model to be used for transcription.
Audio file to transcribe: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
Controls audio chunking. "auto" uses voice activity detection (VAD) to select chunk boundaries.
Additional info to include (e.g. logprobs for model confidence).
Input audio language in ISO-639-1 format (e.g. en).
Text to guide model style or continue a previous segment.
Output format: json, text, srt, verbose_json, or vtt. Default: json.
Sampling temperature (0-1). Higher values increase randomness. Default: 0.
Timestamp detail level: word or segment. Default: ["segment"].
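The constraints listed above (accepted audio types, output formats, temperature range, timestamp levels, and the stated defaults) can be checked on the client before sending a request. The sketch below is illustrative only: the function name and every field name (model, language, prompt, response_format, temperature, timestamp_granularities) are assumptions, since this page does not show the actual parameter names.

```python
from pathlib import Path

# Allowed values as listed in the body description above.
ALLOWED_AUDIO = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
ALLOWED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
ALLOWED_GRANULARITIES = {"word", "segment"}


def build_transcription_fields(
    model,
    audio_path,
    *,
    language=None,
    prompt=None,
    response_format="json",               # documented default
    temperature=0.0,                      # documented default
    timestamp_granularities=("segment",), # documented default
):
    """Validate options and return them as a dict of form fields.

    All field names here are hypothetical placeholders.
    """
    ext = Path(audio_path).suffix.lower().lstrip(".")
    if ext not in ALLOWED_AUDIO:
        raise ValueError(f"unsupported audio type: {ext!r}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output format: {response_format!r}")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    granularities = list(timestamp_granularities)
    if not granularities or not set(granularities) <= ALLOWED_GRANULARITIES:
        raise ValueError("timestamp level must be 'word' and/or 'segment'")

    fields = {
        "model": model,
        "response_format": response_format,
        "temperature": temperature,
        "timestamp_granularities": granularities,
    }
    # Optional fields are included only when the caller sets them.
    if language is not None:
        fields["language"] = language
    if prompt is not None:
        fields["prompt"] = prompt
    return fields
```

Validating locally gives a clear error before any audio is uploaded; the service remains the authority on what it accepts.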
Response
Audio transcribed successfully
The transcribed text.
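A small sketch of reading the transcribed text from a JSON response follows. It assumes the default json output format and that the text is returned under a key named "text"; that key name is an assumption, as this page does not show the field name.

```python
import json


def extract_transcript(response_body: str) -> str:
    """Return the transcribed text from a JSON response body.

    Assumption: the text lives under a top-level "text" key.
    """
    payload = json.loads(response_body)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    text = payload.get("text")
    if not isinstance(text, str):
        raise ValueError("response has no string 'text' field")
    return text


# Example with a hand-written sample body, not real service output.
sample = '{"text": "hello world"}'
transcript = extract_transcript(sample)
```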