Audio transcription turns speech (audio files) into text. The gateway supports two ways to call it:
| Approach | Use for | Base path |
| --- | --- | --- |
| Unified API (OpenAI-compatible) | OpenAI, Azure OpenAI, Groq | `https://{controlPlaneUrl}/api/llm` |
| Provider proxy (native SDK) | Deepgram, Cartesia, ElevenLabs | `https://{controlPlaneUrl}/stt/{providerAccountName}` |
Before you start: Replace {controlPlaneUrl} with your gateway URL and your-tfy-api-key with your TrueFoundry API key. For the provider proxy, replace {providerAccountName} with the display name of your provider account on TrueFoundry.
Model names: For audio (STT/TTS), the model ID in code must match the display name of the model on your TrueFoundry provider account.
Which SDK to use: For OpenAI, Azure OpenAI, and Groq, use the OpenAI SDK (same API). For Deepgram, Cartesia, and ElevenLabs, use each provider’s native SDK with the gateway URL above.
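
For the provider-proxy path, a native SDK (or raw HTTP client) only needs its base URL pointed at the gateway. Below is a minimal sketch of building that URL for a Deepgram account; the gateway host `gateway.example.com` and account name `deepgram-main` are placeholder assumptions, and `/v1/listen` is Deepgram's standard REST transcription endpoint:

```python
# Sketch: build the provider-proxy URL for a native STT request.
# CONTROL_PLANE_URL and PROVIDER_ACCOUNT are placeholder assumptions.
CONTROL_PLANE_URL = "gateway.example.com"   # your gateway host
PROVIDER_ACCOUNT = "deepgram-main"          # display name of your provider account
API_KEY = "your-tfy-api-key"

def stt_proxy_url(control_plane_url: str, provider_account: str, endpoint: str) -> str:
    """Join the gateway STT base path with a provider-native endpoint."""
    return f"https://{control_plane_url}/stt/{provider_account}{endpoint}"

url = stt_proxy_url(CONTROL_PLANE_URL, PROVIDER_ACCOUNT, "/v1/listen")
print(url)  # https://gateway.example.com/stt/deepgram-main/v1/listen

# An HTTP client would then POST the audio bytes to `url` with an
# "Authorization: Bearer <API_KEY>" header, exactly as the native
# Deepgram API expects.
```

The same pattern applies to Cartesia and ElevenLabs: swap in their account name and native endpoint path.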

Code snippets

```python
from openai import OpenAI

BASE_URL = "https://{controlPlaneUrl}/api/llm"
API_KEY = "your-tfy-api-key"

client = OpenAI(
    api_key=API_KEY,
    base_url=BASE_URL,
)

with open("/path/to/audio.mp3", "rb") as audio_file:
    response = client.audio.transcriptions.create(
        model="openai-main/whisper-1",  # TrueFoundry model name
        file=audio_file,
    )

print(response)
```

Response

The shape of the response depends on the provider. Use print(response) to inspect it, or refer to each provider’s SDK docs for the exact structure.
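
For the unified (OpenAI-compatible) path specifically, the OpenAI SDK returns a `Transcription` object whose transcript is exposed via its `.text` attribute (with the default JSON response format). A minimal sketch of pulling the text out, using a stand-in object in place of a live API response:

```python
from types import SimpleNamespace

# Stand-in for the object returned by client.audio.transcriptions.create(...);
# a live call returns a Transcription whose transcript lives in .text.
response = SimpleNamespace(text="hello world")

def transcript_text(resp) -> str:
    """Extract the transcript from an OpenAI-style response or a plain dict."""
    return resp.text if hasattr(resp, "text") else resp["text"]

print(transcript_text(response))  # hello world
```

The dict fallback covers providers proxied natively, where a JSON response is often decoded into a plain dict rather than an SDK object.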