Overview
Your agent transcribes pre-recorded audio to text using Deepgram's nova-2 model. Provide either a URL to the audio file or base64-encoded audio data. The response contains the transcript, confidence scores, and word-level timing. For real-time streaming, use the WebSocket API instead.
Parameters
audio_url (string): URL to the audio file. Deepgram fetches it directly.
audio_base64 (string): Base64-encoded audio data. Use this for local files.
language (string, default: en): Language code for transcription.
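As a minimal sketch of how these parameters might be assembled, the Python helper below builds a JSON request body from either a remote URL or a local file. The helper name build_transcription_payload and its audio_path argument are hypothetical, and treating audio_url and audio_base64 as mutually exclusive is an assumption rather than documented behavior.

```python
import base64
import json
from pathlib import Path
from typing import Optional


def build_transcription_payload(
    audio_url: Optional[str] = None,
    audio_path: Optional[str] = None,
    language: str = "en",
) -> dict:
    """Build a request body from a remote URL or a local audio file.

    Assumption: exactly one audio source is supplied per request.
    """
    if (audio_url is None) == (audio_path is None):
        raise ValueError("Provide exactly one of audio_url or audio_path")

    payload = {"language": language}
    if audio_url is not None:
        # Remote file: pass the URL through so the provider can fetch it.
        payload["audio_url"] = audio_url
    else:
        # Local file: read the bytes and base64-encode them as ASCII text.
        raw = Path(audio_path).read_bytes()
        payload["audio_base64"] = base64.b64encode(raw).decode("ascii")
    return payload


if __name__ == "__main__":
    body = build_transcription_payload(audio_url="https://example.com/sample.wav")
    print(json.dumps(body, indent=2))
```

Base64 encoding inflates the payload by roughly a third, so passing audio_url is likely the lighter option whenever the file is already reachable over HTTP.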
Example Response
{
  "success": true,
  "data": {
    "transcript": "Hello, this is a test recording.",
    "confidence": 0.98,
    "language": "en",
    "words": [
      {"word": "Hello", "start": 0.0, "end": 0.5, "confidence": 0.99}
    ],
    "duration_seconds": 3.2
  },
  "metadata": {
    "provider_used": "deepgram",
    "providers_tried": ["deepgram"],
    "response_time_ms": 1200,
    "request_id": "req_stt_001"
  },
  "credits_used": 2
}
Get Started
Use this API through the O-mega platform. Create an API key in your dashboard, then call the endpoint with your key in the Authorization header.
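The call described here could look roughly like the following Python sketch, which uses only the standard library. Several details are assumptions, not documented facts: ENDPOINT_URL is a hypothetical placeholder (use the endpoint shown in your dashboard), the Bearer scheme for the Authorization header is a guess, and sending the parameters as a JSON POST body is assumed. The summarize helper is likewise hypothetical and reads only fields that appear in the example response.

```python
import json
import urllib.request

# Hypothetical placeholder: substitute the real endpoint URL from your dashboard.
ENDPOINT_URL = "https://api.example.com/transcribe"


def build_request(api_key: str, payload: dict, url: str = ENDPOINT_URL) -> urllib.request.Request:
    """Prepare a POST request. The Bearer scheme and JSON body are assumptions."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def transcribe(api_key: str, payload: dict) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(response: dict) -> str:
    """Format the transcript and key figures from a response shaped like the example."""
    if not response.get("success"):
        raise RuntimeError(f"Transcription failed: {response}")
    data = response["data"]
    return (
        f"{data['transcript']} "
        f"(confidence {data['confidence']:.2f}, "
        f"{data['duration_seconds']}s, "
        f"{response['credits_used']} credits)"
    )
```

With a valid key, summarize(transcribe(api_key, {"audio_url": "https://example.com/sample.wav"})) would return a one-line summary. Keeping build_request separate from the network call makes it possible to inspect the headers and body without spending credits.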