Overview
Your agent converts text to speech audio. DashScope (QWEN3 TTS) is the primary provider and selects a voice from a natural language description. ElevenLabs is the fallback and uses a pre-defined voice catalog. The response contains base64-encoded audio.
Parameters
text
Required string
The text to convert to speech.
voice_description
string
Natural language description of the desired voice (DashScope). Ignored if using ElevenLabs.
voice_id
string
ElevenLabs voice ID. If provided, forces the ElevenLabs provider.
provider
select (default: auto)
Force a specific TTS provider. Default: auto (DashScope first, ElevenLabs fallback).
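As a rough illustration of how these parameters interact, the Python sketch below assembles a request body and reports which provider the rules above imply will be tried first. The helper names (build_tts_payload, expected_provider) are hypothetical, and the provider values "dashscope" and "elevenlabs" are assumptions: only "auto" is stated as an option here, and "dashscope" appears only in the example response.

```python
from typing import Optional

# Assumed option values for `provider`; only "auto" is documented explicitly.
VALID_PROVIDERS = {"auto", "dashscope", "elevenlabs"}


def build_tts_payload(
    text: str,
    voice_description: Optional[str] = None,
    voice_id: Optional[str] = None,
    provider: str = "auto",
) -> dict:
    """Assemble the request body from the documented parameters."""
    if not text or not text.strip():
        raise ValueError("text is required")
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")

    payload = {"text": text, "provider": provider}
    # Optional fields are only sent when the caller supplies them.
    if voice_description is not None:
        payload["voice_description"] = voice_description
    if voice_id is not None:
        payload["voice_id"] = voice_id
    return payload


def expected_provider(payload: dict) -> str:
    """Return the provider the documented rules imply will be tried first."""
    # A voice_id forces ElevenLabs, regardless of the provider setting.
    if payload.get("voice_id"):
        return "elevenlabs"
    if payload.get("provider") == "elevenlabs":
        return "elevenlabs"
    # "auto" tries DashScope first, with ElevenLabs as the fallback.
    return "dashscope"


if __name__ == "__main__":
    body = build_tts_payload(
        "Hello, I am your AI assistant.",
        voice_description="A calm, professional female voice",
    )
    print(body, expected_provider(body))
```

Checking the implied provider on the client side is optional; it mainly helps catch the case where a voice_description is sent alongside a voice_id and would be ignored.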
Example Response
{
  "success": true,
  "data": {
    "audio_base64": "UklGRi...(base64 audio data)...",
    "format": "wav",
    "sample_rate": 24000,
    "text": "Hello, I am your AI assistant.",
    "voice_description": "A calm, professional female voice"
  },
  "metadata": {
    "provider_used": "dashscope",
    "providers_tried": ["dashscope"],
    "response_time_ms": 3200,
    "request_id": "req_tts_001"
  },
  "credits_used": 2
}
Get Started
Use this API through the O-mega platform. Create an API key in your dashboard, then call the endpoint with your key in the Authorization header.
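A minimal Python sketch of such a call, using only the standard library, might look like the following. The endpoint URL is a placeholder, not the real address, and the "Bearer" scheme in the Authorization header is an assumption; check your dashboard for the actual values. The function names are hypothetical. The response fields (success, data.audio_base64, data.format) follow the example response.

```python
import base64
import json
import urllib.request

# Placeholder URL: this page does not state the endpoint address.
TTS_ENDPOINT = "https://api.example.invalid/tts"


def build_request(api_key: str, payload: dict, url: str = TTS_ENDPOINT) -> urllib.request.Request:
    """Prepare a JSON POST request carrying the API key."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: the key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def decode_audio(response: dict) -> tuple:
    """Return (audio_bytes, format) from a parsed response body."""
    if not response.get("success"):
        raise RuntimeError("TTS request did not succeed")
    data = response["data"]
    return base64.b64decode(data["audio_base64"]), data["format"]


def synthesize_to_file(api_key: str, payload: dict, stem: str = "speech") -> str:
    """Send the request and write the decoded audio to <stem>.<format>."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=30) as resp:
        response = json.load(resp)
    audio, fmt = decode_audio(response)
    path = f"{stem}.{fmt}"
    with open(path, "wb") as fh:
        fh.write(audio)
    return path
```

Calling synthesize_to_file("YOUR_API_KEY", {"text": "Hello, I am your AI assistant."}) would, under these assumptions, write speech.wav when the response format is "wav".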