Unified Agent API

Sound Generation

Generate sound effects, ambient audio, foley, and short instrumental music tracks from text descriptions. This is NOT text-to-speech: it does not produce spoken words. For voice/narration, use the tts capability instead.

POST /v1/sounds/generate (SFX + music)

Overview

Generate sound effects and short music tracks from text prompts. This endpoint does not produce speech; use tts for spoken voice.

Credits

4 credits per call

Providers

ElevenLabs

SDK Method

client.generate_sound(...)

Parameters

prompt (required)

string

Description of the sound effect or music to generate. NOT for speech (use tts for spoken voice).

duration_seconds

number (default: 5)

Length of generated audio in seconds (0.5 to 22). Leave unset to let the model auto-decide based on the prompt.

prompt_influence

number (default: 0.3)

How closely the output follows the prompt, from 0.0 (more creative) to 1.0 (more literal).

output_format

select (default: mp3_44100_128)

Audio format and quality.

MP3 44.1kHz 128kbps
MP3 44.1kHz 192kbps
PCM 44.1kHz
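
The parameter ranges above can be checked on the client before spending credits on a call. The following is a minimal sketch, not part of the SDK: the helper name build_sound_request is hypothetical, and it assumes the request body is a JSON object whose keys match the parameter names listed above.

```python
def build_sound_request(prompt, duration_seconds=None,
                        prompt_influence=0.3,
                        output_format="mp3_44100_128"):
    """Validate parameters and return a request body as a dict.

    Hypothetical helper: the key names mirror the documented parameters,
    but the exact wire format is an assumption.
    """
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    # Documented range for duration_seconds is 0.5 to 22 seconds.
    if duration_seconds is not None and not 0.5 <= duration_seconds <= 22:
        raise ValueError("duration_seconds must be between 0.5 and 22")
    # Documented range for prompt_influence is 0.0 to 1.0.
    if not 0.0 <= prompt_influence <= 1.0:
        raise ValueError("prompt_influence must be between 0.0 and 1.0")

    body = {
        "prompt": prompt,
        "prompt_influence": prompt_influence,
        "output_format": output_format,
    }
    # Leave duration_seconds out entirely when it is unset.
    if duration_seconds is not None:
        body["duration_seconds"] = duration_seconds
    return body


request_body = build_sound_request("short doorbell chime", duration_seconds=2)
```

Omitting duration_seconds from the body, rather than sending a null, is a design choice of this sketch; check how the endpoint treats a missing value before relying on it.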

Example Response

{
  "success": true,
  "data": {
    "audio_base64": "SUQzBAAAAAAAI1RTU0UAAAA...(base64 audio data)",
    "format": "mp3",
    "sample_rate": 44100,
    "duration_seconds": 2,
    "prompt": "short doorbell chime"
  },
  "metadata": {
    "provider_used": "elevenlabs",
    "providers_tried": [
      "elevenlabs"
    ],
    "mode_used": null,
    "response_time_ms": 1500,
    "request_id": "req_8bc6c1f9"
  },
  "credits_used": 4
}
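
Because the audio arrives as a base64 string inside the JSON body, it has to be decoded before it can be played or stored. The sketch below assumes a response shaped like the example above; the helper name save_audio and the output file name are hypothetical, and the shape of a failed response is not documented here, so the error branch only checks the success flag.

```python
import base64
from pathlib import Path


def save_audio(response, out_dir="."):
    """Decode data.audio_base64 and write it to sound.<format>.

    Hypothetical helper that assumes the response layout shown in the
    example: a "success" flag and a "data" object with "audio_base64"
    and "format" fields.
    """
    if not response.get("success"):
        raise RuntimeError("sound generation did not succeed")
    data = response["data"]
    audio_bytes = base64.b64decode(data["audio_base64"])
    # Use the reported format ("mp3" in the example) as the extension.
    path = Path(out_dir) / f"sound.{data['format']}"
    path.write_bytes(audio_bytes)
    return path
```

Usage: pass the parsed JSON response, for example save_audio(json.loads(raw_body), "output"), where raw_body is the response text and the output directory already exists.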

Get Started

Use this API through the O-mega platform. Create an API key in your dashboard, then call the endpoint with your key in the Authorization header.
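
A raw HTTP call might look like the sketch below, which uses only the Python standard library. Several details are assumptions rather than documented facts: the base URL https://api.example.com is a placeholder, the "Bearer" scheme in the Authorization header is a guess at the expected format, and the helper name make_request is hypothetical. Only the path /v1/sounds/generate and the parameter names come from this page.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, replace with the real host


def make_request(api_key, body):
    """Build (but do not send) a POST request for /v1/sounds/generate."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/sounds/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: the key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = make_request("YOUR_API_KEY", {"prompt": "short doorbell chime"})

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```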

Try Sound Generation

Test Sound Generation in the interactive playground. No setup required.
