Text-to-Translated-Speech
The tts_task command generates speech from text, with optional translation. Depending on your output_stream setting, the TTS data is delivered over WebSockets or the WebRTC data channel. Follow the general API connection flow described here.
Refer to the Recommended settings for exact option values in the examples below.
If you only need text-to-speech (without translation), use the dedicated Realtime TTS API instead.
The tts_task command
Send a WebSocket / WebRTC data channel message with the following structure:
{
  "message_type": "tts_task",
  "data": {
    "text": "Hello, how are you?",
    "language": "en",
    "translate_text": true
  }
}
Required fields
| Field | Type | Description | Constraints |
|---|---|---|---|
| text | string | Text to generate speech from | Max length: 2048 characters |
| language | string | Language of the text | Must be one of the supported languages |
Optional fields
| Field | Type | Default | Description |
|---|---|---|---|
| translate_text | boolean | false | Enable translation before TTS |
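The field constraints above can be checked on the client before a message is sent. The following is a minimal sketch, not part of the API: `build_tts_task` and the `supported_languages` parameter are hypothetical names, and it assumes the 2048-character limit is counted the way Python's `len` counts characters.

```python
import json

MAX_TEXT_LENGTH = 2048  # limit stated in the required-fields table


def build_tts_task(text, language, translate_text=False, supported_languages=None):
    """Serialize a tts_task message after checking the documented constraints."""
    # Assumption: the documented limit matches Python's len() character count.
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")
    # supported_languages is a hypothetical, caller-supplied set of language
    # codes; the check is skipped when it is not provided.
    if supported_languages is not None and language not in supported_languages:
        raise ValueError(f"unsupported language: {language}")
    message = {
        "message_type": "tts_task",
        "data": {
            "text": text,
            "language": language,
            "translate_text": translate_text,
        },
    }
    return json.dumps(message)
```

The returned string can be sent as-is over the WebSocket or the WebRTC data channel, whichever transport the session uses.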
Text-to-Translated-Speech
Set translate_text to true in tts_task. The text will be translated into every target_language configured in the translations section, and TTS will be generated for each of them.
All output_stream and translations options are supported. Use the following set_task command structure:
{
  "input_stream": null, // set to `null` or omit the field
  "output_stream": {/*...*/},
  "pipeline": {
    "transcription": null, // set to `null` or omit the field
    "translations": [{/*...*/}, {/*...*/}], // translation and speech generation options for each language
    "allowed_message_types": [
      // you will only receive translated text and `output_audio_data` messages (when using WS transport)
      "translated_transcription"
    ]
  }
}
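As an illustration, the structure above could be assembled programmatically. This is only a sketch: `build_translated_tts_config` is a hypothetical helper, the `output_stream` and `translations` values are passed through untouched because their contents are defined in the Recommended settings, and the requirement of at least one `translations` entry is an assumption based on the translation step needing a target.

```python
def build_translated_tts_config(output_stream, translations):
    """Assemble the set_task structure for text-to-translated-speech."""
    # Assumption: translation mode needs at least one target language entry.
    if not translations:
        raise ValueError("at least one translations entry is required")
    return {
        "input_stream": None,  # serialized as null; the field may also be omitted
        "output_stream": output_stream,
        "pipeline": {
            "transcription": None,  # serialized as null; may also be omitted
            "translations": list(translations),
            # Only translated text is requested as a message type.
            "allowed_message_types": ["translated_transcription"],
        },
    }
```

The helper returns a plain dictionary, so it can be serialized with `json.dumps` and placed in whatever envelope the `set_task` command expects.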
Multiple target languages
When multiple languages are configured in the translations section, the system returns translations and TTS for all configured languages simultaneously.
Text-to-Speech without translation (deprecated)
For plain text-to-speech, use the dedicated Realtime TTS API instead.
Set translate_text to false in tts_task or omit the field. All standard task options are supported; see the Recommended settings.
{
  "input_stream": null, // set to `null` or omit the field
  "output_stream": {/*...*/},
  "pipeline": {
    "transcription": null, // set to `null`
    "translations": [{/*...*/}, {/*...*/}], // speech generation options for each language
    "allowed_message_types": [] // you will receive only `output_audio_data` messages (when using WS transport)
  }
}
Requirements in this mode:
- language must be one of the supported languages.
- language in tts_task must equal one of the target_language values in translations to apply speech settings (this will change in future versions).
- You can configure multiple languages in the translations section, but a separate tts_task message with the corresponding language is required for each one.
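The requirements above imply one tts_task message per configured language. A minimal sketch of that fan-out follows; `build_tts_tasks_per_language` is a hypothetical helper, and it assumes each entry in the translations list carries a `target_language` key, as the text describes.

```python
import json


def build_tts_tasks_per_language(texts_by_language, translations):
    """Build one untranslated tts_task message for each language."""
    # Languages for which speech settings are configured in translations.
    target_languages = {entry["target_language"] for entry in translations}
    messages = []
    for language, text in texts_by_language.items():
        # The language must match a configured target_language for the
        # speech settings to apply.
        if language not in target_languages:
            raise ValueError(f"no translations entry for language: {language}")
        messages.append(json.dumps({
            "message_type": "tts_task",
            "data": {"text": text, "language": language, "translate_text": False},
        }))
    return messages
```

Each returned string is a separate message to send; the function does not batch several languages into a single tts_task.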