Text-to-Speech and Text-to-Translated-Speech

You can enable TTS-only mode or TTS-with-translation mode by adjusting the task options.
Depending on your output_stream setting, TTS data is delivered over WebSockets or a WebRTC DataChannel. Follow the general API connection flow, which is described here.

Refer to the recommended settings to set exact option values in the examples below.

TTS task command

The tts_task command generates speech from text, optionally translating the text first. Send a WebSocket or WebRTC DataChannel message with a structure similar to the following:

{
  "message_type": "tts_task",
  "data": {
    "text": "Hello, how are you?",
    "language": "en",
    "translate_text": false
  }
}

Required fields

| Field | Type | Description | Constraints |
|---|---|---|---|
| text | string | Text to generate speech from | Max length: 2048 characters |
| language | string | Language of the text | See language requirements below |

Optional fields

| Field | Type | Default | Description |
|---|---|---|---|
| translate_text | boolean | false | Enable translation before TTS |
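
The field constraints above can be checked on the client before a message is sent. The following is a minimal Python sketch, not an official client: the helper name `build_tts_task` is hypothetical, and it assumes the 2048-character limit is counted in Unicode code points, which the tables do not specify.

```python
import json

MAX_TEXT_LENGTH = 2048  # limit taken from the "Required fields" table


def build_tts_task(text: str, language: str, translate_text: bool = False) -> str:
    """Serialize a tts_task message (hypothetical helper, not part of the API)."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_LENGTH:
        # Assumption: the limit is measured in code points, as len() does.
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")
    if not language:
        raise ValueError("language is required")
    message = {
        "message_type": "tts_task",
        "data": {
            "text": text,
            "language": language,
            "translate_text": translate_text,  # defaults to false per the table
        },
    }
    return json.dumps(message)
```

The returned string can be sent as-is over whichever channel (WebSocket or WebRTC DataChannel) the session uses; rejecting oversized text locally avoids a round trip for a request that would not be accepted.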

Text-to-Translated-Speech

Set translate_text to true in tts_task. All output_stream and translations options are supported. See the example set_task command structure below:

{
  "input_stream": null, // set to `null`
  "output_stream": {/*...*/},
  "pipeline": {
    "transcription": null, // set to `null`
    "translations": [{/*...*/}, {/*...*/}], // set translation and speech generation options for each language
    "allowed_message_types": [
      // you will only receive translated text and `output_audio_data` messages
      "translated_transcription"
    ]
  }
}
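
The structure above uses comments, which strict JSON does not allow, so it may be easier to assemble it in code and serialize it. The sketch below is one way to do that in Python; the function name `build_translated_tts_settings` is hypothetical, and the `output_stream` and `translations` contents are left as empty placeholders because their real values come from the recommended settings.

```python
import json
from typing import Any, Dict, List


def build_translated_tts_settings(
    output_stream: Dict[str, Any],
    translations: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Assemble task settings for Text-to-Translated-Speech (hypothetical helper)."""
    return {
        "input_stream": None,  # serialized as JSON null
        "output_stream": output_stream,
        "pipeline": {
            "transcription": None,  # serialized as JSON null
            "translations": translations,
            # request translated text only, as in the example structure
            "allowed_message_types": ["translated_transcription"],
        },
    }


# Placeholder arguments: fill these in from the recommended settings.
settings = build_translated_tts_settings(output_stream={}, translations=[{}, {}])
payload = json.dumps(settings)
```

Python's `None` is written out as `null` by `json.dumps`, which matches the two fields that must be set to `null`.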

tts_task language field requirements

Multiple target languages support

When multiple languages are configured in the translations section, the system returns translations and TTS audio for all configured languages simultaneously.

Text-to-Speech

Set translate_text to false in tts_task, or omit the field. All output_stream and translations options are supported, as are all standard task options; see the recommended settings.

{
  "input_stream": null, // set to `null`
  "output_stream": {/*...*/},
  "pipeline": {
    "transcription": null, // set to `null`
    "translations": [{/*...*/}, {/*...*/}], // set speech generation options for each language
    "allowed_message_types": [] // you will only receive `output_audio_data` messages
  }
}

tts_task language field requirements

  • language must be one of the allowed languages
  • language in tts_task must match the target_language of one entry in translations, so that the speech settings of that entry are applied (this requirement will change in future versions)

Multiple target languages support

You can set options for multiple languages in the translations section, but each language requires its own tts_task message with the corresponding language value.
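
The two rules above (the language must match a configured target_language, and each language needs its own message) can be sketched as follows. This is an illustrative Python helper, not part of the API: `build_tts_tasks` is a hypothetical name, and the language codes in the usage lines are placeholders rather than a statement of which languages are allowed.

```python
import json
from typing import Any, Dict, List


def build_tts_tasks(
    texts_by_language: Dict[str, str],
    translations: List[Dict[str, Any]],
) -> List[str]:
    """Build one tts_task message per language (hypothetical helper)."""
    # Languages that have speech settings configured in the translations section.
    configured = {entry["target_language"] for entry in translations}
    messages = []
    for language, text in texts_by_language.items():
        if language not in configured:
            raise ValueError(f"no translations entry with target_language={language!r}")
        messages.append(json.dumps({
            "message_type": "tts_task",
            "data": {"text": text, "language": language, "translate_text": False},
        }))
    return messages


# Placeholder configuration: real entries carry the speech generation options.
translations = [{"target_language": "en"}, {"target_language": "es"}]
tasks = build_tts_tasks({"en": "Hello", "es": "Hola"}, translations)
```

Each element of `tasks` is sent as a separate message; a language with no matching translations entry is rejected locally instead of being sent.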