Text-to-Speech and Text-to-Translated-Speech.

You can enable TTS-only mode or TTS-with-translation mode by adjusting the task options.
Depending on your output_stream setting, use WebSockets or WebRTC DataChannel to receive TTS data. Follow the general API connection flow, which is described here.

Refer to the recommended settings to set exact option values in the examples below.

TTS task command

The tts_task command generates speech from text, optionally translating the text first. Send a WebSocket or WebRTC DataChannel message with the following structure:

{
  "message_type": "tts_task",
  "data": {
    "text": "Hello, how are you?",
    "language": "en",
    "translate_text": false
  }
}

Required fields

Field	Type	Description	Constraints
`text`	string	Text to generate speech from	Max length: 2048 characters
`language`	string	Language of the text	See language requirements below

Optional fields

Field	Type	Default	Description
`translate_text`	boolean	`false`	Enable translation before TTS
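
As a minimal sketch, a tts_task message with the fields above could be assembled and checked on the client before sending. The helper name `build_tts_task` and the constant `MAX_TEXT_LENGTH` are hypothetical and not part of the API. The length check assumes the 2048-character limit is counted the way Python's `len` counts characters, which the documentation does not confirm.

```python
import json

# Limit taken from the "Required fields" table. How the service counts
# characters is an assumption here (Python's len counts code points).
MAX_TEXT_LENGTH = 2048


def build_tts_task(text: str, language: str, translate_text: bool = False) -> str:
    """Serialize a tts_task message (hypothetical helper, not part of the API)."""
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")
    message = {
        "message_type": "tts_task",
        "data": {
            "text": text,
            "language": language,
            "translate_text": translate_text,
        },
    }
    return json.dumps(message)
```

The returned string can then be sent over whichever transport is already connected, for example `build_tts_task("Hello, how are you?", "en")`.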

Text-to-Translated-Speech

Set translate_text to true in tts_task. All output_stream and translations options are supported. See the example set_task command structure below:

{
  "input_stream": null, // set to `null` or omit the field
  "output_stream": {/*...*/}, 
  "pipeline": {
    "transcription": null, // set to `null` or omit the field
    "translations": [{/*...*/}, {/*...*/}], // set translation and speech generation options for each language
    "allowed_message_types": [
      // you will receive only translated text and `output_audio_data` messages (when using WS transport)
      "translated_transcription"
    ]
  }
}
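
The structure above can also be produced programmatically. The sketch below is illustrative only: `translated_tts_settings` is a hypothetical helper, and the contents of `output_stream` and each `translations` entry are passed through unchanged because their exact options come from the recommended settings and are not shown here.

```python
from typing import Any


def translated_tts_settings(
    output_stream: dict[str, Any],
    translations: list[dict[str, Any]],
) -> dict[str, Any]:
    """Build the set_task structure for Text-to-Translated-Speech (hypothetical helper)."""
    return {
        "input_stream": None,  # serialized as JSON null; the field may also be omitted
        "output_stream": output_stream,  # caller-supplied, passed through as-is
        "pipeline": {
            "transcription": None,
            "translations": translations,  # one entry per target language
            "allowed_message_types": ["translated_transcription"],
        },
    }
```

Serializing the result with `json.dumps` turns the `None` values into `null`, matching the example above.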

`tts_task` language field requirements

language must be one of the allowed languages

Multiple target languages support

When multiple languages are configured in the translations section, the system returns translations and TTS for all configured languages simultaneously.

Text-to-Speech

Set translate_text to false in tts_task, or omit the field. All output_stream and translations options are supported, as are all standard task options; see the recommended settings. An example set_task command structure:

{
  "input_stream": null, // set to `null` or omit the field
  "output_stream": {/*...*/},
  "pipeline": {
    "transcription": null, // set to `null`
    "translations": [{/*...*/}, {/*...*/}], // set speech generation options for each language
    "allowed_message_types": [] // you will receive only `output_audio_data` messages (when using WS transport)
  }
}

`tts_task` language field requirements

language must be one of the allowed languages
language in tts_task must match one target_language in translations so that the speech settings for that language are applied (this will change in future versions)

Multiple target languages support

You can set options for multiple languages in the translations section, but each language requires a separate tts_task message with the corresponding language value.
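
A minimal sketch of this one-message-per-language rule follows. It assumes each entry in translations is an object with a `target_language` key, as the requirement above suggests. The helper name `tts_tasks_per_language` is hypothetical.

```python
import json


def tts_tasks_per_language(texts: dict[str, str], translations: list[dict]) -> list[str]:
    """Return one serialized tts_task per language (hypothetical helper).

    texts maps a language code to the text to speak in that language.
    translations is the list used in the task settings; each entry is
    assumed to carry a "target_language" key.
    """
    configured = {entry["target_language"] for entry in translations}
    messages = []
    for language, text in texts.items():
        # The language must match a configured target_language,
        # otherwise no speech settings exist for it.
        if language not in configured:
            raise ValueError(f"no speech settings configured for {language!r}")
        messages.append(json.dumps({
            "message_type": "tts_task",
            "data": {"text": text, "language": language, "translate_text": False},
        }))
    return messages
```

Each returned string is sent as its own message. A language missing from translations is rejected locally instead of being sent.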
