Realtime ASR API

Caption messages can be received over WebSockets or a WebRTC DataChannel. Follow the API connection flow, which is described here.

Refer to the recommended settings for the exact option values to use in the examples below.

Captions only

Omit the output_stream and translations fields in the translation settings, or set output_stream to null and translations to an empty list.
All input_stream and transcription options are supported. See the example set_task command structure below:

{
  "input_stream": {/*...*/},
  "output_stream": null, // set to `null` or omit the field
  "pipeline": {
    "transcription": {/*...*/},
    "translations": [], // set to an empty list or omit the field
    "allowed_message_types": [
      // you will only receive messages of these types
      "partial_transcription",
      "validated_transcription"
    ]
  }
}
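
The structure above contains comments and elided values, so it is not valid JSON as written. A minimal Python sketch of building the same captions-only settings and serializing them might look like the following. The helper name `build_captions_only_task` is hypothetical, the empty option dicts are placeholders for values from the recommended settings, and any envelope that the set_task command wraps around these settings is not shown here.

```python
import json

# Message types requested for captions-only mode (taken from the example above).
CAPTION_MESSAGE_TYPES = ["partial_transcription", "validated_transcription"]


def build_captions_only_task(input_stream: dict, transcription: dict) -> dict:
    """Return task settings that request captions without translation or audio output.

    Hypothetical helper: it only mirrors the structure shown in the example.
    """
    return {
        "input_stream": input_stream,
        "output_stream": None,  # serialized as JSON null
        "pipeline": {
            "transcription": transcription,
            "translations": [],  # no target languages
            "allowed_message_types": list(CAPTION_MESSAGE_TYPES),
        },
    }


if __name__ == "__main__":
    # Placeholder option dicts: fill these in from the recommended settings.
    settings = build_captions_only_task(input_stream={}, transcription={})
    print(json.dumps(settings, indent=2))
```

Building the settings as a dictionary and serializing with `json.dumps` avoids sending comments or trailing commas, which a strict JSON parser would reject.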

Captions with translation

Omit the speech_generation field in the translation settings, or set it to null.
All standard task options are supported; see the recommended settings. See the example set_task command structure below:

{
  "input_stream": {/*...*/},
  "output_stream": null, // set to `null` or omit the field
  "pipeline": {
    "transcription": {/*...*/},
    "translations": [{/*...*/}, {/*...*/}], // translation settings for each language, set `speech_generation` to null or omit the field
    "allowed_message_types": [
      // you will only receive messages of these types
      "partial_transcription",
      "validated_transcription",
      "translated_transcription"
    ]
  }
}
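
As a rough illustration, the settings above could be assembled in Python so that speech_generation is always null for every target language. The helper name `build_translated_captions_task` is hypothetical, and the keys inside each translation entry in the usage example (such as `target_language`) are assumed placeholders rather than confirmed option names; take the real ones from the recommended settings.

```python
import copy
import json


def build_translated_captions_task(
    input_stream: dict, transcription: dict, translations: list
) -> dict:
    """Return task settings for captions plus text-only translations.

    Hypothetical helper: copies each translation entry and forces
    `speech_generation` to None so no speech is generated.
    """
    text_only = []
    for translation in translations:
        entry = copy.deepcopy(translation)  # leave the caller's dicts untouched
        entry["speech_generation"] = None  # serialized as JSON null
        text_only.append(entry)

    return {
        "input_stream": input_stream,
        "output_stream": None,
        "pipeline": {
            "transcription": transcription,
            "translations": text_only,
            "allowed_message_types": [
                "partial_transcription",
                "validated_transcription",
                "translated_transcription",
            ],
        },
    }


if __name__ == "__main__":
    # `target_language` is an assumed key name, used only for illustration.
    settings = build_translated_captions_task(
        input_stream={},
        transcription={},
        translations=[{"target_language": "es"}, {"target_language": "fr"}],
    )
    print(json.dumps(settings, indent=2))
```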

Multiple target languages

If multiple target languages are set in translations, you will receive a separate translated_transcription message for each one.
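
Because each target language arrives in its own message, a client will usually want to group translated captions by language. The sketch below shows one way to do that in Python. The message shape is an assumption made for illustration: the keys `message_type`, `language`, and `text` are hypothetical and should be replaced with the fields of the actual message format.

```python
from collections import defaultdict


def collect_translations(messages: list) -> dict:
    """Group translated caption text by target language.

    Assumes each message is a dict with hypothetical keys
    `message_type`, `language`, and `text`.
    """
    by_language = defaultdict(list)
    for message in messages:
        # Skip partial and validated transcriptions of the source language.
        if message.get("message_type") != "translated_transcription":
            continue
        by_language[message["language"]].append(message["text"])
    return dict(by_language)


if __name__ == "__main__":
    received = [
        {"message_type": "validated_transcription", "language": "en", "text": "hello"},
        {"message_type": "translated_transcription", "language": "es", "text": "hola"},
        {"message_type": "translated_transcription", "language": "fr", "text": "bonjour"},
    ]
    print(collect_translations(received))
```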