Translation management WebSocket API
The WebSocket API is used to control translation in real time and to receive text transcriptions.
Prerequisites
- Get the publisher's `access_token` and `control_url` by creating a streaming session.
Connection
Connect to the Palabra WebSocket Server at the `control_url` endpoint, passing your `access_token` value as the `token` GET parameter:
// Palabra WebSocket endpoint
endpoint = "{control_url}?token={access_token}"
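As a sketch, the endpoint URL can be assembled with Python's standard library; the host and token below are placeholders, not real values:

```python
from urllib.parse import urlencode

# Values returned when you create a streaming session (placeholders here)
control_url = "wss://example.palabra.ai/ws/abc123"  # hypothetical value
access_token = "YOUR_ACCESS_TOKEN"                  # hypothetical value

# Append the access token as the `token` GET parameter
endpoint = f"{control_url}?{urlencode({'token': access_token})}"
```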
WebSocket configuration messages
Send WebSocket task messages to configure your translation pipeline, specifying the source and target languages, along with any other necessary settings.
Request/Response Structure
All request and response messages share the same structure, with two fields: `message_type` and `data`.
An error message has the text `"error"` in its `message_type` field.
{
"message_type": string,
"data": dict
}
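This envelope can be handled with a small pair of helpers; the following is an illustrative sketch, not an official client:

```python
import json

def make_message(message_type: str, data: dict) -> str:
    """Serialize an outgoing frame in the shared envelope format."""
    return json.dumps({"message_type": message_type, "data": data})

def parse_message(raw: str) -> tuple:
    """Split an incoming frame into its two envelope fields."""
    msg = json.loads(raw)
    return msg["message_type"], msg["data"]
```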
General use
- Send a `set_task` message to create a processing task. You can edit the task by sending additional `set_task` messages after the first one.
- Send an `end_task` message to gracefully stop the task.
- Send a `pause_task` message to stop audio processing without deleting the task.
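The task lifecycle above can be sketched as three message builders. The `end_task` payload shape is taken from the example further down; the empty `data` for `pause_task` is an assumption:

```python
import json

def set_task(config: dict) -> str:
    # The first set_task creates the task; later ones edit it
    return json.dumps({"message_type": "set_task", "data": config})

def pause_task() -> str:
    # Stops audio processing without deleting the task
    # (empty data payload is an assumption)
    return json.dumps({"message_type": "pause_task", "data": {}})

def end_task(force: bool = False) -> str:
    # Gracefully stops the task
    return json.dumps({"message_type": "end_task", "data": {"force": force}})
```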
Settings description
For a detailed description of each setting, refer to our translation settings breakdown section. For optimal settings values, refer to our recommended settings section.
Request message schema
Response message schema
Examples
Request chain example:
- Send a `set_task` message to create a task:
{
"message_type": "set_task",
"data": {
// Input audio stream settings
"input_stream": {
"content_type": "string",
"source": {
"type": "string"
}
},
// Translated audio stream settings
"output_stream": {
"content_type": "string",
"target": {
"type": "string"
}
},
"pipeline": {
// Preprocessing settings
"preprocessing": {
"enable_vad": "bool",
"vad_threshold": "float",
"vad_left_padding": "int",
"vad_right_padding": "int",
"pre_vad_denoise": "bool",
"pre_vad_dsp": "bool"
},
// ASR settings
"transcription": {
"source_language": "string",
"detectable_languages": ["string"],
"asr_model": "string",
"denoise": "string",
"allow_hotwords_glossaries": "bool",
"suppress_numeral_tokens": "bool",
"diarize_speakers": "bool",
"priority": "string",
"min_alignment_score": "float",
"max_alignment_cer": "float",
"segment_confirmation_silence_threshold": "float",
"only_confirm_by_silence": "bool",
"batched_inference": "bool",
"force_detect_language": "bool",
"sentence_splitter": {
"enabled": "bool",
"splitter_model": "string"
},
"verification": {
"verification_model": "string",
"allow_verification_glossaries": "bool",
"auto_transcription_correction": "bool",
"transcription_correction_style": "null or string"
}
},
// Translation and speech generation settings for one or more target languages
"translations": [
{
"target_language": "string",
"allowed_source_languages": ["string"],
"translation_model": "string",
"allow_translation_glossaries": "bool",
"style": "null or string",
"speech_generation": {
"tts_model": "string",
"voice_cloning": "bool",
"voice_id": "null or string",
"voice_timbre_detection": {
"enabled": "bool",
"high_timbre_voices": ["string"],
"low_timbre_voices": ["string"]
},
"denoise_voice_samples": "bool",
"speech_tempo_auto": "bool",
"speech_tempo_timings_factor": "float",
"speech_tempo_adjustment_factor": "float"
}
},
{
// Settings for an additional language (one more audio track will be published in WebRTC)
"target_language": "string",
"allowed_source_languages": ["string"],
"translation_model": "string",
"allow_translation_glossaries": "bool",
"style": null,
"speech_generation": {
"tts_model": "string",
"voice_cloning": "bool",
"voice_id": "null or string",
"voice_timbre_detection": {
"enabled": "bool",
"high_timbre_voices": ["string"],
"low_timbre_voices": ["string"]
},
"denoise_voice_samples": "bool",
"speech_tempo_auto": "bool",
"speech_tempo_timings_factor": "float",
"speech_tempo_adjustment_factor": "float"
}
}
],
// TTS buffer settings
"translation_queue_configs": {
"global": { // global setting
"desired_queue_level_ms": "int",
"max_queue_level_ms": "int"
},
"es": { // per-language override
"desired_queue_level_ms": "int",
"max_queue_level_ms": "int"
}
},
// Allowed WS messages
"allowed_message_types": [
"partial_transcription",
"validated_transcription",
"translated_transcription"
]
}
}
}

Note: If you want only ASR (Automatic Speech Recognition) without speech generation, you have two options:
- Set `output_stream` to `null`. You will still receive translations, but no text-to-speech (TTS) audio will be generated.
- Provide an empty list for `translations`. In this case, neither translations nor TTS will be sent.
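For instance, an ASR-only configuration (the first option) might look like the sketch below. The `source.type` value and the language fields are illustrative assumptions, not documented values; `content_type` is `"audio"`, the only permitted value per the validation error example further down:

```python
import json

asr_only = {
    "message_type": "set_task",
    "data": {
        "input_stream": {
            "content_type": "audio",       # only permitted value
            "source": {"type": "webrtc"},  # hypothetical source type
        },
        "output_stream": None,  # null => translations without TTS
        "pipeline": {
            "transcription": {"source_language": "en"},
            "translations": [{"target_language": "es"}],
            "allowed_message_types": [
                "validated_transcription",
                "translated_transcription",
            ],
        },
    },
}
payload = json.dumps(asr_only)
```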
- Send an `end_task` message to finish the task:
{
"message_type": "end_task",
"data": {
"force": false // true = stop immediately, without waiting for the last segments
}
}
Response message examples:
- A `partial_transcription` message, an uncompleted segment transcription:
{
"message_type": "partial_transcription",
"data": {
"transcription": {
"transcription_id": "19615fa2e341df9c",
"language": "en",
"text": "One, two, three, four, five...",
"segments": [
{
"text": "One, two, three, four, five...",
"start": 0.33999999999999986,
"end": 1.58,
"start_timestamp": 1744125438.080009,
"end_timestamp": 1744125439.040023
}
]
}
}
}
- A `validated_transcription` message, a complete segment transcription:
{
"message_type": "validated_transcription",
"data": {
"transcription": {
"transcription_id": "19615fa2e341df9c",
"language": "en",
"text": "One, two, three, four, five, six, seven.",
"segments": [
{
"text": "One, two, three, four, five, six, seven.",
"start": 0,
"end": 2,
"start_timestamp": 1744125437.760025,
"end_timestamp": 1744125439.679354
}
]
}
}
}
- A `translated_transcription` message, a complete segment translation:
{
"message_type": "translated_transcription",
"data": {
"transcription": {
"transcription_id": "19615fa2e341df9c",
"language": "es",
"text": "Uno, dos, tres, cuatro, cinco, seis, siete. ",
"segments": [
{
"text": "Uno, dos, tres, cuatro, cinco, seis, siete. ",
"start": 0,
"end": 2,
"start_timestamp": 1744125437.760025,
"end_timestamp": 1744125439.679354
}
]
}
}
}
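Incoming frames can be routed by their `message_type`; a minimal dispatch sketch:

```python
import json

TRANSCRIPTION_TYPES = {
    "partial_transcription",
    "validated_transcription",
    "translated_transcription",
}

def transcription_text(raw: str):
    """Return (message_type, text) for transcription frames, None otherwise."""
    msg = json.loads(raw)
    if msg["message_type"] in TRANSCRIPTION_TYPES:
        return msg["message_type"], msg["data"]["transcription"]["text"]
    return None
```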
Error message examples:
- A validation error:
{
"message_type": "error",
"data": {
"code": "VALIDATION_ERROR",
"desc": "ValidationError(model='SetTaskMessage', errors=[{'loc': ('input_stream', 'content_type'), 'msg': \"value is not a valid enumeration member; permitted: 'audio'\", 'type': 'type_error.enum', 'ctx': {'enum_values': [<StreamContentType.audio: 'audio'>]}}])",
"param": null
}
}
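Error frames can be detected by the `"error"` value of `message_type` and surfaced to the caller; a sketch, with a hypothetical exception class:

```python
import json

class PalabraApiError(Exception):  # hypothetical exception name
    pass

def raise_for_error(raw: str) -> None:
    """Raise if the frame is an error message; otherwise do nothing."""
    msg = json.loads(raw)
    if msg["message_type"] == "error":
        raise PalabraApiError(f"{msg['data']['code']}: {msg['data']['desc']}")
```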