Translation management API

Control the Palabra real-time translation pipeline over WebSockets or a WebRTC data channel.


1. Prerequisites

Before you connect, make sure you have the following:

  • webrtc_url – The URL of the Palabra WebRTC server.
  • ws_url – The URL of the Palabra WebSocket server.
  • publisher access token – A JWT used to authorize your connection.

All three values are returned when you create a streaming session.


2. Choose a transport

2.1 Option 1. WebRTC

  1. Connect with any LiveKit client to webrtc_url using your publisher token.
  2. Once the connection is open, you can start sending commands through the default (empty-topic) WebRTC data channel.

2.2 Option 2. WebSockets

Connect to ws_url, passing your publisher token as a query parameter:

// WebSocket control URL
const endpoint = `${ws_url}?token=${publisher}`;
const socket = new WebSocket(endpoint);

Once the connection is open, you can start sending commands.


3. Message format (WebRTC & WebSockets)

Every API packet, request and response alike, uses the same envelope:

{
  "message_type": "<string>",
  "data": { /* payload */ }
}

If message_type is "error", the data field contains diagnostic information.
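The envelope and the error convention above can be wrapped in two small helpers. This is a minimal sketch; the function names are illustrative, only the `{ message_type, data }` shape and the `"error"` message type come from this document.

```javascript
// Wrap a payload in the Palabra message envelope.
function buildMessage(messageType, payload) {
  return JSON.stringify({ message_type: messageType, data: payload });
}

// Parse an incoming packet; surface "error" messages as exceptions.
function parseMessage(raw) {
  const msg = JSON.parse(raw);
  if (msg.message_type === "error") {
    // For errors, `data` carries the diagnostic information.
    throw new Error(`Palabra API error: ${JSON.stringify(msg.data)}`);
  }
  return msg;
}
```

With a connected socket, a command is then just `socket.send(buildMessage("pause_task", {}))`.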


4. Typical workflow

  1. Create task – send a set_task message to start the translation.
  2. Update task – send another set_task message to update the translation settings during an ongoing translation.
  3. Pause processing – send a pause_task message to pause the translation (stops billing). Resume with another set_task.
  4. Finish task – send an end_task message (the server closes your connection automatically; the session is invalidated after 1 minute).
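The lifecycle above can be sketched as a sequence of control messages. Here `send` stands in for whatever delivers a JSON string over your chosen transport (WebSocket or the WebRTC data channel); `taskSettings` is a set_task payload as described below.

```javascript
// Sketch of the typical task lifecycle: create, pause, resume, finish.
function runLifecycle(send, taskSettings) {
  // 1. Create the task (first set_task starts the translation)
  send(JSON.stringify({ message_type: "set_task", data: taskSettings }));
  // 3. Pause processing (billing stops while paused)
  send(JSON.stringify({ message_type: "pause_task", data: {} }));
  // Resume by sending set_task again
  send(JSON.stringify({ message_type: "set_task", data: taskSettings }));
  // 4. Finish; the server then closes the connection
  send(JSON.stringify({ message_type: "end_task", data: { force: false } }));
}
```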

5. Message Settings reference


6. Streaming audio configuration

6.1 Option 1. WebRTC audio I/O configuration

Use the following input/output streams configuration in your set_task:

{
  "data": {
    "input_stream": {
      "content_type": "ws",
      "source": {
        "type": "webrtc"
      }
    },
    "output_stream": {
      "content_type": "ws",
      "target": {
        "type": "webrtc"
      }
    }
    // ...
  }
}
  1. Publish your microphone track to LiveKit Room.
  2. Subscribe to the translation tracks that Palabra will publish in the same LiveKit Room after you send the set_task message.

6.2 Option 2. WebSocket audio I/O configuration

Use this configuration instead:

{
  "data": {
    "input_stream": {
      "content_type": "ws",
      "source": {
        "type": "ws",
        "format": "opus",     // or pcm_s16le, wav
        "sample_rate": 24000, // 16000 - 24000
        "channels": 1         // 1 or 2
      }
    },
    "output_stream": {
      "content_type": "ws",
      "target": {
        "type": "ws",
        "format": "pcm_s16le" // or zlib_pcm_s16le
      }
    }
  }
}
  1. Send base-64 audio chunks that exactly match the declared format.
  2. Receive base-64 TTS chunks in output_audio_data responses:
{
  "message_type": "output_audio_data",
  "data": {
    "transcription_id": "190983855fe3404e",
    "language": "es",
    "last_chunk": false,
    "data": "<base64-encoded audio>"
  }
}
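Decoding a received chunk is a small amount of glue code. The sketch below assumes you requested pcm_s16le output in your set_task; the function name is illustrative.

```javascript
// Decode an output_audio_data payload (WebSocket audio transport,
// pcm_s16le output format: little-endian signed 16-bit samples).
function decodeAudioChunk(message) {
  const bytes = Buffer.from(message.data.data, "base64");
  const samples = new Int16Array(bytes.buffer, bytes.byteOffset, bytes.length / 2);
  return {
    language: message.data.language,
    lastChunk: message.data.last_chunk,
    samples,
  };
}
```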

7. API messages


7.1 Requests (client → server)

Message type        Short description
set_task            Create or update a translation task
end_task            Finish the translation task
get_task            Return the current task
pause_task          Pause the current task; use set_task to resume
tts_task            Generate TTS from text
input_audio_data    Input audio data chunk (WebSockets audio transport only)

set_task

Create a new task or modify the current one.

  • Sent for the first time after creating a session, it starts the translation.
  • Sent after pause_task, it resumes the translation.
  • Sent during an ongoing translation, it updates the current translation settings in real time; there is no need to stop the translation.
{
  "message_type": "set_task",
  "data": {
    "input_stream": { /* depends on transport; see the audio I/O section above */ },
    "output_stream": { /* depends on transport; see the audio I/O section above */ },
    "pipeline": {
      "transcription": {
        "source_language": "string",
        "detectable_languages": ["string"],
        "segment_confirmation_silence_threshold": "float",
        "only_confirm_by_silence": "bool",
        "sentence_splitter": {
          "enabled": "bool"
        },
        "verification": {
          "auto_transcription_correction": "bool",
          "transcription_correction_style": "string"
        }
      },
      // Translation and speech generation settings for one or more target languages
      "translations": [
        {
          "target_language": "string",
          "translate_partial_transcriptions": "bool",
          "speech_generation": {
            "voice_cloning": "bool",
            "voice_id": "string",
            "voice_timbre_detection": {
              "enabled": "bool",
              "high_timbre_voices": [],
              "low_timbre_voices": []
            }
          }
        }
        // You can add more targets
      ],
      "translation_queue_configs": {
        "global": {
          "desired_queue_level_ms": "int",
          "max_queue_level_ms": "int",
          "auto_tempo": "bool"
        },
        "es": {
          "desired_queue_level_ms": "int",
          "max_queue_level_ms": "int"
        }
      },
      // Select response types to receive
      "allowed_message_types": [
        "translated_transcription",
        "partial_transcription",
        "partial_translated_transcription",
        "validated_transcription"
      ]
    }
  }
}

See the Translation settings breakdown for details on each field and recommended settings for the translation task pipeline.

ASR Only Mode

To use Palabra AI in ASR-only mode, do either of the following:

  • Set output_stream to null. (You still receive text translations, but no TTS audio.)
  • Use an empty translations list. (You receive neither text translations nor TTS audio.)
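An ASR-only set_task can be built as below. This is a minimal sketch: the helper name and parameters are illustrative, and most optional pipeline fields are omitted; only the output_stream-null / empty-translations behavior comes from this document.

```javascript
// Build a set_task message for ASR-only mode by nulling output_stream
// (no TTS audio is generated).
function asrOnlyTask(inputStream, sourceLanguage, targetLanguages = []) {
  return {
    message_type: "set_task",
    data: {
      input_stream: inputStream,
      output_stream: null, // ASR-only: disables TTS audio
      pipeline: {
        transcription: { source_language: sourceLanguage },
        // Keep targets here to still receive text translations;
        // pass [] to disable text translations as well.
        translations: targetLanguages.map((lang) => ({ target_language: lang })),
        allowed_message_types: ["validated_transcription", "translated_transcription"],
      },
    },
  };
}
```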

end_task

Finish the current task. The server closes the connection after receiving end_task.

{
  "message_type": "end_task",
  "data": { "force": false } // set to true to skip finalization of the last phrase
}

pause_task

Pause the current task. No audio is processed and nothing is billed while the task is paused. Use set_task to resume the translation.

{ "message_type": "pause_task", "data": {} }

get_task

Return the current task.

{ "message_type": "get_task", "data": {} }

tts_task

Generates TTS from text. The text is translated into every target_language listed in the translations section of the task.

{
  "message_type": "tts_task",
  "data": {
    "text": "Hello, how are you?",
    "language": "en" // language of the text
  }
}

input_audio_data

Used to send a base64-encoded input audio chunk when WebSockets is selected as the audio transport. The audio chunks you push must match the format / sample_rate / channels declared in your set_task message. The optimal chunk length is 320 ms.

{
  "message_type": "input_audio_data",
  "data": {
    "data": "base64 encoded data"
  }
}
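Slicing raw audio into 320 ms messages can be sketched as follows. The function name is illustrative; it assumes pcm_s16le input declared in set_task, so one sample is 2 bytes per channel.

```javascript
// Split a pcm_s16le buffer into ~320 ms input_audio_data messages.
// sampleRate and channels must match what was declared in set_task.
function makeAudioMessages(pcmBuffer, sampleRate = 24000, channels = 1) {
  // samples per chunk * channels * 2 bytes per 16-bit sample
  const bytesPerChunk = Math.round(sampleRate * 0.32) * channels * 2;
  const messages = [];
  for (let off = 0; off < pcmBuffer.length; off += bytesPerChunk) {
    const chunk = pcmBuffer.subarray(off, off + bytesPerChunk);
    messages.push({
      message_type: "input_audio_data",
      data: { data: chunk.toString("base64") },
    });
  }
  return messages;
}
```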

7.2 Responses (server → client)

Message type                        Short description
partial_transcription               Unconfirmed ASR segment
partial_translated_transcription    Unconfirmed translation segment
validated_transcription             Final ASR segment
translated_transcription            Final translation
output_audio_data                   Chunk of generated TTS audio (WebSockets audio transport only)
current_task                        get_task command response
error                               Validation or runtime error
  • To receive partial_transcription, validated_transcription, and translated_transcription messages, you must include these message types in the allowed_message_types field of your set_task command.

  • To receive partial_translated_transcription messages, you must include it in the allowed_message_types field AND set translate_partial_transcriptions to true in your set_task command.
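Routing the response types above usually reduces to a switch on message_type. A minimal sketch; the handler names are illustrative, not part of the API.

```javascript
// Route server -> client messages by message_type.
function dispatch(msg, handlers) {
  switch (msg.message_type) {
    case "partial_transcription":
    case "validated_transcription":
      handlers.onTranscription?.(msg.data.transcription);
      break;
    case "partial_translated_transcription":
    case "translated_transcription":
      handlers.onTranslation?.(msg.data.transcription);
      break;
    case "output_audio_data":
      handlers.onAudio?.(msg.data);
      break;
    case "error":
      handlers.onError?.(msg.data);
      break;
  }
}
```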


partial_transcription

Uncompleted segment transcription:

{
  "message_type": "partial_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "en",
      "text": "One, two"
    }
  }
}

partial_translated_transcription

Uncompleted segment translation.

{
  "message_type": "partial_translated_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "es",
      "text": "Um, dois,"
    }
  }
}

validated_transcription

Completed segment transcription.

{
  "message_type": "validated_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "en",
      "text": "One, two, three, four, five."
    }
  }
}

translated_transcription

Completed segment translation:

{
  "message_type": "translated_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "es",
      "text": "Um, dois, três, quatro, cinco."
    }
  }
}

output_audio_data

TTS audio chunk (when you use WebSockets as the audio transport).

{
  "message_type": "output_audio_data",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "es",     // TTS language
      "last_chunk": false,  // last generated chunk for this `transcription_id`
      "data": "base64 string"
    }
  }
}

current_task

The get_task command response.

{
  "message_type": "current_task",
  "data": { /* the current task settings, same shape as the set_task "data" payload */ }
}

error

Validation, authorization or other kinds of errors.

{
  "message_type": "error",
  "data": {
    "code": "VALIDATION_ERROR",
    "desc": "ValidationError(model='SetTaskMessage', errors=[{'loc': ('input_stream', 'content_type')",
    "param": null
  }
}