Clone Voice with API
This guide walks you through the process of cloning a voice using Palabra's API.
Voice cloning via API allows you to programmatically create a voice that replicates a specific speaker by submitting a short audio sample and related metadata.
Step 1: Get API credentials
- Log in to your Palabra account
- Go to the Palabra API section
- Create a new API key or use an existing one
- Copy your Client ID and Client Secret; you'll need them to authenticate requests
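The request examples in this guide pass the credentials as ClientId and ClientSecret headers. A minimal sketch of keeping them out of source code follows; the helper name and the environment variable names PALABRA_CLIENT_ID and PALABRA_CLIENT_SECRET are assumptions made for illustration, not part of Palabra's API.

```javascript
// Hypothetical helper: reads the credentials from the environment and returns
// the two headers used by the request examples in this guide.
// The variable names below are assumptions chosen for this sketch.
function buildAuthHeaders(env = process.env) {
  const clientId = env.PALABRA_CLIENT_ID;
  const clientSecret = env.PALABRA_CLIENT_SECRET;
  if (!clientId || !clientSecret) {
    throw new Error('Set PALABRA_CLIENT_ID and PALABRA_CLIENT_SECRET before calling the API');
  }
  return { ClientId: clientId, ClientSecret: clientSecret };
}
```

The returned object can be spread into the headers of a fetch call, so the secret never has to appear in the code itself.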
Step 2: Prepare your audio sample
To ensure high-quality voice cloning, please follow the guidelines below when uploading your sample:
- Accepted formats: MP3, WAV, FLAC, WEBM, MP4, MPEG, or MPG
- Maximum file size: 10 MB
- Minimum duration: 30 seconds
- Audio quality: No background noise
- Speaker requirement: Only one speaker per sample
- Input types: Audio or video files are accepted
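The format, size, and duration limits above can be checked locally before anything is sent. The sketch below is only an illustration: the function name and input shape are hypothetical, "10 MB" is assumed to mean 10 × 1024 × 1024 bytes, and the duration has to be measured separately by the caller. Noise and speaker count are not checked.

```javascript
// Limits taken from the guidelines above.
const ACCEPTED_EXTENSIONS = ['mp3', 'wav', 'flac', 'webm', 'mp4', 'mpeg', 'mpg'];
const MAX_SIZE_BYTES = 10 * 1024 * 1024; // assumption: "10 MB" read as 10 MiB
const MIN_DURATION_SECONDS = 30;

// Hypothetical pre-flight check: returns a list of problems (empty if none).
function checkSample({ filename, sizeBytes, durationSeconds }) {
  const problems = [];
  const extension = filename.includes('.')
    ? filename.split('.').pop().toLowerCase()
    : '';
  if (!ACCEPTED_EXTENSIONS.includes(extension)) {
    problems.push(`unsupported file extension: "${extension}"`);
  }
  if (sizeBytes > MAX_SIZE_BYTES) {
    problems.push('file is larger than 10 MB');
  }
  if (durationSeconds < MIN_DURATION_SECONDS) {
    problems.push('sample is shorter than 30 seconds');
  }
  return problems;
}
```

For example, checkSample({ filename: 'voice.mp3', sizeBytes: 2000000, durationSeconds: 45 }) returns an empty array.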
Step 3: Create the voice cloning request
Voice cloning through the API is performed in two steps:
Step 1: Submit voice cloning metadata
First, send a POST request to create a voice cloning task with the metadata of your audio sample.
At this stage, you do not upload the audio file itself; only its metadata is submitted.
Endpoint
https://api.palabra.ai/saas/voice/clone
Sample payload
{
  "name": "My voice",
  "samples": [
    {
      "filename": "20250611_1453_Recording.mp3",
      "mime_type": "audio/mpeg",
      "display_name": "My voice",
      "description": "Description of my voice",
      "denoise": false,
      "lang_code": "en",
      "speech_normalization": true
    }
  ],
  "description": "Description of my voice",
  "labels": {
    "gender": null,
    "age_group": null,
    "mood": null
  }
}
Field descriptions
- name (required): A user-defined name for the cloned voice. This will be used to identify the voice in your Palabra account.
- samples (required): An array of one or more audio samples with metadata for each file.
  - filename (required): The original filename of the uploaded sample.
  - mime_type (required): MIME type of the file (e.g., audio/mpeg, audio/wav).
  - display_name (optional): Human-readable name to display in the UI.
  - description (optional): Additional information about the sample.
  - speech_normalization (optional): Whether to apply automatic speech normalization (true or false). Default is true.
  - denoise (optional): Whether to apply automatic denoising (true or false). Default is false.
  - lang_code (required): Language code of the speaker (e.g., en, uk). Used to optimize voice modeling.
- description (optional): A description of the cloned voice for internal reference.
- labels (optional): Optional metadata describing the speaker:
  - gender: One of male, female, or null.
  - age_group: One of kid, adult, senior, or null.
  - mood: One of neutral, happy, angry, sad, or null.
Note: The name and samples fields are required, and each sample must include filename, mime_type, and lang_code. All other fields are optional but recommended for better accuracy and organization.
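The field rules above lend themselves to a quick client-side check before the request is sent. This is a sketch, not Palabra's own validation: the function name is hypothetical, the required fields follow the (required) markers in the field descriptions, and the server may apply further rules that are not documented here.

```javascript
// Allowed label values, as listed in the field descriptions.
const ALLOWED_LABELS = {
  gender: ['male', 'female', null],
  age_group: ['kid', 'adult', 'senior', null],
  mood: ['neutral', 'happy', 'angry', 'sad', null],
};

// Hypothetical validator: returns a list of error messages (empty if none).
function validateClonePayload(payload) {
  const errors = [];
  if (!payload.name) errors.push('name is required');

  if (!Array.isArray(payload.samples) || payload.samples.length === 0) {
    errors.push('samples must contain at least one entry');
  } else {
    payload.samples.forEach((sample, i) => {
      for (const field of ['filename', 'mime_type', 'lang_code']) {
        if (!sample[field]) errors.push(`samples[${i}].${field} is required`);
      }
    });
  }

  // labels is optional; a missing or null value is treated as "no labels".
  for (const [label, value] of Object.entries(payload.labels ?? {})) {
    const allowed = ALLOWED_LABELS[label];
    if (!allowed) {
      errors.push(`labels.${label} is not a documented label`);
    } else if (!allowed.includes(value)) {
      errors.push(`labels.${label} has unsupported value: ${value}`);
    }
  }
  return errors;
}
```

Returning a list rather than throwing on the first problem lets a caller report every issue with a payload at once.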
Example: Voice cloning request
- JavaScript
// Metadata describing the voice and its audio sample(s); no audio is sent yet.
const payload = {
  name: "My voice",
  samples: [
    {
      filename: "20250611_1453_Recording.mp3",
      mime_type: "audio/mpeg",
      display_name: "My voice",
      description: "Description of my voice",
      denoise: false,
      lang_code: "en"
    }
  ],
  description: "Description of my voice",
  labels: {
    gender: null,
    age_group: null,
    mood: null
  }
};

const response = await fetch('https://api.palabra.ai/saas/voice/clone', {
  method: 'POST',
  headers: {
    // The body is JSON, so declare it explicitly.
    'Content-Type': 'application/json',
    'ClientId': '<YOUR_CLIENT_ID>',
    'ClientSecret': '<YOUR_CLIENT_SECRET>'
  },
  body: JSON.stringify(payload)
});

if (!response.ok) {
  const errorText = await response.text().catch(() => response.statusText);
  throw new Error(`Failed to clone voice: ${response.status} ${errorText}`);
}

// The parsed response contains the upload URL and form fields for each sample.
const voice = await response.json();
Response
{
"utc_created_at": "2025-06-19T10:52:53.893244",
"voice_id": "10545719-5dfb-4164-9b39-cc70ed2ff97d",
"user_id": "02117a4f-a847-4264-9807-704d279bbf3a",
"name": "My voice",
"voice_type": "instantly_cloned",
"processing_status": "created",
"description": "My voice",
"labels": {
"gender": null,
"age_group": null,
"mood": null
},
"lang_code": "en",
"samples": [
{
"item_id": "0",
"blob_id": "7e8344fc-4408-4ef7-942b-45d641b2877e",
"url": "https://palabra-prod-web-cdn.s3.amazonaws.com/",
"form_data": {
"acl": "private",
"bucket": "palabra-prod-web-cdn",
"key": "blob/author/instant_voice_clone_upload_input_sample/02117a4f-a847-4264-9807-704d279bbf3a/7e8344fc-4408-4ef7-942b-45d641b2877e.mp3",
"x-amz-meta-blob-id": "7e8344fc-4408-4ef7-942b-45d641b2877e",
"x-amz-meta-filename": "20250611_1453_Recording.mp3",
"Content-Type": "audio/mpeg",
"x-amz-meta-user-id": "02117a4f-a847-4264-9807-704d279bbf3a",
"x-amz-meta-intent": "instant_voice_clone_upload_input_sample",
"x-amz-meta-voice-id": "10545719-5dfb-4164-9b39-cc70ed2ff97d",
"x-amz-meta-upload-id": "10545719-5dfb-4164-9b39-cc70ed2ff97d",
"x-amz-algorithm": "AWS4-HMAC-SHA256",
"x-amz-credential": "AKIAR3HUOH7XJLBFCRWH/20250619/eu-central-1/s3/aws4_request",
"x-amz-date": "20250619T105253Z",
"policy": "eyJleHBpcmF0aW9uIjogIjIwMjUtMDYtMTlUMTE6MDc6NTNaIiwgImNvbmRpdGlvbnMiOiBbeyJhY2wiOiAicHJpdmF0ZSJ9LCB7ImJ1Y2tldCI6ICJwYWxhYnJhLXByb2Qtd2ViLWNkbiJ9LCB7ImtleSI6ICJibG9iL2F1dGhvci9pbnN0YW50X3ZvaWNlX2Nsb25lX3VwbG9hZF9pbnB1dF9zYW1wbGUvMDIxMTdhNGYtYTg0Ny00MjY0LTk4MDctNzA0ZDI3OWJiZjNhLzdlODM0NGZjLTQ0MDgtNGVmNy05NDJiLTQ1ZDY0MWIyODc3ZS5tcDMifSwgeyJ4LWFtei1tZXRhLWJsb2ItaWQiOiAiN2U4MzQ0ZmMtNDQwOC00ZWY3LTk0MmItNDVkNjQxYjI4NzdlIn0sIHsieC1hbXotbWV0YS1maWxlbmFtZSI6ICIyMDI1MDYxMV8xNDUzX1JlY29yZGluZy5tcDMifSwgeyJDb250ZW50LVR5cGUiOiAiYXVkaW8vbXBlZyJ9LCB7IngtYW16LW1ldGEtdXNlci1pZCI6ICIwMjExN2E0Zi1hODQ3LTQyNjQtOTgwNy03MDRkMjc5YmJmM2EifSwgeyJ4LWFtei1tZXRhLWludGVudCI6ICJpbnN0YW50X3ZvaWNlX2Nsb25lX3VwbG9hZF9pbnB1dF9zYW1wbGUifSwgeyJ4LWFtei1tZXRhLXZvaWNlLWlkIjogIjEwNTQ1NzE5LTVkZmItNDE2NC05YjM5LWNjNzBlZDJmZjk3ZCJ9LCB7IngtYW16LW1ldGEtdXBsb2FkLWlkIjogIjEwNTQ1NzE5LTVkZmItNDE2NC05YjM5LWNjNzBlZDJmZjk3ZCJ9LCBbImNvbnRlbnQtbGVuZ3RoLXJhbmdlIiwgMTA0ODUsIDMzNTU0NDMyXSwgeyJidWNrZXQiOiAicGFsYWJyYS1wcm9kLXdlYi1jZG4ifSwgeyJrZXkiOiAiYmxvYi9hdXRob3IvaW5zdGFudF92b2ljZV9jbG9uZV91cGxvYWRfaW5wdXRfc2FtcGxlLzAyMTE3YTRmLWE4NDctNDI2NC05ODA3LTcwNGQyNzliYmYzYS83ZTgzNDRmYy00NDA4LTRlZjctOTQyYi00NWQ2NDFiMjg3N2UubXAzIn0sIHsieC1hbXotYWxnb3JpdGhtIjogIkFXUzQtSE1BQy1TSEEyNTYifSwgeyJ4LWFtei1jcmVkZW50aWFsIjogIkFLSUFSM0hVT0g3WEpMQkZDUldILzIwMjUwNjE5L2V1LWNlbnRyYWwtMS9zMy9hd3M0X3JlcXVlc3QifSwgeyJ4LWFtei1kYXRlIjogIjIwMjUwNjE5VDEwNTI1M1oifV19",
"x-amz-signature": "b8e3c8607d7b9c66ce92da2f8a0a4b2dbd3578fdb87f671569f5208b343c7a22"
}
}
]
}
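When several samples are submitted, each entry in the response has to be matched to the right local file before uploading. The response above echoes the original filename in form_data under x-amz-meta-filename, so one possible approach is to pair on that value. The function name and the Map of local files are assumptions made for this sketch.

```javascript
// Hypothetical helper: pairs each sample in the clone response with a local
// file, keyed by the filename echoed back in form_data.
// filesByName is assumed to be a Map from filename to a File or Blob.
function pairSamplesWithFiles(cloneResponse, filesByName) {
  return cloneResponse.samples.map((sample) => {
    const filename = sample.form_data['x-amz-meta-filename'];
    const file = filesByName.get(filename);
    if (!file) {
      throw new Error(`No local file found for sample "${filename}"`);
    }
    return { sample, file };
  });
}
```

Each resulting pair carries the url and form_data needed for the upload step, together with the file to send.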
Step 2: Upload the audio file
For each entry in the samples array returned in Step 1, use its url and form_data fields to upload your audio file via POST.
Example: Upload request
- JavaScript
// Uploads one file using the url and form_data of a sample returned in Step 1.
async function uploadFile(sample, file) {
  const formData = new FormData();
  // Copy every signed form field from the response as-is.
  for (const [key, value] of Object.entries(sample.form_data)) {
    formData.append(key, value);
  }
  // The file is appended last, after all the form fields.
  formData.append('file', file, file.name);

  const response = await fetch(sample.url, {
    method: 'POST',
    body: formData,
    headers: {
      'ClientId': '<YOUR_CLIENT_ID>',
      'ClientSecret': '<YOUR_CLIENT_SECRET>'
    },
  });

  if (!response.ok) {
    let errorText;
    try {
      errorText = await response.text();
    } catch {
      errorText = response.statusText;
    }
    throw new Error(`Failed to upload file: ${response.status} ${errorText}`);
  }
}
Once the file is successfully uploaded, the system will automatically begin processing the sample.
You can track the status of the voice cloning task via the https://api.palabra.ai/saas/voice/m/${id} endpoint.
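Tracking the status usually means polling until processing moves on. The generic sketch below takes the status lookup and the stop condition as arguments, because this guide does not document the status endpoint's response. In the usage comment, the GET method, the use of voice_id as ${id}, the processing_status field, and stopping on any value other than "created" are all assumptions.

```javascript
// Hypothetical polling helper: calls getStatus repeatedly until isDone
// returns true, waiting intervalMs between attempts.
async function pollUntil(getStatus, isDone, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    if (isDone(status)) return status;
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Status did not settle after ${maxAttempts} attempts`);
}

// Usage sketch (not executed here; see the assumptions in the lead-in):
// const result = await pollUntil(
//   async () => {
//     const res = await fetch(`https://api.palabra.ai/saas/voice/m/${voiceId}`, {
//       headers: { ClientId: '<YOUR_CLIENT_ID>', ClientSecret: '<YOUR_CLIENT_SECRET>' },
//     });
//     if (!res.ok) throw new Error(`Status check failed: ${res.status}`);
//     return res.json();
//   },
//   (body) => body.processing_status !== 'created'
// );
```

Capping the number of attempts keeps a stuck task from polling forever; the interval and limit shown are arbitrary defaults.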