Clone Voice with API

This guide walks you through the process of cloning a voice using Palabra's API.
Voice cloning via API allows you to programmatically create a voice that replicates a specific speaker by submitting a short audio sample and related metadata.

Step 1: Get API credentials

Log in to your Palabra account
Go to the Palabra API section
Create a new API key or use an existing one
Copy your Client ID and Client Secret — you'll need them to authenticate requests
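
One way to keep the Client ID and Client Secret out of source code is to read them from environment variables and build the request headers in one place. The sketch below is only an illustration: the helper name buildAuthHeaders and the variable names PALABRA_CLIENT_ID and PALABRA_CLIENT_SECRET are assumptions made for this example, not something Palabra defines. Only the ClientId and ClientSecret header names come from the request examples in this guide.

```javascript
// Hypothetical helper: builds the authentication headers used by the requests in this guide.
// The environment variable names are assumptions chosen for this example.
function buildAuthHeaders(env) {
  const clientId = env.PALABRA_CLIENT_ID;
  const clientSecret = env.PALABRA_CLIENT_SECRET;

  if (!clientId || !clientSecret) {
    throw new Error('Set PALABRA_CLIENT_ID and PALABRA_CLIENT_SECRET before calling the API');
  }

  return { ClientId: clientId, ClientSecret: clientSecret };
}

// Usage in Node.js: const headers = buildAuthHeaders(process.env);
```

Passing the environment object in as an argument, rather than reading process.env inside the function, keeps the helper easy to test.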


Step 2: Prepare your audio sample

To ensure high-quality voice cloning, please follow the guidelines below when uploading your sample:

Accepted formats: MP3, WAV, FLAC, WEBM, MP4, MPEG, or MPG
Maximum file size: 10 MB
Minimum duration: 30 seconds
Audio quality: No background noise
Speaker requirement: Only one speaker per sample
Input types: Audio or video files are accepted
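
The format, size, and duration limits above can be checked locally before any request is sent. The following is a minimal sketch, assuming that "10 MB" means 10 × 1024 × 1024 bytes and that the caller already knows the duration of the recording. The helper name checkSample and its argument shape are hypothetical.

```javascript
// Limits taken from the guidelines above; the byte interpretation of "10 MB" is an assumption.
const ACCEPTED_EXTENSIONS = ['mp3', 'wav', 'flac', 'webm', 'mp4', 'mpeg', 'mpg'];
const MAX_SIZE_BYTES = 10 * 1024 * 1024;
const MIN_DURATION_SECONDS = 30;

// Hypothetical helper: returns a list of problems; an empty list means the basic checks passed.
function checkSample({ filename, sizeBytes, durationSeconds }) {
  const problems = [];
  const extension = filename.includes('.') ? filename.split('.').pop().toLowerCase() : '';

  if (!ACCEPTED_EXTENSIONS.includes(extension)) {
    problems.push(`Unsupported format: "${extension || 'none'}"`);
  }
  if (sizeBytes > MAX_SIZE_BYTES) {
    problems.push('File is larger than 10 MB');
  }
  if (durationSeconds < MIN_DURATION_SECONDS) {
    problems.push('Recording is shorter than 30 seconds');
  }
  return problems;
}
```

Background noise and the number of speakers cannot be verified this way and still need to be checked by listening to the recording.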

Step 3: Create the voice cloning request

Voice cloning through the API is performed in two steps:

Step 1: Submit voice cloning metadata

First, send a POST request to create a voice cloning task with the metadata of your audio sample.
At this stage, you do not upload the audio file itself — only its metadata is submitted.

Endpoint

https://api.palabra.ai/saas/voice/clone

Sample payload

{
  "name": "My voice",
  "samples": [
    {
      "filename": "20250611_1453_Recording.mp3",
      "mime_type": "audio/mpeg",
      "display_name": "My voice",
      "description": "Description of my voice",
      "denoise": false,
      "lang_code": "en",
      "speech_normalization": true
    }
  ],
  "description": "Description of my voice",
  "labels": {
    "gender": null,
    "age_group": null,
    "mood": null
  }
}

Field descriptions

`name` (required)

A user-defined name for the cloned voice. This will be used to identify the voice in your Palabra account.

`samples` (required)

An array of one or more audio samples with metadata for each file.

filename (required) – The original filename of the uploaded sample.
mime_type (required) – MIME type of the file (e.g., audio/mpeg, audio/wav).
display_name (optional) – Human-readable name to display in the UI.
description (optional) – Additional information about the sample.
speech_normalization (optional) – Whether to apply automatic speech normalization (true or false). Default is true.
denoise (optional) – Whether to apply automatic denoising (true or false). Default is false.
lang_code (required) – Language code of the speaker (e.g., en, uk). Used to optimize voice modeling.
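
A sample entry can be assembled from a filename and a language code, with the documented defaults filled in. This is a sketch rather than an official helper: buildSampleMetadata is a hypothetical name, and the MIME type table lists commonly used values for the accepted formats. Only audio/mpeg and audio/wav appear in this guide, so the remaining mappings are assumptions.

```javascript
// Commonly used MIME types per extension. Only audio/mpeg and audio/wav are named in this guide;
// the other entries are assumptions and may need adjusting (e.g. audio/webm for audio-only WEBM).
const MIME_BY_EXTENSION = new Map([
  ['mp3', 'audio/mpeg'],
  ['wav', 'audio/wav'],
  ['flac', 'audio/flac'],
  ['webm', 'video/webm'],
  ['mp4', 'video/mp4'],
  ['mpeg', 'video/mpeg'],
  ['mpg', 'video/mpeg'],
]);

// Hypothetical helper: builds one entry of the "samples" array.
function buildSampleMetadata(filename, langCode, options = {}) {
  const extension = filename.split('.').pop().toLowerCase();
  const mimeType = MIME_BY_EXTENSION.get(extension);
  if (!mimeType) {
    throw new Error(`Cannot infer MIME type for "${filename}"`);
  }

  return {
    filename,
    mime_type: mimeType,
    lang_code: langCode,
    // Defaults documented above: normalization on, denoising off.
    speech_normalization: true,
    denoise: false,
    // Caller-supplied fields (display_name, description, ...) override the defaults.
    ...options,
  };
}
```

For example, buildSampleMetadata('20250611_1453_Recording.mp3', 'en', { display_name: 'My voice' }) produces an entry with mime_type set to audio/mpeg and both defaults applied.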

`description` (optional)

A description of the cloned voice for internal reference.

`labels` (optional)

Optional metadata describing the speaker:

gender – One of male, female, or null.
age_group – One of kid, adult, senior, or null.
mood – One of neutral, happy, angry, sad, or null.

Note: The name and samples fields are required, and each sample must include filename, mime_type, and lang_code. All other fields are optional but recommended for better accuracy and organization.
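
The required fields and the allowed label values described above can be checked before the request is sent. The sketch below assumes a hypothetical validateClonePayload helper. How the API itself reacts to missing fields or unknown labels is not covered in this guide, so treating unknown label names as errors is an assumption of this example.

```javascript
// Allowed label values as listed in the field descriptions above.
const ALLOWED_LABELS = new Map([
  ['gender', ['male', 'female', null]],
  ['age_group', ['kid', 'adult', 'senior', null]],
  ['mood', ['neutral', 'happy', 'angry', 'sad', null]],
]);

// Hypothetical helper: returns a list of error messages; an empty list means the payload looks valid.
function validateClonePayload(payload) {
  const errors = [];

  if (!payload.name) errors.push('name is required');

  if (!Array.isArray(payload.samples) || payload.samples.length === 0) {
    errors.push('samples must contain at least one entry');
  } else {
    payload.samples.forEach((sample, index) => {
      for (const field of ['filename', 'mime_type', 'lang_code']) {
        if (!sample[field]) errors.push(`samples[${index}].${field} is required`);
      }
    });
  }

  for (const [label, value] of Object.entries(payload.labels ?? {})) {
    const allowed = ALLOWED_LABELS.get(label);
    if (!allowed) {
      errors.push(`Unknown label: ${label}`);
    } else if (!allowed.includes(value)) {
      errors.push(`Invalid value for labels.${label}: ${value}`);
    }
  }

  return errors;
}
```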

Example: Voice cloning request

JavaScript

const payload = {
  name: "My voice",
  samples: [
    {
      filename: "20250611_1453_Recording.mp3",
      mime_type: "audio/mpeg",
      display_name: "My voice",
      description: "Description of my voice",
      denoise: false,
      lang_code: "en"
    }
  ],
  description: "Description of my voice",
  labels: {
    gender: null,
    age_group: null,
    mood: null
  }
};

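// Send the metadata as JSON; the audio file itself is uploaded in the next step.
const response = await fetch('https://api.palabra.ai/saas/voice/clone', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'ClientId': '<YOUR_CLIENT_ID>',
    'ClientSecret': '<YOUR_CLIENT_SECRET>'
  },
  body: JSON.stringify(payload)
});

if (!response.ok) {
  const errorText = await response.text().catch(() => response.statusText);
  throw new Error(`Failed to clone voice: ${response.status} ${errorText}`);
}

// The parsed body contains the voice_id and, for each sample, the upload url and form_data.
const voice = await response.json();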

Response

{
    "utc_created_at": "2025-06-19T10:52:53.893244",
    "voice_id": "10545719-5dfb-4164-9b39-cc70ed2ff97d",
    "user_id": "02117a4f-a847-4264-9807-704d279bbf3a",
    "name": "My voice",
    "voice_type": "instantly_cloned",
    "processing_status": "created",
    "description": "My voice",
    "labels": {
        "gender": null,
        "age_group": null,
        "mood": null
    },
    "lang_code": "en",
    "samples": [
        {
            "item_id": "0",
            "blob_id": "7e8344fc-4408-4ef7-942b-45d641b2877e",
            "url": "https://palabra-prod-web-cdn.s3.amazonaws.com/",
            "form_data": {
                "acl": "private",
                "bucket": "palabra-prod-web-cdn",
                "key": "blob/author/instant_voice_clone_upload_input_sample/02117a4f-a847-4264-9807-704d279bbf3a/7e8344fc-4408-4ef7-942b-45d641b2877e.mp3",
                "x-amz-meta-blob-id": "7e8344fc-4408-4ef7-942b-45d641b2877e",
                "x-amz-meta-filename": "20250611_1453_Recording.mp3",
                "Content-Type": "audio/mpeg",
                "x-amz-meta-user-id": "02117a4f-a847-4264-9807-704d279bbf3a",
                "x-amz-meta-intent": "instant_voice_clone_upload_input_sample",
                "x-amz-meta-voice-id": "10545719-5dfb-4164-9b39-cc70ed2ff97d",
                "x-amz-meta-upload-id": "10545719-5dfb-4164-9b39-cc70ed2ff97d",
                "x-amz-algorithm": "AWS4-HMAC-SHA256",
                "x-amz-credential": "AKIAR3HUOH7XJLBFCRWH/20250619/eu-central-1/s3/aws4_request",
                "x-amz-date": "20250619T105253Z",
                "policy": "eyJleHBpcmF0aW9uIjogIjIwMjUtMDYtMTlUMTE6MDc6NTNaIiwgImNvbmRpdGlvbnMiOiBbeyJhY2wiOiAicHJpdmF0ZSJ9LCB7ImJ1Y2tldCI6ICJwYWxhYnJhLXByb2Qtd2ViLWNkbiJ9LCB7ImtleSI6ICJibG9iL2F1dGhvci9pbnN0YW50X3ZvaWNlX2Nsb25lX3VwbG9hZF9pbnB1dF9zYW1wbGUvMDIxMTdhNGYtYTg0Ny00MjY0LTk4MDctNzA0ZDI3OWJiZjNhLzdlODM0NGZjLTQ0MDgtNGVmNy05NDJiLTQ1ZDY0MWIyODc3ZS5tcDMifSwgeyJ4LWFtei1tZXRhLWJsb2ItaWQiOiAiN2U4MzQ0ZmMtNDQwOC00ZWY3LTk0MmItNDVkNjQxYjI4NzdlIn0sIHsieC1hbXotbWV0YS1maWxlbmFtZSI6ICIyMDI1MDYxMV8xNDUzX1JlY29yZGluZy5tcDMifSwgeyJDb250ZW50LVR5cGUiOiAiYXVkaW8vbXBlZyJ9LCB7IngtYW16LW1ldGEtdXNlci1pZCI6ICIwMjExN2E0Zi1hODQ3LTQyNjQtOTgwNy03MDRkMjc5YmJmM2EifSwgeyJ4LWFtei1tZXRhLWludGVudCI6ICJpbnN0YW50X3ZvaWNlX2Nsb25lX3VwbG9hZF9pbnB1dF9zYW1wbGUifSwgeyJ4LWFtei1tZXRhLXZvaWNlLWlkIjogIjEwNTQ1NzE5LTVkZmItNDE2NC05YjM5LWNjNzBlZDJmZjk3ZCJ9LCB7IngtYW16LW1ldGEtdXBsb2FkLWlkIjogIjEwNTQ1NzE5LTVkZmItNDE2NC05YjM5LWNjNzBlZDJmZjk3ZCJ9LCBbImNvbnRlbnQtbGVuZ3RoLXJhbmdlIiwgMTA0ODUsIDMzNTU0NDMyXSwgeyJidWNrZXQiOiAicGFsYWJyYS1wcm9kLXdlYi1jZG4ifSwgeyJrZXkiOiAiYmxvYi9hdXRob3IvaW5zdGFudF92b2ljZV9jbG9uZV91cGxvYWRfaW5wdXRfc2FtcGxlLzAyMTE3YTRmLWE4NDctNDI2NC05ODA3LTcwNGQyNzliYmYzYS83ZTgzNDRmYy00NDA4LTRlZjctOTQyYi00NWQ2NDFiMjg3N2UubXAzIn0sIHsieC1hbXotYWxnb3JpdGhtIjogIkFXUzQtSE1BQy1TSEEyNTYifSwgeyJ4LWFtei1jcmVkZW50aWFsIjogIkFLSUFSM0hVT0g3WEpMQkZDUldILzIwMjUwNjE5L2V1LWNlbnRyYWwtMS9zMy9hd3M0X3JlcXVlc3QifSwgeyJ4LWFtei1kYXRlIjogIjIwMjUwNjE5VDEwNTI1M1oifV19",
                "x-amz-signature": "b8e3c8607d7b9c66ce92da2f8a0a4b2dbd3578fdb87f671569f5208b343c7a22"
            }
        }
    ]
}
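
The form_data.policy value in the response above is a base64-encoded S3 POST policy, which is a JSON document with an expiration field. Decoding the example policy gives an expiration of 2025-06-19T11:07:53Z, which is 15 minutes after the creation time in this response. Whether that window is always the same is not stated in this guide. The sketch below, with the hypothetical helper name listUploadTargets, collects the upload details for each sample together with that deadline.

```javascript
// Hypothetical helper: summarizes where, and until when, each sample can be uploaded.
function listUploadTargets(cloneResponse) {
  return cloneResponse.samples.map((sample) => {
    // An S3 POST policy is base64-encoded JSON that includes an "expiration" timestamp.
    const policy = JSON.parse(atob(sample.form_data.policy));

    return {
      filename: sample.form_data['x-amz-meta-filename'],
      url: sample.url,
      fields: sample.form_data,
      expiresAt: new Date(policy.expiration),
    };
  });
}
```

Comparing expiresAt with the current time before uploading makes it possible to request a new task instead of sending a file that S3 would reject because the policy has expired.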

Step 2: Upload the audio file

For each entry in the samples array returned in Step 1, use its url and form_data values to upload the corresponding audio file in a multipart/form-data POST request.

Example: Upload request

JavaScript

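// Uploads one file using the url and form_data of a sample from the Step 1 response.
async function uploadFile(sample, file) {
  const formData = new FormData();

  // Copy every pre-signed field into the form unchanged.
  for (const [key, value] of Object.entries(sample.form_data)) {
    formData.append(key, value);
  }

  // Append the file last: S3 ignores form fields that come after the file field.
  formData.append('file', file, file.name);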

  const response = await fetch(sample.url, {
    method: 'POST',
    body: formData,
    headers: {
      'ClientId': '<YOUR_CLIENT_ID>',
      'ClientSecret': '<YOUR_CLIENT_SECRET>'
    },
  });

  if (!response.ok) {
    let errorText;
    try {
      errorText = await response.text();
    } catch {
      errorText = response.statusText;
    }
    throw new Error(`Failed to upload file: ${response.status} ${errorText}`);
  }
}

Once the file is successfully uploaded, the system will automatically begin processing the sample. You can track the status of the voice cloning task via the https://api.palabra.ai/saas/voice/m/${id} endpoint.
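
Tracking the task usually means polling the status endpoint until processing finishes. This guide does not describe the shape of the status response or list the possible processing_status values beyond "created", so the sketch below leaves both to the caller. pollUntilFinished, fetchStatus, and isFinished are hypothetical names, and the default interval and attempt count are arbitrary.

```javascript
// Hypothetical polling helper.
// fetchStatus: async function that requests the status endpoint and returns the parsed body.
// isFinished: function that decides, from that body, whether processing has ended.
async function pollUntilFinished(fetchStatus, isFinished, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const voice = await fetchStatus();
    if (isFinished(voice)) {
      return voice;
    }
    // Wait before the next attempt, but not after the last one.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Voice was not ready after ${maxAttempts} attempts`);
}
```

In practice, fetchStatus would send a GET request to the status endpoint with the same ClientId and ClientSecret headers, and isFinished would compare the returned status with whatever terminal values the API reports.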
