Quick start

Enroll a voiceprint and verify a speaker in under 5 minutes.

1. Get your API key

Sign up at the dashboard with Google or email — your API keys are generated instantly. Keys are prefixed vx_live_ for production and vx_test_ for testing. You can create additional keys and revoke existing ones from the dashboard at any time.

2. Enroll a voiceprint

Send a voice sample (minimum 10 seconds; WAV, MP3, OGG, or any other supported format) with a subject ID you define.

cURL

curl https://api.vxid.dev/v1/subjects/enroll \
  -H "X-API-Key: vx_live_YOUR_KEY" \
  -F audio=@voice_sample.wav \
  -F subject_id="SUBJ-4821" \
  -F subject_name="Jane Doe"

Python

import requests

# Open the sample in binary mode; the with-block closes it after the upload
with open("voice_sample.wav", "rb") as audio:
    resp = requests.post(
        "https://api.vxid.dev/v1/subjects/enroll",
        headers={"X-API-Key": "vx_live_YOUR_KEY"},
        files={"audio": audio},
        data={
            "subject_id": "SUBJ-4821",
            "subject_name": "Jane Doe",
        },
    )
print(resp.json())

Node.js

// Node 18+: fetch, FormData, and Blob are built in
import fs from "fs";

const form = new FormData();
// The built-in FormData takes a Blob, not a read stream
form.append("audio", new Blob([fs.readFileSync("voice_sample.wav")]), "voice_sample.wav");
form.append("subject_id", "SUBJ-4821");
form.append("subject_name", "Jane Doe");

const resp = await fetch("https://api.vxid.dev/v1/subjects/enroll", {
  method: "POST",
  headers: { "X-API-Key": "vx_live_YOUR_KEY" },
  body: form,
});
console.log(await resp.json());

Go

// Imports: bytes, io, log, mime/multipart, net/http, os
audioFile, err := os.Open("voice_sample.wav")
if err != nil {
    log.Fatal(err)
}
defer audioFile.Close()

body := &bytes.Buffer{}
w := multipart.NewWriter(body)
part, _ := w.CreateFormFile("audio", "voice_sample.wav")
io.Copy(part, audioFile)
w.WriteField("subject_id", "SUBJ-4821")
w.WriteField("subject_name", "Jane Doe")
w.Close() // writes the closing boundary; call before sending

req, _ := http.NewRequest("POST",
    "https://api.vxid.dev/v1/subjects/enroll", body)
req.Header.Set("X-API-Key", "vx_live_YOUR_KEY")
req.Header.Set("Content-Type", w.FormDataContentType())
resp, err := http.DefaultClient.Do(req)
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()

3. Verify the speaker

Send live audio with the same subject ID. VXID returns a verdict: PASS, FAIL, or REVIEW.

cURL

curl https://api.vxid.dev/v1/subjects/verify \
  -H "X-API-Key: vx_live_YOUR_KEY" \
  -F audio=@live_audio.wav \
  -F subject_id="SUBJ-4821"

Python

import requests

resp = requests.post(
    "https://api.vxid.dev/v1/subjects/verify",
    headers={"X-API-Key": "vx_live_YOUR_KEY"},
    files={"audio": open("live_audio.wav", "rb")},
    data={"subject_id": "SUBJ-4821"},
)
# The verdict for this check is nested under "check"
verdict = resp.json()["check"]["verdict"]
print(f"Verdict: {verdict}")

Node.js

// Node 18+: fetch, FormData, and Blob are built in
import fs from "fs";

const form = new FormData();
// The built-in FormData takes a Blob, not a read stream
form.append("audio", new Blob([fs.readFileSync("live_audio.wav")]), "live_audio.wav");
form.append("subject_id", "SUBJ-4821");

const resp = await fetch("https://api.vxid.dev/v1/subjects/verify", {
  method: "POST",
  headers: { "X-API-Key": "vx_live_YOUR_KEY" },
  body: form,
});
const { check } = await resp.json();
console.log(`Verdict: ${check.verdict}`);

Go

// Same multipart pattern as enroll
// POST to /v1/subjects/verify with audio + subject_id

4. Check the response

{
  "session_id": "sess_7f14a...",
  "subject_id": "SUBJ-4821",
  "check": {
    "check_id": "chk_9b2e...",
    "check_number": 1,
    "verdict": "PASS",
    "combined_score": 0.8545,
    "layers": [
      { "layer": "identity", "score": 0.87, "threshold": 0.65, "passed": true },
      { "layer": "authenticity", "score": 0.96, "threshold": 0.5, "passed": true },
      { "layer": "liveness", "score": 0.91, "threshold": 0.5, "passed": true }
    ],
    "audio_duration_sec": 10.0
  },
  "session_verdict": "PASS",
  "session_checks_count": 1
}

That's it. Two calls to verify any voice identity. Everything below is reference for building production integrations.

Authentication

Every API request requires an API key in the X-API-Key header.

X-API-Key: vx_live_k8xP2m9...

Getting your keys

Sign up at the dashboard with Google or email. Both a test and live key are generated automatically on signup. You can create additional keys, label them, and revoke them from the API Keys page in your dashboard.

Prefix     Environment   Billing
vx_live_   Production    Metered: all usage is billed
vx_test_   Testing       Free: not billed, same API behavior

Keys are scoped to an organization. All data (subjects, sessions, usage) is isolated per organization via row-level security. Full keys are shown only once at creation — store them securely.

Never expose API keys in client-side code, Git repositories, or logs. If a key is compromised, rotate it immediately from the dashboard or by contacting support.
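
One way to keep keys out of source code is to read them from the environment at startup. The sketch below assumes an environment variable named VXID_API_KEY (a name chosen for this example, not something VXID defines) and checks the documented vx_live_ / vx_test_ prefixes before building the X-API-Key header. The redact helper is likewise hypothetical and only shows one way to keep full keys out of logs.

```python
import os

VALID_PREFIXES = ("vx_live_", "vx_test_")  # prefixes documented above


def auth_headers(env=os.environ, var_name="VXID_API_KEY"):
    """Build the X-API-Key header from an environment variable.

    VXID_API_KEY is an assumed variable name chosen for this example.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    if not key.startswith(VALID_PREFIXES):
        raise RuntimeError(f"{var_name} does not look like a VXID key")
    return {"X-API-Key": key}


def redact(key):
    """Return a log-safe form of a key: the prefix plus the last 4 characters."""
    prefix = key[:8]  # "vx_live_" and "vx_test_" are both 8 characters long
    return f"{prefix}...{key[-4:]}"
```

Pass the returned dictionary as the headers argument of your HTTP client, and log redact(key) rather than the key itself.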

Errors & rate limits

HTTP status codes

Code   Meaning
200    Success
400    Bad request: invalid audio, missing fields, or audio too short
401    Unauthorized: missing or invalid API key
404    Not found: subject or session doesn't exist
409    Conflict: subject already enrolled (delete first to re-enroll)
429    Rate limited: too many requests
500    Server error: retry or contact support

Error response format

{
  "error": "Audio too short",
  "detail": "Audio duration 3.2s is below the 10.0s minimum for enrollment.",
  "code": "AUDIO_TOO_SHORT"
}
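
A client can turn this error body into a typed exception so callers can branch on the code field. The following is a minimal sketch, assuming every non-200 response carries the three fields shown above. VXIDError and raise_for_error are hypothetical names, not part of an official SDK, and the "UNKNOWN" fallback is an assumption for bodies that are not JSON.

```python
import json


class VXIDError(Exception):
    """Raised for non-200 responses; a hypothetical wrapper, not an SDK class."""

    def __init__(self, status, code, message, detail):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.detail = detail


def raise_for_error(status, body_text):
    """Raise VXIDError unless status is 200, using the documented error fields."""
    if status == 200:
        return
    try:
        body = json.loads(body_text)
    except ValueError:
        body = {}  # body was not JSON (for example, a proxy error page)
    if not isinstance(body, dict):
        body = {}
    raise VXIDError(
        status,
        body.get("code", "UNKNOWN"),  # assumed fallback value
        body.get("error", "Unknown error"),
        body.get("detail", ""),
    )
```

With requests, this would be called as raise_for_error(resp.status_code, resp.text) before reading resp.json().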

Rate limits

Default: 60 requests/minute per API key. Enterprise plans get custom limits. Rate limit headers are included in every response:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 58
X-RateLimit-Reset: 1716825600
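
These headers let a client pause before it hits a 429. The sketch below assumes X-RateLimit-Reset is a Unix timestamp in seconds; the example value has that shape, but the unit is not stated here, so treat it as an assumption. The function name seconds_until_reset is illustrative.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request.

    Assumes X-RateLimit-Reset is a Unix timestamp in seconds.
    Returns 0.0 while requests remain in the current window.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

The lookup uses exact header names on a plain dict. The headers object returned by requests is case-insensitive, so it can be passed in directly.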

Supported audio formats

VXID accepts any audio format that FFmpeg can decode. You do not need to pre-convert audio — send whatever your platform produces natively.

Format        Extension      Typical source
WAV (PCM)     .wav           Desktop recording software, Python scripts
WebM / Opus   .webm, .opus   Browser MediaRecorder API (Chrome, Firefox, Edge)
MP3           .mp3           General purpose, widely supported
OGG Vorbis    .ogg           Firefox MediaRecorder, open-source tools
M4A / AAC     .m4a, .aac     iOS devices, Safari, Apple ecosystem
FLAC          .flac          Lossless recording tools
AMR           .amr           Telephony systems, Android voice recorder
CAF           .caf           macOS / iOS Core Audio
Raw PCM       .raw           Direct microphone capture, WebRTC streams

Recommended: 16 kHz sample rate, mono channel, at least 10 seconds for enrollment and 2 seconds for verification. Higher-quality audio produces more accurate voiceprints, but VXID handles resampling and channel conversion automatically.
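
Checking clip length locally avoids a round trip that would end in a 400 for audio that is too short. The sketch below uses Python's standard wave module, so it only covers PCM WAV input; other formats would need a decoder. The helper names are illustrative, and silent_wav exists only to make the example self-contained.

```python
import io
import wave

MIN_ENROLL_SEC = 10.0  # minimums stated in this guide
MIN_VERIFY_SEC = 2.0


def wav_duration_sec(wav_bytes):
    """Return the duration of a PCM WAV file given its raw bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()


def long_enough(wav_bytes, purpose="enroll"):
    """Check a WAV clip against the enrollment or verification minimum."""
    minimum = MIN_ENROLL_SEC if purpose == "enroll" else MIN_VERIFY_SEC
    return wav_duration_sec(wav_bytes) >= minimum


def silent_wav(seconds, rate=16000):
    """Build a mono 16-bit silent WAV in memory (demo input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()
```

For a real file, read it with open(path, "rb").read() and pass the bytes to long_enough before uploading.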

Audio capture guide

Platform-specific examples for capturing audio and sending it to VXID.

Browser (JavaScript)

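Use the MediaRecorder API to capture from the microphone. Chrome, Firefox, and Edge produce WebM/Opus by default, which you can send directly to the JSON endpoint. Safari typically produces M4A/AAC (see the formats table), so set audio_format to match what the recorder actually produced.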

// Request mic access and record 15 seconds
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm;codecs=opus' });
const chunks = [];

recorder.ondataavailable = (e) => chunks.push(e.data);
recorder.onstop = async () => {
  stream.getTracks().forEach(t => t.stop());

  // Convert to base64
  const blob = new Blob(chunks, { type: 'audio/webm' });
  const base64 = await new Promise(resolve => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(blob);
  });

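  // Send to VXID. In production, route this call through your backend so the
  // API key never ships in client-side code (see Authentication).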
  const resp = await fetch('https://api.vxid.dev/v1/subjects/enroll/json', {
    method: 'POST',
    headers: {
      'X-API-Key': 'vx_live_YOUR_KEY',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      audio_base64: base64,
      audio_format: 'webm',
      subject_id: 'SUBJ-4821',
    }),
  });
  console.log(await resp.json());
};

recorder.start();
setTimeout(() => recorder.stop(), 15000); // 15 seconds

Python

Use sounddevice for mic capture, or send any audio file directly.

import sounddevice as sd
import scipy.io.wavfile as wav
import requests

# Record 15 seconds from microphone
fs = 16000
duration = 15
print("Speak now...")
audio = sd.rec(int(duration * fs), samplerate=fs, channels=1, dtype='int16')
sd.wait()
wav.write("sample.wav", fs, audio)

# Enroll via file upload
resp = requests.post(
    "https://api.vxid.dev/v1/subjects/enroll",
    headers={"X-API-Key": "vx_live_YOUR_KEY"},
    files={"audio": open("sample.wav", "rb")},
    data={"subject_id": "SUBJ-4821"},
)
print(resp.json())

# Or send any existing audio file (MP3, M4A, OGG — anything)
resp = requests.post(
    "https://api.vxid.dev/v1/subjects/verify",
    headers={"X-API-Key": "vx_live_YOUR_KEY"},
    files={"audio": open("call_recording.mp3", "rb")},
    data={"subject_id": "SUBJ-4821"},
)

Node.js

Use node-record-lpcm16 for mic capture, or send files with FormData.

import fs from 'fs';

// Send any audio file; the format doesn't matter.
// Node 18+ built-in FormData takes a Blob, not a read stream.
const form = new FormData();
form.append('audio', new Blob([fs.readFileSync('recording.webm')]), 'recording.webm');
form.append('subject_id', 'SUBJ-4821');

const resp = await fetch('https://api.vxid.dev/v1/subjects/enroll', {
  method: 'POST',
  headers: { 'X-API-Key': 'vx_live_YOUR_KEY' },
  body: form,
});

// Or use base64 for browser-captured audio
const audioBuffer = fs.readFileSync('recording.webm');
const resp2 = await fetch('https://api.vxid.dev/v1/subjects/enroll/json', {
  method: 'POST',
  headers: {
    'X-API-Key': 'vx_live_YOUR_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    audio_base64: audioBuffer.toString('base64'),
    audio_format: 'webm',
    subject_id: 'SUBJ-4821',
  }),
});

Mobile (React Native / Expo)

iOS produces M4A/AAC; Android produces OGG or AMR. Send either one, and VXID handles both.

import { Audio } from 'expo-av';
import * as FileSystem from 'expo-file-system';

// Record audio
const recording = new Audio.Recording();
await recording.prepareToRecordAsync(
  Audio.RecordingOptionsPresets.HIGH_QUALITY
);
await recording.startAsync();

// ... wait for recording ...

await recording.stopAndUnloadAsync();
const uri = recording.getURI(); // file:///...recording.m4a

// Read as base64 and send
const base64 = await FileSystem.readAsStringAsync(uri, {
  encoding: FileSystem.EncodingType.Base64,
});

const resp = await fetch('https://api.vxid.dev/v1/subjects/enroll/json', {
  method: 'POST',
  headers: {
    'X-API-Key': 'vx_live_YOUR_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    audio_base64: base64,
    audio_format: 'm4a',  // iOS default; use 'ogg' for Android
    subject_id: 'SUBJ-4821',
  }),
});

Telephony (Twilio / Exotel)

Call recording APIs provide audio URLs. Download and forward to VXID.

# Download Twilio recording and send to VXID
import requests

# Twilio provides recording URL
recording_url = "https://api.twilio.com/2010-04-01/Accounts/.../Recordings/RE.../Recording.mp3"
download = requests.get(recording_url)
download.raise_for_status()  # stop here if the recording could not be fetched
audio = download.content

# Send directly to VXID — MP3 works fine
resp = requests.post(
    "https://api.vxid.dev/v1/subjects/verify",
    headers={"X-API-Key": "vx_live_YOUR_KEY"},
    files={"audio": ("recording.mp3", audio, "audio/mpeg")},
    data={"subject_id": "SUBJ-4821"},
)

Key principle: VXID converts everything server-side. Your integration code never needs to install FFmpeg, handle audio codecs, or worry about sample rates. Capture audio in whatever format your platform provides, send it to VXID, and receive the verdict.

Enroll a subject

Enroll a subject's voiceprint. Upload an audio sample (minimum 10 seconds) and your reference ID. Enrollment is free — you're only billed for verifications.

Two endpoint variants — use whichever suits your platform:

Option A: File upload (multipart/form-data)

POST /v1/subjects/enroll

Best for: server-side integrations, CLI tools, file-based workflows.

Field          Type          Required   Description
audio          file          yes        Any audio format: WAV, MP3, OGG, WebM, Opus, M4A, AAC, FLAC, AMR, CAF. Minimum 10 seconds.
subject_id     string        yes        Your unique reference ID. Max 255 chars.
subject_name   string        no         Display name for dashboards and reports.
context        JSON string   no         Industry-specific metadata. Stored and returned, never interpreted by VXID.

Option B: JSON with base64 audio

POST /v1/subjects/enroll/json

Best for: browsers (MediaRecorder), mobile apps, real-time systems.

{
  "audio_base64": "GkXfo59ChoEBQveBAo...",
  "audio_format": "webm",
  "subject_id": "SUBJ-4821",
  "subject_name": "Jane Doe",
  "context": {"department": "engineering"}
}

Field          Type     Required   Description
audio_base64   string   yes        Base64-encoded audio bytes.
audio_format   string   yes        Format hint: wav, webm, opus, ogg, m4a, aac, mp3, flac, amr
subject_id     string   yes        Your unique reference ID.
subject_name   string   no         Display name.
context        object   no         Industry-specific metadata.

You don't need to convert audio. VXID accepts any format and converts to 16 kHz mono internally via FFmpeg. Send whatever your platform produces natively (WebM from browsers, M4A from iOS, AMR from telephony) and VXID handles the rest.
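
The JSON variant can be assembled from raw audio bytes with the standard library alone. This is a minimal sketch: build_enroll_payload is a hypothetical helper, and rejecting format hints outside the list above is a choice made for this example rather than documented server behavior.

```python
import base64
import json

# Format hints listed for the audio_format field.
FORMAT_HINTS = {"wav", "webm", "opus", "ogg", "m4a", "aac", "mp3", "flac", "amr"}


def build_enroll_payload(audio_bytes, audio_format, subject_id,
                         subject_name=None, context=None):
    """Return the JSON body for POST /v1/subjects/enroll/json as a string."""
    if audio_format not in FORMAT_HINTS:
        raise ValueError(f"unsupported audio_format: {audio_format!r}")
    payload = {
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "audio_format": audio_format,
        "subject_id": subject_id,
    }
    # Optional fields are omitted entirely when not provided.
    if subject_name is not None:
        payload["subject_name"] = subject_name
    if context is not None:
        payload["context"] = context
    return json.dumps(payload)
```

Base64 makes the body roughly a third larger than the raw audio, so the multipart endpoint is the lighter option when uploading large files from a server.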

Response (both endpoints)

{
  "enrollment_id": "a1b2c3d4-...",
  "subject_id": "SUBJ-4821",
  "audio_duration_sec": 15.2,
  "embedding_dim": 192,
  "created_at": "2026-05-15T10:30:00Z",
  "message": "Voiceprint enrolled successfully"
}

Verify a speaker

Verify live audio against an enrolled voiceprint. Returns a verdict with per-layer scores. Pass session_id to add checks to an existing session for multi-check verification.

Option A: File upload (multipart/form-data)

POST /v1/subjects/verify

Field        Type          Required   Description
audio        file          yes        Any audio format. Minimum 2 seconds.
subject_id   string        yes        The enrolled subject's reference ID.
session_id   string        no         Existing session ID for multi-check. Omit to create a new session.
context      JSON string   no         Metadata for this session.

Option B: JSON with base64 audio

POST /v1/subjects/verify/json

{
  "audio_base64": "GkXfo59ChoEBQveBAo...",
  "audio_format": "webm",
  "subject_id": "SUBJ-4821",
  "session_id": "sess_7f14...",
  "context": {"call_type": "interview"}
}

Response

See the quick start response above for the full structure. Key fields:

Field             Description
check.verdict     PASS, FAIL, or REVIEW for this individual check
check.layers[]    Per-layer scores: identity, authenticity, liveness
session_verdict   Overall session verdict (FAIL if any check fails)
session_id        Use this to add subsequent checks to the same session
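
To act on a verification result, an integration usually needs the verdict plus the layers that did not pass. The sketch below reads only the fields documented above. The action names (allow, escalate, deny) are examples for illustration, not VXID terminology.

```python
def summarize_check(response):
    """Return (verdict, failed_layers) from a verify response dict."""
    check = response["check"]
    failed = [layer["layer"] for layer in check["layers"] if not layer["passed"]]
    return check["verdict"], failed


def decide(response):
    """Map a check verdict to an application action (example names)."""
    verdict, _failed = summarize_check(response)
    if verdict == "PASS":
        return "allow"
    if verdict == "REVIEW":
        return "escalate"
    return "deny"  # FAIL, or any verdict this code does not recognise
```

Treating an unrecognised verdict as a denial keeps the integration fail-closed if new verdict values are ever introduced.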

List subjects

GET /v1/subjects?page=1&per_page=20&search=jane

Paginated list of enrolled subjects. Searchable by subject_id or subject_name.
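
Search terms should be URL-encoded before they go into the query string. A small sketch using only the three query parameters shown above; subjects_url is an illustrative helper name, and the response shape is not assumed.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.vxid.dev/v1"


def subjects_url(page=1, per_page=20, search=None):
    """Build the URL for GET /v1/subjects with the documented query parameters."""
    params = {"page": page, "per_page": per_page}
    if search:
        params["search"] = search  # urlencode escapes spaces and symbols
    return f"{BASE_URL}/subjects?{urlencode(params)}"
```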

Get subject detail

GET /v1/subjects/SUBJ-4821

Returns enrollment details, session count, and last verification timestamp.

Delete subject

DELETE /v1/subjects/SUBJ-4821

Soft-deletes the voiceprint and zeros out the embedding vector, which supports GDPR/DPDPA right-to-delete compliance. This action is irreversible: to verify the subject again, you must re-enroll them.

Multi-check sessions

A session groups multiple verification checks into one logical interaction. For example, a 30-minute exam verified every 5 minutes creates 1 session with 6 checks.

How it works

1. First check: Call /v1/subjects/verify without session_id. A new session is created and the session_id is returned.
2. Subsequent checks: Pass the session_id from the first response. Each check is added to the same session. Scores, counters, and verdict are updated.
3. Session verdict: FAIL if any check fails. REVIEW if any check is borderline. PASS if all checks pass.
4. Complete the session: Call /v1/sessions/{id}/complete when done. Returns billable minutes.

You control how often to verify. VXID doesn't prescribe intervals — send audio whenever you decide. You're billed only for the audio you send.
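
VXID returns session_verdict itself, so clients do not need to compute it. The sketch below only mirrors the stated rule, which can be handy for unit tests or for reasoning about a session offline. Letting FAIL take precedence over REVIEW is an assumption; it is consistent with "FAIL if any check fails" but the tie-break is not spelled out.

```python
def session_verdict(check_verdicts):
    """Combine per-check verdicts into a session verdict.

    FAIL if any check fails, REVIEW if any check is borderline,
    PASS only if every check passes. FAIL outranking REVIEW is assumed.
    """
    if not check_verdicts:
        raise ValueError("a session needs at least one check")
    if "FAIL" in check_verdicts:
        return "FAIL"
    if "REVIEW" in check_verdicts:
        return "REVIEW"
    return "PASS"
```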

List sessions

GET /v1/sessions?subject_id=SUBJ-4821&verdict=FAIL&status=completed

Filter by subject_id, verdict (PASS/FAIL/REVIEW), or status (active/completed).

Get session detail

GET /v1/sessions/sess_7f14a...

Returns the full session including all individual checks with per-check scores and timestamps.

Complete a session

POST /v1/sessions/sess_7f14a.../complete

Marks the session as completed. No more checks can be added. Returns final verdict and billable minutes.

{
  "session_id": "sess_7f14a...",
  "status": "completed",
  "verdict": "PASS",
  "checks_count": 6,
  "total_audio_sec": 60.0,
  "billable_minutes": 1.0
}

How billing works

VXID bills by audio processed, not by session time or API calls.

What                 Price
Enrollment           Free
Verification audio   $0.02 per minute of audio processed
Free tier            100 minutes/month

You send a 10-second audio chunk → billed for 0.167 minutes → $0.0033. Your monthly bill = total audio minutes × $0.02. You control the total by choosing your verification interval.
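
The arithmetic above can be reproduced in a few lines. This sketch assumes cost is strictly proportional to audio seconds; whether VXID rounds per check or per billing period, and how the free tier is applied, is not stated here, so both are left out.

```python
PRICE_PER_MINUTE_USD = 0.02  # verification audio price from the table above


def billable_minutes(audio_seconds):
    """Convert seconds of verification audio to minutes."""
    return audio_seconds / 60.0


def cost_usd(audio_seconds):
    """Cost of the given verification audio, before any free tier."""
    return billable_minutes(audio_seconds) * PRICE_PER_MINUTE_USD
```

For a 10-second chunk, billable_minutes(10) rounds to 0.167 and cost_usd(10) rounds to $0.0033, matching the example in the text.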

Get usage

GET /v1/usage?period=2026-05

Returns audio seconds, billable minutes, estimated cost, and event counts for the billing period.

{
  "billing_period": "2026-05",
  "total_audio_seconds": 3600.0,
  "total_audio_minutes": 60.0,
  "billable_minutes": 60.0,
  "enrollments": 45,
  "verifications": 312,
  "estimated_cost_cents": 120
}

Choosing verification frequency

You decide how often to verify based on your risk tolerance. VXID is stateless — each check is independent. The table below is guidance, not prescription.

Risk level   Interval          Example use cases                       30-min session cost
Critical     Every 10–30 sec   Financial transactions, government ID   $0.60
High         Every 1–2 min     Interviews, medical consultations       $0.30–0.60
Standard     Every 5 min       Online exams, certifications            $0.06
Periodic     Every 10–15 min   Long shift monitoring, extended calls   $0.03–0.04

Pro tip: Verify more frequently at the start of a session (when proxy swaps are most likely) and reduce frequency over time. For a 2-hour exam: every 30 seconds for the first 5 minutes, then every 5 minutes for the rest. Total cost: ~$0.22 instead of $2.40 for continuous monitoring.
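
Session cost depends on both the interval and the length of each audio chunk, and the chunk length behind the figures above is not stated. The sketch below estimates cost for a phased schedule. The 20-second chunk used in the example is an assumption that happens to reproduce the ~$0.22 figure for the 2-hour exam; the function names are illustrative.

```python
PRICE_PER_MINUTE_USD = 0.02  # verification audio price


def count_checks(phases):
    """Count checks for a list of (phase_length_sec, interval_sec) phases."""
    return sum(length // interval for length, interval in phases)


def schedule_cost_usd(phases, chunk_sec):
    """Estimate cost when every check sends chunk_sec seconds of audio."""
    audio_sec = count_checks(phases) * chunk_sec
    return audio_sec / 60.0 * PRICE_PER_MINUTE_USD


# Two-hour exam: every 30 s for the first 5 minutes, then every 5 minutes.
two_hour_exam = [(5 * 60, 30), (115 * 60, 5 * 60)]

# Continuous monitoring of the same exam: back-to-back 20 s chunks.
continuous = [(120 * 60, 20)]
```

With 20-second chunks, two_hour_exam comes to 33 checks (11 minutes of audio, about $0.22), while continuous covers all 120 minutes (about $2.40). Changing chunk_sec scales both figures proportionally.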

Questions? Reach us at [email protected]

© 2026 NeuralWeaves Technology Pvt Ltd