Quick start

Enroll a voiceprint and verify a speaker in under 5 minutes.

1. Get your API key

Sign up at the dashboard with Google or email — your API keys are generated instantly. Keys are prefixed vx_live_ for production and vx_test_ for testing. You can create additional keys and revoke existing ones from the dashboard at any time.

2. Enroll a voiceprint

Send a voice sample (minimum 10 seconds; WAV, MP3, OGG, or any other supported format) with a subject ID you define.

curl https://api.vxid.dev/v1/subjects/enroll \
  -H "X-API-Key: vx_live_YOUR_KEY" \
  -F audio=@voice_sample.wav \
  -F subject_id="SUBJ-4821" \
  -F subject_name="Jane Doe"

import requests

resp = requests.post(
    "https://api.vxid.dev/v1/subjects/enroll",
    headers={"X-API-Key": "vx_live_YOUR_KEY"},
    files={"audio": open("voice_sample.wav", "rb")},
    data={
        "subject_id": "SUBJ-4821",
        "subject_name": "Jane Doe",
    },
)
print(resp.json())

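import { readFile } from "node:fs/promises";

// Node 18+: the built-in FormData takes a Blob, not a read stream
const form = new FormData();
form.append("audio", new Blob([await readFile("voice_sample.wav")]), "voice_sample.wav");
form.append("subject_id", "SUBJ-4821");
form.append("subject_name", "Jane Doe");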

const resp = await fetch("https://api.vxid.dev/v1/subjects/enroll", {
  method: "POST",
  headers: { "X-API-Key": "vx_live_YOUR_KEY" },
  body: form,
});
console.log(await resp.json());

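// Requires imports: bytes, io, log, mime/multipart, net/http, os
audioFile, err := os.Open("voice_sample.wav")
if err != nil {
    log.Fatal(err)
}
defer audioFile.Close()

body := &bytes.Buffer{}
w := multipart.NewWriter(body)
part, _ := w.CreateFormFile("audio", "voice_sample.wav")
io.Copy(part, audioFile)
w.WriteField("subject_id", "SUBJ-4821")
w.WriteField("subject_name", "Jane Doe")
w.Close() // writes the closing boundary; call it before sending

req, _ := http.NewRequest("POST",
    "https://api.vxid.dev/v1/subjects/enroll", body)
req.Header.Set("X-API-Key", "vx_live_YOUR_KEY")
req.Header.Set("Content-Type", w.FormDataContentType())
resp, err := http.DefaultClient.Do(req)
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()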

3. Verify the speaker

Send live audio with the same subject ID. VXID returns a verdict: PASS, FAIL, or REVIEW.

curl https://api.vxid.dev/v1/subjects/verify \
  -H "X-API-Key: vx_live_YOUR_KEY" \
  -F audio=@live_audio.wav \
  -F subject_id="SUBJ-4821"

import requests

resp = requests.post(
    "https://api.vxid.dev/v1/subjects/verify",
    headers={"X-API-Key": "vx_live_YOUR_KEY"},
    files={"audio": open("live_audio.wav", "rb")},
    data={"subject_id": "SUBJ-4821"},
)
verdict = resp.json()["check"]["verdict"]
print(f"Verdict: {verdict}")

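import { readFile } from "node:fs/promises";

// Node 18+: the built-in FormData takes a Blob, not a read stream
const form = new FormData();
form.append("audio", new Blob([await readFile("live_audio.wav")]), "live_audio.wav");
form.append("subject_id", "SUBJ-4821");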

const resp = await fetch("https://api.vxid.dev/v1/subjects/verify", {
  method: "POST",
  headers: { "X-API-Key": "vx_live_YOUR_KEY" },
  body: form,
});
const { check } = await resp.json();
console.log(`Verdict: ${check.verdict}`);

// Same multipart pattern as enroll
// POST to /v1/subjects/verify with audio + subject_id

4. Check the response

{
  "session_id": "sess_7f14a...",
  "subject_id": "SUBJ-4821",
  "check": {
    "check_id": "chk_9b2e...",
    "check_number": 1,
    "verdict": "PASS",
    "combined_score": 0.8545,
    "layers": [
      { "layer": "identity", "score": 0.87, "threshold": 0.65, "passed": true },
      { "layer": "authenticity", "score": 0.96, "threshold": 0.5, "passed": true },
      { "layer": "liveness", "score": 0.91, "threshold": 0.5, "passed": true }
    ],
    "audio_duration_sec": 10.0
  },
  "session_verdict": "PASS",
  "session_checks_count": 1
}
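A minimal sketch of how a client might act on this response. It assumes the JSON has already been parsed into a dict with the fields shown above; the helper name `summarize_check` is invented for this example and is not part of the VXID API.

```python
def summarize_check(response: dict) -> dict:
    """Pull the verdict and the names of any failing layers out of a verify response."""
    check = response["check"]
    failed = [layer["layer"] for layer in check["layers"] if not layer["passed"]]
    return {
        "verdict": check["verdict"],
        "session_verdict": response["session_verdict"],
        "failed_layers": failed,
    }


# Trimmed copy of the sample response above
sample = {
    "check": {
        "verdict": "PASS",
        "layers": [
            {"layer": "identity", "score": 0.87, "threshold": 0.65, "passed": True},
            {"layer": "authenticity", "score": 0.96, "threshold": 0.5, "passed": True},
            {"layer": "liveness", "score": 0.91, "threshold": 0.5, "passed": True},
        ],
    },
    "session_verdict": "PASS",
}
print(summarize_check(sample))
```

Listing the failing layers alongside the verdict makes it easier to decide what to do with a REVIEW or FAIL result, for example prompting the user to record again.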

That's it. Two calls to verify any voice identity. Everything below is reference for building production integrations.

Authentication

Every API request requires an API key in the X-API-Key header.

X-API-Key: vx_live_k8xP2m9...

Getting your keys

Sign up at the dashboard with Google or email. Both a test and live key are generated automatically on signup. You can create additional keys, label them, and revoke them from the API Keys page in your dashboard.

Prefix	Environment	Billing
vx_live_	Production	Metered — all usage is billed
vx_test_	Testing	Free — not billed, same API behavior

Keys are scoped to an organization. All data (subjects, sessions, usage) is isolated per organization via row-level security. Full keys are shown only once at creation — store them securely.

Never expose API keys in client-side code, Git repositories, or logs. If a key is compromised, rotate it immediately from the dashboard or by contacting support.
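One way to keep keys out of source code is to read them from the environment at startup. The sketch below assumes an environment variable named `VXID_API_KEY` (a hypothetical name chosen for this example) and uses the documented prefixes to tell which environment a key targets. The `mask` helper is also illustrative.

```python
import os
from typing import Mapping, Tuple


def load_api_key(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (key, environment) from VXID_API_KEY (hypothetical variable name)."""
    key = env.get("VXID_API_KEY", "")
    if key.startswith("vx_live_"):
        return key, "production"
    if key.startswith("vx_test_"):
        return key, "testing"
    raise ValueError("VXID_API_KEY is missing or lacks a vx_live_/vx_test_ prefix")


def mask(key: str) -> str:
    """Shorten a key for log output so the full value never reaches logs."""
    return key[:11] + "..."
```

Failing at startup when the key is missing or malformed is a design choice here: it surfaces configuration mistakes before the first request is sent.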

Errors & rate limits

HTTP status codes

Code	Meaning
200	Success
400	Bad request — invalid audio, missing fields, audio too short
401	Unauthorized — missing or invalid API key
404	Not found — subject or session doesn't exist
409	Conflict — subject already enrolled (delete first to re-enroll)
429	Rate limited — too many requests
500	Server error — retry or contact support

Error response format

{
  "error": "Audio too short",
  "detail": "Audio duration 3.2s is below the 10.0s minimum for enrollment.",
  "code": "AUDIO_TOO_SHORT"
}
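A small sketch of turning that error body into an exception. It assumes every non-200 response carries the three fields shown above; the `VXIDError` class and `raise_for_error` helper are names invented for this example.

```python
import json


class VXIDError(Exception):
    """Carries the HTTP status plus the fields of the error body."""

    def __init__(self, status: int, body: dict):
        self.status = status
        self.code = body.get("code")
        self.detail = body.get("detail")
        super().__init__(f"{status} {self.code}: {body.get('error')}")

    @property
    def retryable(self) -> bool:
        # Treating 429 and 500 as retryable is this example's choice
        return self.status in (429, 500)


def raise_for_error(status: int, raw_body: str) -> None:
    """Raise VXIDError for any status other than 200."""
    if status != 200:
        raise VXIDError(status, json.loads(raw_body))
```

Branching on the machine-readable `code` field rather than the human-readable `error` text keeps client logic stable if the wording changes.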

Rate limits

Default: 60 requests/minute per API key. Enterprise plans get custom limits. Rate limit headers are included in every response:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 58
X-RateLimit-Reset: 1716825600
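These headers can drive a simple retry loop. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds (the sample value looks like one, but this page does not say so). The `send` callable and both function names are placeholders for whatever HTTP client you use.

```python
import time
from typing import Optional


def seconds_until_reset(headers: dict, now: Optional[float] = None) -> float:
    """How long to wait before retrying, assuming X-RateLimit-Reset is epoch seconds."""
    now = time.time() if now is None else now
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)


def call_with_retry(send, max_attempts: int = 3, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is any zero-argument callable returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        sleep(seconds_until_reset(headers))
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.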

Supported audio formats

VXID accepts any audio format that FFmpeg can decode. You do not need to pre-convert audio — send whatever your platform produces natively.

Format	Extension	Typical source
WAV (PCM)	.wav	Desktop recording software, Python scripts
WebM / Opus	.webm, .opus	Browser MediaRecorder API (Chrome, Firefox, Edge)
MP3	.mp3	General purpose, widely supported
OGG Vorbis	.ogg	Firefox MediaRecorder, open-source tools
M4A / AAC	.m4a, .aac	iOS devices, Safari, Apple ecosystem
FLAC	.flac	Lossless recording tools
AMR	.amr	Telephony systems, Android voice recorder
CAF	.caf	macOS / iOS Core Audio
Raw PCM	.raw	Direct microphone capture, WebRTC streams

Recommended: 16 kHz sample rate, mono channel, at least 10 seconds for enrollment and 2 seconds for verification. Higher-quality audio produces more accurate voiceprints, but VXID handles resampling and channel conversion automatically.
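To avoid an upload that will be rejected as too short, you can check the duration locally first. The sketch below covers PCM WAV only, using Python's standard `wave` module; other formats would need a decoder. The function names are illustrative, and `silent_wav` exists only to exercise the check.

```python
import io
import wave


def wav_duration_seconds(data: bytes) -> float:
    """Duration of a PCM WAV file held in memory."""
    with wave.open(io.BytesIO(data), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def long_enough(data: bytes, purpose: str) -> bool:
    """Compare against the documented minimums: 10 s to enroll, 2 s to verify."""
    minimum = {"enroll": 10.0, "verify": 2.0}[purpose]
    return wav_duration_seconds(data) >= minimum


def silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Build a silent 16-bit mono WAV, used here only to exercise the check."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()
```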

Audio capture guide

Platform-specific examples for capturing audio and sending it to VXID.

Browser (JavaScript)

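Use the MediaRecorder API to capture from the microphone. Chrome, Firefox, and Edge can record WebM/Opus, which you can send directly to the JSON endpoint. Safari may record MP4/AAC instead, so check MediaRecorder.isTypeSupported() before hard-coding the mimeType.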

// Request mic access and record 15 seconds
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm;codecs=opus' });
const chunks = [];

recorder.ondataavailable = (e) => chunks.push(e.data);
recorder.onstop = async () => {
  stream.getTracks().forEach(t => t.stop());

  // Convert to base64
  const blob = new Blob(chunks, { type: 'audio/webm' });
  const base64 = await new Promise(resolve => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(blob);
  });

  // Send to VXID
  const resp = await fetch('https://api.vxid.dev/v1/subjects/enroll/json', {
    method: 'POST',
    headers: {
      'X-API-Key': 'vx_live_YOUR_KEY',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      audio_base64: base64,
      audio_format: 'webm',
      subject_id: 'SUBJ-4821',
    }),
  });
  console.log(await resp.json());
};

recorder.start();
setTimeout(() => recorder.stop(), 15000); // 15 seconds

Python

Use sounddevice for mic capture, or send any audio file directly.

import sounddevice as sd
import scipy.io.wavfile as wav
import requests

# Record 15 seconds from microphone
fs = 16000
duration = 15
print("Speak now...")
audio = sd.rec(int(duration * fs), samplerate=fs, channels=1, dtype='int16')
sd.wait()
wav.write("sample.wav", fs, audio)

# Enroll via file upload
resp = requests.post(
    "https://api.vxid.dev/v1/subjects/enroll",
    headers={"X-API-Key": "vx_live_YOUR_KEY"},
    files={"audio": open("sample.wav", "rb")},
    data={"subject_id": "SUBJ-4821"},
)
print(resp.json())

# Or send any existing audio file (MP3, M4A, OGG — anything)
resp = requests.post(
    "https://api.vxid.dev/v1/subjects/verify",
    headers={"X-API-Key": "vx_live_YOUR_KEY"},
    files={"audio": open("call_recording.mp3", "rb")},
    data={"subject_id": "SUBJ-4821"},
)

Node.js

Use node-record-lpcm16 for mic capture, or send files with FormData.

import fs from 'fs';

// Send any audio file; the format doesn't matter.
// Node 18+: the built-in FormData takes a Blob, not a read stream.
const form = new FormData();
form.append('audio', new Blob([fs.readFileSync('recording.webm')]), 'recording.webm');
form.append('subject_id', 'SUBJ-4821');

const resp = await fetch('https://api.vxid.dev/v1/subjects/enroll', {
  method: 'POST',
  headers: { 'X-API-Key': 'vx_live_YOUR_KEY' },
  body: form,
});

// Or use base64 for browser-captured audio
const audioBuffer = fs.readFileSync('recording.webm');
const resp2 = await fetch('https://api.vxid.dev/v1/subjects/enroll/json', {
  method: 'POST',
  headers: {
    'X-API-Key': 'vx_live_YOUR_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    audio_base64: audioBuffer.toString('base64'),
    audio_format: 'webm',
    subject_id: 'SUBJ-4821',
  }),
});

Mobile (React Native / Expo)

iOS produces M4A/AAC; Android produces OGG or AMR. Send either and VXID handles both.

import { Audio } from 'expo-av';
import * as FileSystem from 'expo-file-system';

// Record audio
const recording = new Audio.Recording();
await recording.prepareToRecordAsync(
  Audio.RecordingOptionsPresets.HIGH_QUALITY
);
await recording.startAsync();

// ... wait for recording ...

await recording.stopAndUnloadAsync();
const uri = recording.getURI(); // file:///...recording.m4a

// Read as base64 and send
const base64 = await FileSystem.readAsStringAsync(uri, {
  encoding: FileSystem.EncodingType.Base64,
});

const resp = await fetch('https://api.vxid.dev/v1/subjects/enroll/json', {
  method: 'POST',
  headers: {
    'X-API-Key': 'vx_live_YOUR_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    audio_base64: base64,
    audio_format: 'm4a',  // must match the recorded file; check the extension of uri
    subject_id: 'SUBJ-4821',
  }),
});

Telephony (Twilio / Exotel)

Call recording APIs provide audio URLs. Download and forward to VXID.

# Download Twilio recording and send to VXID
import requests

# Twilio provides recording URL
recording_url = "https://api.twilio.com/2010-04-01/Accounts/.../Recordings/RE.../Recording.mp3"
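download = requests.get(recording_url, timeout=30)
download.raise_for_status()  # fail early instead of forwarding an error page as audio
audio = download.content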

# Send directly to VXID — MP3 works fine
resp = requests.post(
    "https://api.vxid.dev/v1/subjects/verify",
    headers={"X-API-Key": "vx_live_YOUR_KEY"},
    files={"audio": ("recording.mp3", audio, "audio/mpeg")},
    data={"subject_id": "SUBJ-4821"},
)

Key principle: VXID converts everything server-side. Your integration code never needs to install FFmpeg, handle audio codecs, or worry about sample rates. Capture audio in whatever format your platform provides, send it to VXID, and receive the verdict.

Enroll a subject

Enroll a subject's voiceprint. Upload an audio sample (minimum 10 seconds) and your reference ID. Enrollment is free — you're only billed for verifications.

Two endpoint variants — use whichever suits your platform:

Option A: File upload (multipart/form-data)

POST /v1/subjects/enroll

Best for: server-side integrations, CLI tools, file-based workflows.

Field	Type	Description
audio	file required	Any audio format: WAV, MP3, OGG, WebM, Opus, M4A, AAC, FLAC, AMR, CAF. Minimum 10 seconds.
subject_id	string required	Your unique reference ID. Max 255 chars.
subject_name	string optional	Display name for dashboards and reports.
context	JSON string optional	Industry-specific metadata. Stored and returned, never interpreted by VXID.
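Because multipart fields are plain strings, the `context` metadata has to be serialized before it is sent. The sketch below builds the non-file fields for this endpoint; the helper name `enroll_form_fields` is invented for this example, and the length check simply mirrors the 255-character limit in the table.

```python
import json


def enroll_form_fields(subject_id, subject_name=None, context=None):
    """Build the non-file form fields for POST /v1/subjects/enroll."""
    if not subject_id or len(subject_id) > 255:
        raise ValueError("subject_id is required and limited to 255 characters")
    fields = {"subject_id": subject_id}
    if subject_name is not None:
        fields["subject_name"] = subject_name
    if context is not None:
        # Multipart fields are plain strings, so the metadata is serialized to JSON
        fields["context"] = json.dumps(context)
    return fields


fields = enroll_form_fields("SUBJ-4821", "Jane Doe", {"department": "engineering"})
print(fields["context"])
```

The resulting dict can be passed as the form data next to the audio file, in the same way the quick start examples pass `subject_id` and `subject_name`.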

Option B: JSON with base64 audio

POST /v1/subjects/enroll/json

Best for: browsers (MediaRecorder), mobile apps, real-time systems.

{
  "audio_base64": "GkXfo59ChoEBQveBAo...",
  "audio_format": "webm",
  "subject_id": "SUBJ-4821",
  "subject_name": "Jane Doe",
  "context": {"department": "engineering"}
}

Field	Type	Description
audio_base64	string required	Base64-encoded audio bytes.
audio_format	string required	Format hint: `wav`, `webm`, `opus`, `ogg`, `m4a`, `aac`, `mp3`, `flac`, `amr`
subject_id	string required	Your unique reference ID.
subject_name	string optional	Display name.
context	object optional	Industry-specific metadata.

You don't need to convert audio. VXID accepts any format and converts to 16 kHz mono internally via FFmpeg. Send whatever your platform produces natively (WebM from browsers, M4A from iOS, AMR from telephony) and VXID handles the rest.

Response (both endpoints)

{
  "enrollment_id": "a1b2c3d4-...",
  "subject_id": "SUBJ-4821",
  "audio_duration_sec": 15.2,
  "embedding_dim": 192,
  "created_at": "2026-05-15T10:30:00Z",
  "message": "Voiceprint enrolled successfully"
}

Verify a speaker

Verify live audio against an enrolled voiceprint. Returns a verdict with per-layer scores. Pass session_id to add checks to an existing session for multi-check verification.

Option A: File upload (multipart/form-data)

POST /v1/subjects/verify

Field	Type	Description
audio	file required	Any audio format. Minimum 2 seconds.
subject_id	string required	The enrolled subject's reference ID.
session_id	string optional	Existing session ID for multi-check. Omit to create a new session.
context	JSON string optional	Metadata for this session.

Option B: JSON with base64 audio

POST /v1/subjects/verify/json

{
  "audio_base64": "GkXfo59ChoEBQveBAo...",
  "audio_format": "webm",
  "subject_id": "SUBJ-4821",
  "session_id": "sess_7f14...",
  "context": {"call_type": "interview"}
}

Response

See the quick start response above for the full structure. Key fields:

Field	Description
check.verdict	`PASS`, `FAIL`, or `REVIEW` for this individual check
check.layers[]	Per-layer scores: identity, authenticity, liveness
session_verdict	Overall session verdict (FAIL if any check fails)
session_id	Use this to add subsequent checks to the same session

List subjects

GET /v1/subjects?page=1&per_page=20&search=jane

Paginated list of enrolled subjects. Searchable by subject_id or subject_name.
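The response shape for this endpoint is not shown here, so the sketch below only builds request URLs from the three documented query parameters. The host comes from the earlier examples; the `subjects_url` helper is a name invented for this example.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.vxid.dev"


def subjects_url(page: int = 1, per_page: int = 20, search: Optional[str] = None) -> str:
    """Build the list-subjects URL; search is omitted when not given."""
    params = {"page": page, "per_page": per_page}
    if search:
        params["search"] = search  # urlencode escapes spaces and symbols
    return f"{BASE_URL}/v1/subjects?{urlencode(params)}"


print(subjects_url(search="jane"))
print(subjects_url(page=2, search="Jane Doe"))
```

Building the query string with `urlencode` rather than string concatenation keeps names containing spaces or special characters safe to search for.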

Get subject detail

GET /v1/subjects/SUBJ-4821

Returns enrollment details, session count, and last verification timestamp.

Delete subject

DELETE /v1/subjects/SUBJ-4821

Soft-deletes the voiceprint and zeros out the embedding vector, which supports GDPR/DPDPA right-to-delete requests. This action is irreversible: to verify the subject again, you must re-enroll them.

Multi-check sessions

A session groups multiple verification checks into one logical interaction. For example, a 30-minute exam verified every 5 minutes creates 1 session with 6 checks.

How it works

First check: Call /v1/subjects/verify without session_id. A new session is created and the session_id is returned.

Subsequent checks: Pass the session_id from the first response. Each check is added to the same session. Scores, counters, and verdict are updated.

Session verdict: FAIL if any check fails. REVIEW if any check is borderline. PASS if all checks pass.

Complete the session: Call /v1/sessions/{id}/complete when done. Returns billable minutes.
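The flow above can be sketched as a loop that reuses the `session_id` from the first response. The `verify` argument stands in for whatever function performs the HTTP call and returns the parsed response. The local `session_verdict` function only mirrors the rule described above; the `session_verdict` field returned by VXID remains the authoritative value.

```python
def session_verdict(check_verdicts):
    """Combine per-check verdicts using the rule described above."""
    if "FAIL" in check_verdicts:
        return "FAIL"
    if "REVIEW" in check_verdicts:
        return "REVIEW"
    return "PASS"


def run_session(verify, subject_id, audio_chunks):
    """Send each chunk, reusing the session_id returned by the first check.

    verify is any callable (subject_id, audio, session_id) -> parsed response dict.
    """
    session_id = None  # omitted on the first call so a new session is created
    verdicts = []
    for audio in audio_chunks:
        response = verify(subject_id, audio, session_id)
        session_id = response["session_id"]
        verdicts.append(response["check"]["verdict"])
    return session_id, session_verdict(verdicts)
```

Once the loop finishes, the returned `session_id` is the value to use when completing the session.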

You control how often to verify. VXID doesn't prescribe intervals — send audio whenever you decide. You're billed only for the audio you send.

List sessions

GET /v1/sessions?subject_id=SUBJ-4821&verdict=FAIL&status=completed

Filter by subject_id, verdict (PASS/FAIL/REVIEW), or status (active/completed).

Get session detail

GET /v1/sessions/sess_7f14a...

Returns the full session including all individual checks with per-check scores and timestamps.

Complete a session

POST /v1/sessions/sess_7f14a.../complete

Marks the session as completed. No more checks can be added. Returns final verdict and billable minutes.

{
  "session_id": "sess_7f14a...",
  "status": "completed",
  "verdict": "PASS",
  "checks_count": 6,
  "total_audio_sec": 60.0,
  "billable_minutes": 1.0
}

How billing works

VXID bills by audio processed, not by session time or API calls.

What	Price
Enrollment	Free
Verification audio	$0.02 per minute of audio processed
Free tier	100 minutes/month

You send a 10-second audio chunk → billed for 0.167 minutes → $0.0033. Your monthly bill = total audio minutes × $0.02. You control the total by choosing your verification interval.
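The arithmetic above, written out as a short sketch. It uses only the $0.02 per minute price from the table. How the 100 free minutes are applied is not described here, so the sketch ignores the free tier.

```python
PRICE_PER_MINUTE = 0.02  # USD per minute of verification audio, from the table above


def billable_minutes(audio_seconds: float) -> float:
    """Convert seconds of audio sent for verification into billable minutes."""
    return audio_seconds / 60.0


def cost_usd(audio_seconds: float) -> float:
    """Cost of the given amount of verification audio, before any free-tier allowance."""
    return billable_minutes(audio_seconds) * PRICE_PER_MINUTE


print(f"{billable_minutes(10):.3f} min -> ${cost_usd(10):.4f}")      # one 10-second chunk
print(f"{billable_minutes(3600):.1f} min -> ${cost_usd(3600):.2f}")  # one hour of audio
```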

Get usage

GET /v1/usage?period=2026-05

Returns audio seconds, billable minutes, estimated cost, and event counts for the billing period.

{
  "billing_period": "2026-05",
  "total_audio_seconds": 3600.0,
  "total_audio_minutes": 60.0,
  "billable_minutes": 60.0,
  "enrollments": 45,
  "verifications": 312,
  "estimated_cost_cents": 120
}

Choosing verification frequency

You decide how often to verify based on your risk tolerance. VXID is stateless — each check is independent. The table below is guidance, not prescription.

Risk level	Interval	Example use cases	30-min session cost
Critical	Every 10–30 sec	Financial transactions, government ID	$0.60
High	Every 1–2 min	Interviews, medical consultations	$0.30–0.60
Standard	Every 5 min	Online exams, certifications	$0.06
Periodic	Every 10–15 min	Long shift monitoring, extended calls	$0.03–0.04

Pro tip: Verify more frequently at the start of a session (when proxy swaps are most likely) and reduce frequency over time. For a 2-hour exam: every 30 seconds for the first 5 minutes, then every 5 minutes for the rest. Total cost: ~$0.22 instead of $2.40 for continuous monitoring.
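The cost of a schedule depends on how many checks it sends and how much audio each check carries. This page does not state the audio length per check, so the sketch below takes it as a parameter (`chunk_sec`). The function names and the list-of-phases representation are choices made for this example.

```python
PRICE_PER_MINUTE = 0.02  # USD per minute of audio, from the billing section


def checks_in_window(window_sec: int, interval_sec: int) -> int:
    """Number of checks when one check is sent every interval_sec within the window."""
    return window_sec // interval_sec


def schedule_cost(phases, chunk_sec: float) -> float:
    """phases is a list of (window_sec, interval_sec); chunk_sec is audio per check."""
    checks = sum(checks_in_window(w, i) for w, i in phases)
    return checks * chunk_sec / 60.0 * PRICE_PER_MINUTE


# The 2-hour exam from the tip: 30 s intervals for 5 minutes, then 5-minute intervals
tiered = [(5 * 60, 30), (115 * 60, 5 * 60)]
for chunk_sec in (10, 20, 30):  # assumed chunk lengths
    print(chunk_sec, round(schedule_cost(tiered, chunk_sec), 2))
```

Under these assumptions the tiered schedule sends 33 checks. With 20-second chunks that comes to about $0.22, and two hours of continuous audio comes to $2.40.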

Questions? Reach us at [email protected]