Voice identity infrastructure

Verify voice.
Match person.

Two API calls. Enroll a voiceprint. Verify the speaker. We tell you if it's the same person. What you build with it is up to you.

POST /v1/subjects/verify

// Response
{
  "verdict": "PASS",
  "confidence": 0.91,
  "layers": {
    "identity": 0.87,
    "authenticity": 0.96,
    "liveness": 0.91
  }
}

PASS in 340ms
The entire API

Two calls. That's it.

Send us voice. We tell you if it matches. No SDKs to install, no models to train, no infrastructure to manage.

Enroll a voiceprint

cURL

# Enroll a voiceprint
curl https://api.vxid.dev/v1/subjects/enroll \
  -H "X-API-Key: vx_live_k8x..." \
  -F audio=@voice_sample.wav \
  -F subject_id="SUBJ-4821"

Python

import requests

response = requests.post(
    "https://api.vxid.dev/v1/subjects/enroll",
    headers={"X-API-Key": "vx_live_k8x..."},
    files={"audio": open("voice_sample.wav", "rb")},
    data={"subject_id": "SUBJ-4821"},
)
print(response.json())

Node.js

import fs from "node:fs";

const form = new FormData();
form.append("audio", new Blob([fs.readFileSync("voice_sample.wav")]), "voice_sample.wav");
form.append("subject_id", "SUBJ-4821");

const res = await fetch("https://api.vxid.dev/v1/subjects/enroll", {
  method: "POST",
  headers: { "X-API-Key": "vx_live_k8x..." },
  body: form,
});
console.log(await res.json());

Go

audioFile, _ := os.Open("voice_sample.wav")
defer audioFile.Close()

body := &bytes.Buffer{}
writer := multipart.NewWriter(body)
part, _ := writer.CreateFormFile("audio", "voice_sample.wav")
io.Copy(part, audioFile)
writer.WriteField("subject_id", "SUBJ-4821")
writer.Close()

req, _ := http.NewRequest("POST", "https://api.vxid.dev/v1/subjects/enroll", body)
req.Header.Set("X-API-Key", "vx_live_k8x...")
req.Header.Set("Content-Type", writer.FormDataContentType())
resp, _ := http.DefaultClient.Do(req)

Verify the speaker

cURL

# Verify the speaker
curl https://api.vxid.dev/v1/subjects/verify \
  -H "X-API-Key: vx_live_k8x..." \
  -F audio=@live_audio.wav \
  -F subject_id="SUBJ-4821"

Python

import requests

response = requests.post(
    "https://api.vxid.dev/v1/subjects/verify",
    headers={"X-API-Key": "vx_live_k8x..."},
    files={"audio": open("live_audio.wav", "rb")},
    data={
        "subject_id": "SUBJ-4821",
        "session_id": "sess_7f14...",  # optional: multi-check
    },
)
verdict = response.json()["verdict"]

Node.js

import fs from "node:fs";

const form = new FormData();
form.append("audio", new Blob([fs.readFileSync("live_audio.wav")]), "live_audio.wav");
form.append("subject_id", "SUBJ-4821");
form.append("session_id", "sess_7f14..."); // optional

const res = await fetch("https://api.vxid.dev/v1/subjects/verify", {
  method: "POST",
  headers: { "X-API-Key": "vx_live_k8x..." },
  body: form,
});
const { verdict } = await res.json();

Go

audioFile, _ := os.Open("live_audio.wav")
defer audioFile.Close()

body := &bytes.Buffer{}
writer := multipart.NewWriter(body)
part, _ := writer.CreateFormFile("audio", "live_audio.wav")
io.Copy(part, audioFile)
writer.WriteField("subject_id", "SUBJ-4821")
writer.WriteField("session_id", "sess_7f14...") // optional
writer.Close()

req, _ := http.NewRequest("POST", "https://api.vxid.dev/v1/subjects/verify", body)
req.Header.Set("X-API-Key", "vx_live_k8x...")
req.Header.Set("Content-Type", writer.FormDataContentType())
resp, _ := http.DefaultClient.Do(req)

Response

// Response
{
  "verdict": "PASS",
  "score": 0.84,
  "layers": {
    "identity": 0.87,
    "authenticity": 0.96,
    "liveness": 0.91
  },
  "session_id": "sess_7f14..."
}
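Once that JSON comes back, gating a request on it takes a few lines. A minimal sketch in Python: the response shape follows the example above, but the pass/threshold policy is an illustrative assumption on the client side, not part of the API.

```python
# Sketch: act on a verify response of the shape shown above.
# The minimum-score policy is our own illustrative choice.

def handle_verification(result: dict, min_score: float = 0.80) -> bool:
    """Accept only if the API verdict is PASS and the overall score
    clears our own minimum. Assumes the response shape shown above."""
    if result["verdict"] != "PASS":
        return False
    if result["score"] < min_score:
        return False
    # Optionally inspect individual layers for logging or auditing.
    layers = result["layers"]
    print(f"identity={layers['identity']:.2f} "
          f"authenticity={layers['authenticity']:.2f} "
          f"liveness={layers['liveness']:.2f}")
    return True

example = {
    "verdict": "PASS",
    "score": 0.84,
    "layers": {"identity": 0.87, "authenticity": 0.96, "liveness": 0.91},
    "session_id": "sess_7f14...",
}
print(handle_verification(example))  # True
```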

Three layers of defense.

Every verification passes through three independent checks. Each layer catches what the others miss.

01

Identity

AI-powered voiceprint matching. Compares the live speaker against the enrolled voiceprint. Language-agnostic: works across accents and languages.

02

Authenticity

Deepfake and synthetic speech detection. Catches voice cloning, text-to-speech, replay attacks, and AI-generated audio. Detects what human ears can't.

03

Liveness

Behavioral voice analysis. Breathing patterns, micro-hesitations, response latency — natural speech characteristics that synthetic speech can't replicate.
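One way to picture how the three layers combine: each produces a score, and a request passes only if every layer clears its own bar. The sketch below is illustrative; the threshold values are assumptions for the example, not VXID's published internals.

```python
# Illustrative only: combine three per-layer scores into a verdict.
# Threshold values are assumptions, not VXID's actual configuration.

THRESHOLDS = {"identity": 0.80, "authenticity": 0.70, "liveness": 0.60}

def verdict(layers: dict) -> str:
    """PASS only if every layer clears its threshold. Each layer
    catches what the others miss, so one failure fails the check."""
    failed = [name for name, t in THRESHOLDS.items() if layers[name] < t]
    return "PASS" if not failed else f"FAIL ({', '.join(failed)})"

# A high-quality clone: matches the voiceprint (layer 1) but is
# caught by synthetic-speech detection (layer 2).
print(verdict({"identity": 0.85, "authenticity": 0.31, "liveness": 0.70}))
# -> FAIL (authenticity)

# A genuine live speaker.
print(verdict({"identity": 0.87, "authenticity": 0.96, "liveness": 0.91}))
# -> PASS
```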

Your product. Our verification.

VXID is the identity layer underneath. What you build on top is up to you.

Banking & KYC
Hiring & Screening
Proctoring & Exams
Telehealth & Patient ID
VXID API api.vxid.dev
Phone calls
Web apps
Mobile apps
IoT devices
The science

What we hear vs. what you hear.

Human ears judge voice by surface features — pitch, accent, rhythm. VXID measures the physics underneath: the shape of your vocal tract, the resonance of your throat, the structure of your glottal pulse.

What humans hear

Pitch — high or low voice
Accent — regional, national
Speaking speed and rhythm
Tone — warm, harsh, nasal
Vocabulary and catchphrases
Emotional expression
A mimicry artist controls all of these. That's why they fool humans.

What VXID measures

Formant frequencies — resonances of the vocal tract
Harmonic structure — overtone patterns from vocal cords
Spectral envelope — energy distribution across frequencies
Glottal pulse shape — how vocal cords open and close
192-dimensional embedding — a mathematical voice fingerprint
These are determined by anatomy. No talent can change your throat length.
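The 192-dimensional embedding is what actually gets compared at verification time. The standard way to compare such vectors is cosine similarity; here is a generic sketch. Only the dimension comes from the text above; the synthetic vectors and everything else are illustrative, not VXID internals.

```python
# Generic sketch of embedding comparison via cosine similarity.
# Only the 192 dimension comes from the text; the rest is illustrative.
import math
import random

DIM = 192  # embedding dimension quoted above

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embeddings:
    1.0 = same direction, near 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

random.seed(0)
enrolled = [random.gauss(0, 1) for _ in range(DIM)]
same_person = [x + random.gauss(0, 0.1) for x in enrolled]  # small drift
stranger = [random.gauss(0, 1) for _ in range(DIM)]

print(round(cosine_similarity(enrolled, same_person), 2))  # close to 1.0
print(round(cosine_similarity(enrolled, stranger), 2))     # close to 0.0
```

A real system compares the live embedding against the enrolled one and accepts above a tuned threshold, which is what the 0.85 figures in the next section refer to.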

Can it be fooled?

We get asked this a lot. Here's the honest answer for every scenario, backed by published research.

Brothers and siblings

Siblings share some genetic vocal characteristics, but vocal tracts develop differently through years of different usage, diet, and environment. The system distinguishes siblings even when humans hear a family resemblance.

Typical scores
Stranger: 0.22
Sibling: 0.42
Same person: 0.85 (match threshold)
Cannot fool VXID

Identical twins

The hardest case in biometrics. Identical twins share 100% DNA. Research shows impostor twins score 0.60–0.75 — close to threshold but still distinguishable, especially with longer audio samples (10+ seconds).

Typical scores
Twin (short sample): 0.68
Twin (10s+ sample): 0.58
Same person: 0.85
Edge case — longer samples help

Professional mimicry artist

A talented impressionist controls pitch, cadence, accent — everything humans perceive. But they cannot change their formant frequencies, determined by physical vocal tract anatomy. Research confirms even world-class impressionists cannot modify their formant loci.

Typical score
Mimicry: 0.35
Human judge: fooled

Humans are fooled. The AI is not.

Cannot fool VXID

Pitch shifter / voice changer

Simple pitch shifting moves the fundamental frequency but also shifts the formant structure unnaturally. The resulting embedding is different from both the original and the target.

Typical score
Pitch shifted: 0.28
Cannot fool VXID

AI voice cloning is different.

Unlike human mimicry, voice cloning software digitally reconstructs the spectral characteristics of a voice. A good clone can score 0.85+ on identity matching, fooling Layer 1 alone. This is why VXID has three layers, not one.

Layer 1 — Identity

Voiceprint match

A high-quality clone replicates the spectral characteristics that match the enrolled voiceprint.

Clone passes (score: 0.85)
Layer 2 — Authenticity

Synthetic detection

Detects flattened harmonics, unnatural spectral transitions, and the absence of the micro-variations present in all real speech.

Clone caught
Layer 3 — Liveness

Behavioral analysis

Real-time cloning introduces ~200ms latency, lacks natural breathing, and can't replicate cognitive-load variations.

Clone caught

Three hard problems stacked — match the voiceprint, avoid synthetic artifacts, replicate behavioral patterns. All within 500ms. That's the attacker's challenge.

Pricing

Pay per minute.

Billed per minute of audio processed. You control how often you verify. Enrollment is free. Start building in minutes.

Free
$0
100 minutes / month
  • 10 voiceprints
  • 10 verifications / day
  • Identity layer
  • Test API keys
  • Community support
Start building
Enterprise
Custom
volume pricing
  • Dedicated infrastructure
  • SLA guarantees
  • Cross-session fraud detection
  • On-premise deployment
Talk to us

Start verifying in minutes.

Get an API key. Send your first voice sample. Receive your first verdict. The rest is yours to build.

Get your API key

Built for builders.

VXID is voice identity infrastructure from NeuralWeaves Technologies Private Limited, a deep-tech AI research company based in Bengaluru, India. We build foundational AI systems that solve real problems at global scale.

VXID exists because every platform that touches voice — hiring, banking, exams, telehealth, government services — needs a way to verify that the person speaking is who they claim to be. We provide that verification as a simple API, and let you decide where to use it.

Our approach: do one thing, do it well. Enroll a voiceprint. Verify the speaker. Three layers of defense — identity matching, deepfake detection, behavioral liveness. Everything else is yours to build.

Our principles

Infrastructure, not application

We don't build interview tools, proctoring software, or banking apps. We provide the voice verification layer that all of them need.

Honest about limitations

We publish our accuracy data, explain edge cases (like identical twins), and tell you exactly what our three layers catch — and what they don't.

Privacy by design

Voiceprints are irreversible mathematical vectors — the original audio can't be reconstructed. Designed to support GDPR and DPDPA requirements. Right-to-delete built in.

One price, everywhere

$0.02/minute of audio processed. No regional pricing, no hidden fees, no sales calls required. Same API, same rate, whether you're in Bengaluru or Berlin.
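At a flat rate, monthly cost is just minutes of audio times $0.02. A quick sanity check, with made-up example volumes:

```python
RATE_PER_MINUTE = 0.02  # USD, flat rate stated above

def monthly_cost(verifications: int, avg_seconds_per_check: float) -> float:
    """Cost of a month of verifications at $0.02 per minute of audio."""
    minutes = verifications * avg_seconds_per_check / 60
    return minutes * RATE_PER_MINUTE

# e.g. 10,000 verifications a month at ~6 seconds of audio each:
print(f"${monthly_cost(10_000, 6):.2f}")  # $20.00
```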

Get in touch

Let's talk.

Whether you're evaluating VXID for your platform, need enterprise pricing, or just want to understand how voice identity verification works — we're here.

Company

NeuralWeaves Technologies Pvt Ltd

Office

WeWork Prestige Cube, Site No. 26 Laskar,
Hosur Rd, Adugodi, Bengaluru,
Karnataka 560030, India

Common reasons to reach out

I want to integrate VXID into my platform

I need enterprise pricing for high volume

I want on-premise or dedicated deployment

I have a partnership or reseller inquiry

I want to discuss compliance or security

Send us an email