Voice identity infrastructure

Verify voice.
Match person.

Two API calls. Enroll a voiceprint. Verify the speaker. We tell you if it's the same person. What you build with it is up to you.

POST /v1/subjects/verify

// Response
{
  "verdict": "PASS",
  "confidence": 0.91,
  "layers": {
    "identity": 0.87,
    "authenticity": 0.96,
    "liveness": 0.91
  }
}

PASS in 340ms
The entire API

Two calls. That's it.

Send us voice. We tell you if it matches. No SDKs to install, no models to train, no infrastructure to manage.

Enroll a voiceprint

cURL

# Enroll a voiceprint
curl https://api.vxid.dev/v1/subjects/enroll \
  -H "X-API-Key: vx_live_k8x..." \
  -F audio=@voice_sample.wav \
  -F subject_id="SUBJ-4821"

Python

import requests

response = requests.post(
    "https://api.vxid.dev/v1/subjects/enroll",
    headers={"X-API-Key": "vx_live_k8x..."},
    files={"audio": open("voice_sample.wav", "rb")},
    data={"subject_id": "SUBJ-4821"},
)
print(response.json())

Node.js

import fs from "node:fs";

const form = new FormData();
form.append("audio", new Blob([fs.readFileSync("voice_sample.wav")]), "voice_sample.wav");
form.append("subject_id", "SUBJ-4821");

const res = await fetch("https://api.vxid.dev/v1/subjects/enroll", {
  method: "POST",
  headers: { "X-API-Key": "vx_live_k8x..." },
  body: form,
});
console.log(await res.json());

Go

audioFile, _ := os.Open("voice_sample.wav")
defer audioFile.Close()

body := &bytes.Buffer{}
writer := multipart.NewWriter(body)
part, _ := writer.CreateFormFile("audio", "voice_sample.wav")
io.Copy(part, audioFile)
writer.WriteField("subject_id", "SUBJ-4821")
writer.Close()

req, _ := http.NewRequest("POST", "https://api.vxid.dev/v1/subjects/enroll", body)
req.Header.Set("X-API-Key", "vx_live_k8x...")
req.Header.Set("Content-Type", writer.FormDataContentType())
resp, _ := http.DefaultClient.Do(req)

Verify the speaker

cURL

# Verify the speaker
curl https://api.vxid.dev/v1/subjects/verify \
  -H "X-API-Key: vx_live_k8x..." \
  -F audio=@live_audio.wav \
  -F subject_id="SUBJ-4821"

Python

import requests

response = requests.post(
    "https://api.vxid.dev/v1/subjects/verify",
    headers={"X-API-Key": "vx_live_k8x..."},
    files={"audio": open("live_audio.wav", "rb")},
    data={
        "subject_id": "SUBJ-4821",
        "session_id": "sess_7f14...",  # optional: multi-check
    },
)
verdict = response.json()["verdict"]

Node.js

import fs from "node:fs";

const form = new FormData();
form.append("audio", new Blob([fs.readFileSync("live_audio.wav")]), "live_audio.wav");
form.append("subject_id", "SUBJ-4821");
form.append("session_id", "sess_7f14..."); // optional

const res = await fetch("https://api.vxid.dev/v1/subjects/verify", {
  method: "POST",
  headers: { "X-API-Key": "vx_live_k8x..." },
  body: form,
});
const { verdict } = await res.json();

Go

audioFile, _ := os.Open("live_audio.wav")
defer audioFile.Close()

body := &bytes.Buffer{}
writer := multipart.NewWriter(body)
part, _ := writer.CreateFormFile("audio", "live_audio.wav")
io.Copy(part, audioFile)
writer.WriteField("subject_id", "SUBJ-4821")
writer.WriteField("session_id", "sess_7f14...") // optional
writer.Close()

req, _ := http.NewRequest("POST", "https://api.vxid.dev/v1/subjects/verify", body)
req.Header.Set("X-API-Key", "vx_live_k8x...")
req.Header.Set("Content-Type", writer.FormDataContentType())
resp, _ := http.DefaultClient.Do(req)

Response

// Response
{
  "verdict": "PASS",
  "score": 0.84,
  "layers": {
    "identity": 0.87,
    "authenticity": 0.96,
    "liveness": 0.91
  },
  "session_id": "sess_7f14..."
}
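Once that JSON comes back, gating a request on it takes a few lines. A minimal sketch in Python: the response shape follows the example above, but the pass/threshold policy is an illustrative assumption on the client side, not part of the API.

```python
# Sketch: act on a verify response of the shape shown above.
# The minimum-score policy is our own illustrative choice.

def handle_verification(result: dict, min_score: float = 0.80) -> bool:
    """Accept only if the API verdict is PASS and the overall score
    clears our own minimum. Assumes the response shape shown above."""
    if result["verdict"] != "PASS":
        return False
    if result["score"] < min_score:
        return False
    # Optionally inspect individual layers for logging or auditing.
    layers = result["layers"]
    print(f"identity={layers['identity']:.2f} "
          f"authenticity={layers['authenticity']:.2f} "
          f"liveness={layers['liveness']:.2f}")
    return True

example = {
    "verdict": "PASS",
    "score": 0.84,
    "layers": {"identity": 0.87, "authenticity": 0.96, "liveness": 0.91},
    "session_id": "sess_7f14...",
}
print(handle_verification(example))  # True
```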

Three layers of defense.

Every verification passes through three independent checks. Each layer catches what the others miss.

01

Identity

AI-powered voiceprint matching. Compares the live speaker against the enrolled voiceprint. Language-agnostic: works across accents and languages.

02

Authenticity

Deepfake and synthetic speech detection. Catches voice cloning, text-to-speech, replay attacks, and AI-generated audio. Detects what human ears can't.

03

Liveness

Behavioral voice analysis. Breathing patterns, micro-hesitations, response latency — natural speech characteristics that synthetic speech can't replicate.
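One way to picture how the three layers combine: each produces a score, and a request passes only if every layer clears its own bar. The sketch below is illustrative; the threshold values are assumptions for the example, not VXID's published internals.

```python
# Illustrative only: combine three per-layer scores into a verdict.
# Threshold values are assumptions, not VXID's actual configuration.

THRESHOLDS = {"identity": 0.80, "authenticity": 0.70, "liveness": 0.60}

def verdict(layers: dict) -> str:
    """PASS only if every layer clears its threshold. Each layer
    catches what the others miss, so one failure fails the check."""
    failed = [name for name, t in THRESHOLDS.items() if layers[name] < t]
    return "PASS" if not failed else f"FAIL ({', '.join(failed)})"

# A high-quality clone: matches the voiceprint (layer 1) but is
# caught by synthetic-speech detection (layer 2).
print(verdict({"identity": 0.85, "authenticity": 0.31, "liveness": 0.70}))
# -> FAIL (authenticity)

# A genuine live speaker.
print(verdict({"identity": 0.87, "authenticity": 0.96, "liveness": 0.91}))
# -> PASS
```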

Your product. Our verification.

VXID is the identity layer underneath. What you build on top is up to you.

Banking & KYC
Hiring & Screening
Proctoring & Exams
Telehealth & Patient ID
VXID API api.vxid.dev
Phone calls
Web apps
Mobile apps
IoT devices
The science

What we hear vs. what you hear.

Human ears judge voice by surface features — pitch, accent, rhythm. VXID measures the physics underneath: the shape of your vocal tract, the resonance of your throat, the structure of your glottal pulse.

What humans hear

Pitch — high or low voice
Accent — regional, national
Speaking speed and rhythm
Tone — warm, harsh, nasal
Vocabulary and catchphrases
Emotional expression
A mimicry artist controls all of these. That's why they fool humans.

What VXID measures

Formant frequencies — resonances of the vocal tract
Harmonic structure — overtone patterns from vocal cords
Spectral envelope — energy distribution across frequencies
Glottal pulse shape — how vocal cords open and close
192-dimensional embedding — a mathematical voice fingerprint
These are determined by anatomy. No talent can change your throat length.
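The 192-dimensional embedding is what actually gets compared at verification time. The standard way to compare such vectors is cosine similarity; here is a generic sketch. Only the dimension comes from the text above; the synthetic vectors and everything else are illustrative, not VXID internals.

```python
# Generic sketch of embedding comparison via cosine similarity.
# Only the 192 dimension comes from the text; the rest is illustrative.
import math
import random

DIM = 192  # embedding dimension quoted above

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embeddings:
    1.0 = same direction, near 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

random.seed(0)
enrolled = [random.gauss(0, 1) for _ in range(DIM)]
same_person = [x + random.gauss(0, 0.1) for x in enrolled]  # small drift
stranger = [random.gauss(0, 1) for _ in range(DIM)]

print(round(cosine_similarity(enrolled, same_person), 2))  # close to 1.0
print(round(cosine_similarity(enrolled, stranger), 2))     # close to 0.0
```

A real system compares the live embedding against the enrolled one and accepts above a tuned threshold, which is what the 0.85 figures in the next section refer to.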

Can it be fooled?

We get asked this a lot. Here's the honest answer for every scenario, backed by published research.

Brothers and siblings

Siblings share some genetic vocal characteristics, but vocal tracts develop differently through years of different usage, diet, and environment. The system distinguishes siblings even when humans hear a family resemblance.

Typical scores
Stranger: 0.22
Sibling: 0.42
Same person: 0.85 (match threshold)
Cannot fool VXID

Identical twins

The hardest case in biometrics. Identical twins share 100% DNA. Research shows impostor twins score 0.60–0.75 — close to threshold but still distinguishable, especially with longer audio samples (10+ seconds).

Typical scores
Twin (short sample): 0.68
Twin (10s+ sample): 0.58
Same person: 0.85
Edge case — longer samples help

Professional mimicry artist

A talented impressionist controls pitch, cadence, accent — everything humans perceive. But they cannot change their formant frequencies, determined by physical vocal tract anatomy. Research confirms even world-class impressionists cannot modify their formant loci.

Typical score
Mimicry: 0.35
Human judge: fooled

Humans are fooled. The AI is not.

Cannot fool VXID

Pitch shifter / voice changer

Simple pitch shifting moves the fundamental frequency but also shifts the formant structure unnaturally. The resulting embedding is different from both the original and the target.

Typical score
Pitch shifted: 0.28
Cannot fool VXID

AI voice cloning is different.

Unlike human mimicry, voice cloning software digitally reconstructs the spectral characteristics of a voice. A good clone can score 0.85+ on identity matching, fooling Layer 1 alone. This is why VXID has three layers, not one.

Layer 1 — Identity

Voiceprint match

A high-quality clone replicates the spectral characteristics that match the enrolled voiceprint.

Clone passes (score: 0.85)
Layer 2 — Authenticity

Synthetic detection

Detects flattened harmonics, unnatural spectral transitions, and the absence of the micro-variations present in all real speech.

Clone caught
Layer 3 — Liveness

Behavioral analysis

Real-time cloning introduces ~200ms latency, lacks natural breathing, and can't replicate cognitive-load variations.

Clone caught

Three hard problems stacked — match the voiceprint, avoid synthetic artifacts, replicate behavioral patterns. All within 500ms. That's the attacker's challenge.

Pricing

Pay per minute.

Billed per minute of audio processed. You control how often you verify. Enrollment is free. Start building in minutes.

Free
$0
100 minutes / month
  • 10 voiceprints
  • 10 verifications / day
  • Identity layer
  • Test API keys
  • Community support
Start building
Enterprise
Custom
volume pricing
  • Dedicated infrastructure
  • SLA guarantees
  • Cross-session fraud detection
  • On-premise deployment
Talk to us

Start verifying in minutes.

Get an API key. Send your first voice sample. Receive your first verdict. The rest is yours to build.

Get your API key

Built for builders.

VXID is voice identity infrastructure from NeuralWeaves Technologies Private Limited, a deep-tech AI research company based in Bengaluru, India. We build foundational AI systems that solve real problems at global scale.

VXID exists because every platform that touches voice — hiring, banking, exams, telehealth, government services — needs a way to verify that the person speaking is who they claim to be. We provide that verification as a simple API, and let you decide where to use it.

Our approach: do one thing, do it well. Enroll a voiceprint. Verify the speaker. Three layers of defense — identity matching, deepfake detection, behavioral liveness. Everything else is yours to build.

Our principles

Infrastructure, not application

We don't build interview tools, proctoring software, or banking apps. We provide the voice verification layer that all of them need.

Honest about limitations

We publish our accuracy data, explain edge cases (like identical twins), and tell you exactly what our three layers catch — and what they don't.

Privacy by design

Voiceprints are irreversible mathematical vectors — the original audio can't be reconstructed. Designed to support GDPR and DPDPA requirements. Right-to-delete built in.

One price, everywhere

$0.02/minute of audio processed. No regional pricing, no hidden fees, no sales calls required. Same API, same rate, whether you're in Bengaluru or Berlin.
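At a flat rate, monthly cost is just minutes of audio times $0.02. A quick sanity check, with made-up example volumes:

```python
RATE_PER_MINUTE = 0.02  # USD, flat rate stated above

def monthly_cost(verifications: int, avg_seconds_per_check: float) -> float:
    """Cost of a month of verifications at $0.02 per minute of audio."""
    minutes = verifications * avg_seconds_per_check / 60
    return minutes * RATE_PER_MINUTE

# e.g. 10,000 verifications a month at ~6 seconds of audio each:
print(f"${monthly_cost(10_000, 6):.2f}")  # $20.00
```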

Get in touch

Let's talk.

Whether you're evaluating VXID for your platform, need enterprise pricing, or just want to understand how voice identity verification works — we're here.

Company

NeuralWeaves Technologies Pvt Ltd

Office

WeWork Prestige Cube, Site No. 26 Laskar,
Hosur Rd, Adugodi, Bengaluru,
Karnataka 560030, India

Common reasons to reach out

I want to integrate VXID into my platform

I need enterprise pricing for high volume

I want on-premise or dedicated deployment

I have a partnership or reseller inquiry

I want to discuss compliance or security

Send us an email