Voice activity detection (VAD)

Decibri ships with two built-in VAD implementations. No cloud API, no extra install, no separate model download.

What’s built in

RMS energy detector. Fast, minimal CPU, works well for clean audio. Select with the 'energy' mode.
Silero v5 neural model (bundled). More accurate in noisy environments, with background music, or in multi-speaker scenarios. Select with the 'silero' mode.

The energy detector is a lightweight RMS computation. The Silero neural model runs locally via ONNX Runtime, and its ONNX file (~2.3 MB) ships inside the decibri package on both PyPI and npm, so there is no sherpa-onnx dependency.
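To give a feel for what an RMS energy check involves, here is a minimal sketch. It is not decibri's implementation: the helper names `rms` and `is_speech`, the 16-bit PCM input format, and the 0.01 threshold are all assumptions chosen for illustration.

```python
import math
from array import array


def rms(samples):
    """Root-mean-square level of 16-bit PCM samples, scaled to 0.0-1.0."""
    if len(samples) == 0:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / 32768.0


def is_speech(samples, threshold=0.01):
    """Flag a frame as speech when its RMS level reaches the threshold.

    The threshold is a hypothetical value, not decibri's default.
    """
    return rms(samples) >= threshold


# Synthetic frames at 16 kHz: digital silence and a 440 Hz tone at half scale.
silence = array("h", [0] * 512)
tone = array(
    "h",
    [int(16384 * math.sin(2 * math.pi * 440 * n / 16000)) for n in range(512)],
)

print(is_speech(silence), is_speech(tone))
```

A fixed threshold like this is why an energy detector suits clean audio: any loud enough sound, speech or not, crosses it.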

Quick example

import decibri

# RMS energy detector. Fast, clean-audio use cases.
mic_rms = decibri.Microphone(vad="energy")

# Silero neural model. Better accuracy under noise / music / multi-speaker.
mic_silero = decibri.Microphone(vad="silero")

# Python exposes VAD state as properties.
with mic_silero:
    for chunk in mic_silero:
        if mic_silero.is_speaking:
            print("[speech]")

const { Microphone } = require('decibri');

// RMS energy detector. Fast, clean-audio use cases.
const micRms = new Microphone({ vad: 'energy' });

// Silero neural model. Better accuracy under noise / music / multi-speaker.
const micSilero = new Microphone({ vad: 'silero' });

// Node emits 'speech' / 'silence' events.
micSilero.on('speech',  () => console.log('[speech start]'));
micSilero.on('silence', () => console.log('[speech end]'));
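Since the Python example reads VAD state from a property while Node emits events, one way to get start and end notifications in Python is to watch for changes in the per-chunk flag. The sketch below is an illustration only: `speech_transitions` is a hypothetical helper, not part of decibri, and it works on a plain list of booleans so it can run without a microphone.

```python
def speech_transitions(flags):
    """Yield (index, "speech" | "silence") each time the speaking flag flips.

    `flags` is any iterable of booleans, one per audio chunk.
    The state before the first chunk is assumed to be silence.
    """
    previous = False
    for index, speaking in enumerate(flags):
        if speaking and not previous:
            yield index, "speech"
        elif previous and not speaking:
            yield index, "silence"
        previous = speaking


# Simulated per-chunk values, standing in for a VAD state read once per chunk.
flags = [False, True, True, False, False, True]
events = list(speech_transitions(flags))
print(events)
```

In the Python loop from the quick example, the value fed in per chunk would be `mic_silero.is_speaking`.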

Learn more

Silero VAD

Full walkthrough of the bundled neural VAD, including configuration options and notes on using sherpa-onnx separately if you need its VAD API directly.

VAD configuration options are documented in full in the Python and Node.js API references.
