Getting started with Decibri

Decibri captures audio and gives you PCM chunks to work with. Install the package, open a microphone, and read chunks in your language of choice. This page walks through the Python and Node.js workflows side by side. Tabs above each code block switch between them. For browser usage, see the Browser API; for the command-line tool, see the CLI API.

Install

$ pip install decibri

The install command above switches with the language tabs below. Pre-built binaries are included for Windows x64, macOS arm64, and Linux x64/arm64. No build tools or system audio libraries are needed. The Python wheel requires Python 3.10 or newer.

Quick start

Capture microphone audio and log the size of each chunk:

import decibri

with decibri.Microphone(sample_rate=16000, channels=1) as mic:
    for chunk in mic:
        # chunk is bytes of 16-bit PCM audio
        print(f"Received {len(chunk)} bytes")
        break  # exit after first chunk for demo

Each chunk contains 100 ms of audio by default (1,600 frames at 16 kHz). The output format is 16-bit signed integer PCM, little-endian, which is the standard input expected by most speech engines. In Python, iterate the Microphone directly; the with block calls stop() on exit. In Node.js, subscribe to the 'data' event and call mic.stop() when you are done.
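
To make the chunk arithmetic concrete, the following sketch computes the byte size of a chunk and decodes one into signed 16-bit samples using only the standard library. The helper names chunk_size_bytes and decode_pcm16 are illustrative and not part of Decibri, and fake_chunk stands in for a chunk yielded by the microphone.

```python
import struct


def chunk_size_bytes(sample_rate: int, channels: int, chunk_ms: int = 100) -> int:
    """Bytes in one chunk of 16-bit PCM (2 bytes per sample)."""
    frames = sample_rate * chunk_ms // 1000
    return frames * channels * 2


def decode_pcm16(chunk: bytes) -> list[int]:
    """Decode little-endian signed 16-bit PCM into a list of Python ints."""
    count = len(chunk) // 2
    return list(struct.unpack(f"<{count}h", chunk[: count * 2]))


# Stand-in for a chunk from the microphone: three hand-made samples.
fake_chunk = struct.pack("<3h", 0, 1000, -1000)
samples = decode_pcm16(fake_chunk)

print(chunk_size_bytes(16000, 1))  # 1,600 frames * 1 channel * 2 bytes
print(samples)
```

At the defaults used above (16 kHz, mono, 100 ms), each chunk therefore holds 3,200 bytes. For multi-channel audio the decoded samples are interleaved, one sample per channel per frame.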

Record audio to a file

Decibri ships a one-line helper in Python that captures microphone audio straight to a 16-bit PCM WAV file. In Node.js, the Microphone instance is a standard Readable stream, so you can pipe it into any Writable stream, including fs.createWriteStream() for a raw PCM file.

import decibri

decibri.record_to_file("capture.wav", duration_seconds=10, sample_rate=44100, channels=2)

The Python helper bounds the recording by duration_seconds and writes a WAV file with a header. The Node.js example writes a raw PCM stream, with no header, that runs until the process exits or the writable stream is closed.
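
The difference between raw PCM and WAV is only the header. As a rough sketch, raw 16-bit PCM that you already hold in memory can be wrapped in a WAV container with Python's standard wave module. The function name pcm_to_wav_bytes is hypothetical and not part of Decibri; the silent buffer stands in for captured audio.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int, channels: int) -> bytes:
    """Wrap raw 16-bit PCM in a WAV container and return the file contents."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


# Stand-in for captured audio: one second of mono silence at 16 kHz.
pcm = b"\x00\x00" * 16000
wav_bytes = pcm_to_wav_bytes(pcm, sample_rate=16000, channels=1)

# The result starts with the RIFF/WAVE header, followed by the PCM data.
print(wav_bytes[:4], wav_bytes[8:12], len(wav_bytes) - len(pcm))
```

The sample rate and channel count passed here must match the values the audio was captured with, otherwise the file plays back at the wrong speed or with scrambled channels.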

Select a microphone

List available input devices and choose one by index or name:

import decibri

# List all input devices
devices = decibri.input_devices()
print(devices)

# Select by index
mic = decibri.Microphone(device=2, sample_rate=16000, channels=1)

# Or select by name substring (case-insensitive)
mic2 = decibri.Microphone(device="USB", sample_rate=16000, channels=1)

Voice activity detection

Enable the built-in VAD based on RMS energy thresholding. In Python, check the mic.is_speaking property on each chunk; in Node.js, subscribe to the 'speech' and 'silence' events.

import decibri

with decibri.Microphone(
    sample_rate=16000,
    channels=1,
    vad="energy",
    vad_threshold=0.01,
    vad_holdoff_ms=300,
) as mic:
    for chunk in mic:
        if mic.is_speaking:
            print(f"Speech detected (score {mic.vad_score:.3f})")
        else:
            print("Silence")
        break  # exit after first chunk for demo

For more accurate detection in noisy environments, use the built-in Silero VAD (see below).
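
For intuition about what RMS energy thresholding involves, here is a minimal standalone sketch of the general technique. It is not Decibri's implementation. It assumes the energy is the RMS of the samples normalized to the range 0.0 to 1.0, and that the hold-off keeps the speaking state active until that much continuous silence has passed; neither detail is confirmed by this page. The names rms_energy and EnergyGate are hypothetical.

```python
import math
import struct


def rms_energy(chunk: bytes) -> float:
    """RMS of 16-bit little-endian PCM, normalized to 0.0..1.0 (assumed scale)."""
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", chunk[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    return math.sqrt(mean_square) / 32768.0


class EnergyGate:
    """Threshold detector with a hold-off (illustrative, not Decibri's logic)."""

    def __init__(self, threshold: float = 0.01, holdoff_ms: int = 300, chunk_ms: int = 100):
        self.threshold = threshold
        # Number of consecutive quiet chunks required before leaving speech.
        self.holdoff_chunks = math.ceil(holdoff_ms / chunk_ms)
        self.quiet_chunks = 0
        self.is_speaking = False

    def update(self, chunk: bytes) -> bool:
        if rms_energy(chunk) >= self.threshold:
            self.is_speaking = True
            self.quiet_chunks = 0
        elif self.is_speaking:
            self.quiet_chunks += 1
            if self.quiet_chunks >= self.holdoff_chunks:
                self.is_speaking = False
        return self.is_speaking


loud = struct.pack("<4h", 8000, -8000, 8000, -8000)
quiet = bytes(8)  # four zero-valued samples

gate = EnergyGate()
states = [gate.update(c) for c in (loud, quiet, quiet, quiet)]
print(states)
```

A detector of this kind reacts to any loud sound, not only to speech, which is why it tends to misfire when background noise is high.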

Silero voice activity detection

Decibri bundles the Silero VAD v5 ONNX model for ML-based speech detection. Silero is more accurate than energy-based VAD in noisy environments. Set vad="silero" in Python or vad: 'silero' in Node.js. The model runs locally via ONNX Runtime with no cloud API needed.

import decibri

with decibri.Microphone(
    sample_rate=16000,
    channels=1,
    vad="silero",
) as mic:
    for chunk in mic:
        if mic.is_speaking:
            print(f"Speech detected (score {mic.vad_score:.3f})")
        else:
            print("Silence")
        break  # exit after first chunk for demo

In Silero mode, the default threshold changes from 0.01 to 0.5 (probability scale 0.0 to 1.0). Override it by setting vad_threshold in Python or vadThreshold in Node.js.

The bundled model is auto-resolved from the package. To use a custom model, pass model_path (Python) or modelPath (Node.js) pointing to your .onnx file.

Audio output

Decibri's audio output uses the same native audio backend as capture. In Python the class is decibri.Speaker; in Node.js it is Speaker, a standard Writable stream.

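import decibri

# One second of 16-bit mono silence at 16 kHz; substitute your own PCM data
pcm_bytes = b"\x00\x00" * 16000

with decibri.Speaker(sample_rate=16000, channels=1) as spk:
    spk.write(pcm_bytes)
    spk.drain()  # block until all buffered audio has played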

The sample rate, channel count, and sample format must match the audio data you are writing. Use stop() to immediately stop playback and discard pending audio. To play everything currently buffered, call drain() in Python (the speaker stays open and can be written to again, and closes at the end of the with block) or end() in Node.js (the stream finishes and emits 'finish').
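
If you want something audible to try the speaker with, the sketch below builds one second of a 440 Hz tone as 16-bit little-endian mono PCM using only the standard library. The helper sine_pcm16 is illustrative and not part of Decibri; the resulting bytes match a speaker opened with sample_rate=16000 and channels=1 and could be passed to spk.write().

```python
import math
import struct


def sine_pcm16(freq_hz: float, duration_s: float, sample_rate: int = 16000,
               amplitude: float = 0.3) -> bytes:
    """Generate a mono sine tone as 16-bit little-endian PCM bytes."""
    frame_count = int(duration_s * sample_rate)
    peak = int(amplitude * 32767)  # keep well below full scale to avoid clipping
    samples = (
        int(peak * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(frame_count)
    )
    return struct.pack(f"<{frame_count}h", *samples)


pcm_bytes = sine_pcm16(440.0, 1.0)
print(len(pcm_bytes))  # 16,000 frames * 2 bytes per sample
```

Generating the tone at a different rate than the speaker was opened with would shift its pitch, which is a quick way to spot a sample rate mismatch.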

Full duplex

Capture and play back simultaneously: read from the microphone and write each chunk to the speaker.

import decibri

with decibri.Microphone(sample_rate=16000, channels=1) as mic, \
     decibri.Speaker(sample_rate=16000, channels=1) as spk:
    for chunk in mic:
        spk.write(chunk)  # hear yourself in real time

Platform support

Pre-built binaries ship inside the Python wheel and the npm package. No build tools, no compilation, no post-install downloads.

Platform      Architecture             Audio backend
Windows 11    x64                      WASAPI
macOS         arm64 (Apple Silicon)    CoreAudio
Linux         x64                      ALSA
Linux         arm64                    ALSA

If no pre-built binary is available for your platform, installation will fail. There is no source build fallback.