The unified audio layer for AI agents and Voice AI applications
Capture real-time microphone audio, play to speakers, or pipe anywhere (voice agents, WebSockets, or files) using Python, Node.js, or Rust. Built-in voice activity detection. Zero system dependencies. Zero setup.
Install with one command:
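Assuming the package is published under the name decibri on each registry (check the registry for the exact name), the install looks like:

```shell
# Python
pip install decibri

# Node.js
npm install decibri

# Rust
cargo add decibri
```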
Then start streaming:
```python
import decibri

with decibri.Microphone(sample_rate=16000) as mic:
    for chunk in mic:
        print(f"Got {len(chunk)} bytes")
        break
```
```javascript
const Decibri = require('decibri');

const mic = new Decibri({ sampleRate: 16000 });
mic.on('data', (chunk) => console.log(`Got ${chunk.length} bytes`));
setTimeout(() => mic.stop(), 5000);
```
```rust
use decibri::capture::{AudioCapture, CaptureConfig};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let capture = AudioCapture::new(CaptureConfig::default())?;
    let stream = capture.start()?;
    while let Ok(Some(chunk)) = stream.next_chunk(None) {
        println!("Got {} bytes", chunk.data.len());
    }
    Ok(())
}
```
Real-time audio capture and playback is harder than it should be
Designed for real-time systems
Built for real-time audio: pre-built binaries for Python, Node.js, and Rust mean no compilers, no system audio libraries, and no setup.
Cross-platform
macOS, Windows, and Linux with pre-built binaries. No build tools required on any platform.
Direct capture
100ms audio chunks by default (1600 frames at 16kHz). cpal captures directly from the OS audio layer with no subprocess, no shell, and no intermediate encoding.
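The chunk sizing above is simple arithmetic: at 16 kHz, 100 ms is 1600 frames, and mono int16 audio is 2 bytes per frame. A quick check:

```python
sample_rate = 16000      # frames (samples) per second
chunk_ms = 100           # default chunk duration
channels = 1             # mono
bytes_per_sample = 2     # int16 output

frames_per_chunk = sample_rate * chunk_ms // 1000
bytes_per_chunk = frames_per_chunk * channels * bytes_per_sample

print(frames_per_chunk)  # 1600
print(bytes_per_chunk)   # 3200
```

So each chunk yielded by the iterator in the quickstart is 3200 bytes at the default settings.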
Stream-native
Python iterators, Node.js Readable streams, and Rust channels. Pipe to files, WebSockets, or voice agents using each language's native idioms.
Zero dependencies
No system audio libraries, no build tools, no install scripts. The Rust core and cpal are compiled into a single package per language.
Type-safe
Bundled type definitions for Python (typing stubs), Node.js (TypeScript .d.ts), and Rust (native types). Full IDE autocomplete and inline documentation.
Configurable
Sample rate, channels, frames per buffer, device selection by index or name, int16 or float32 output, and optional voice activity detection with speech/silence events.
Audio output
Play audio through the system speaker. Decibri's Speaker API works as a standard write target you can pipe capture into for full duplex audio.
ML voice detection
Bundled Silero VAD v5 model for accurate speech detection in noisy environments. Runs locally in Rust via ONNX Runtime with no cloud API needed.
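Decibri's VAD uses the Silero model, not a loudness threshold. Still, to illustrate what a speech/silence decision over raw int16 PCM chunks looks like, here is a naive RMS energy gate (the threshold value is arbitrary, and this is not how Silero works; it assumes native little-endian byte order):

```python
import array
import math

def rms(pcm_bytes: bytes) -> float:
    """Root-mean-square amplitude of int16 PCM ('h' = signed 16-bit)."""
    samples = array.array("h", pcm_bytes)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_speech(pcm_bytes: bytes, threshold: float = 500.0) -> bool:
    # Naive energy gate; a real VAD like Silero models speech, not loudness.
    return rms(pcm_bytes) > threshold

silence = bytes(3200)  # one 100 ms chunk of digital silence
loud = array.array("h", [8000, -8000] * 800).tobytes()
print(is_speech(silence), is_speech(loud))  # False True
```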
Browser support
The Node.js package ships with browser support out of the box. Conditional exports serve an AudioWorklet implementation when bundled for browsers.
Pre-built binaries, zero setup
Pre-compiled native binaries ship inside the package. No build tools, no compilation, no post-install downloads.
Built for real-time voice applications
Record audio to disk
Decibri provides a one-line record_to_file() helper for the simple case, and full streaming control when you need it. Set the sample rate to match your target format. No encoding step, no intermediate buffers.
```python
import decibri

decibri.record_to_file("capture.wav", duration_seconds=10, sample_rate=16000)
```
```javascript
const Decibri = require('decibri');
const fs = require('fs');

const mic = new Decibri({ sampleRate: 16000, channels: 1 });
mic.pipe(fs.createWriteStream('capture.raw'));
```
```rust
use decibri::capture::{AudioCapture, CaptureConfig};
use std::fs::File;
use std::io::Write;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("capture.raw")?;
    let capture = AudioCapture::new(CaptureConfig::default())?;
    let stream = capture.start()?;
    while let Ok(Some(chunk)) = stream.next_chunk(None) {
        file.write_all(&chunk.data)?;
    }
    Ok(())
}
```
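The Node.js and Rust examples above write headerless raw PCM (capture.raw). If you need a .wav file from that data, Python's standard-library wave module can add the header; this sketch assumes 16 kHz mono int16, matching the capture config:

```python
import wave

def raw_pcm_to_wav(raw_path: str, wav_path: str,
                   sample_rate: int = 16000, channels: int = 1) -> None:
    """Wrap headerless int16 PCM in a WAV container."""
    with open(raw_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)        # int16 = 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)

# Example: wrap one second of silence
with open("capture.raw", "wb") as f:
    f.write(bytes(16000 * 2))
raw_pcm_to_wav("capture.raw", "capture.wav")
```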
Audio infrastructure for real-time applications
Decibri sits between your application and the operating system. It captures from microphones, plays to speakers, and runs voice activity detection, all in real-time. Use it when you need predictable, low-latency audio I/O without managing system dependencies or platform differences.
Decibri is built for:
- ✓ Real-time microphone capture in Python, Node.js, or Rust applications
- ✓ Streaming PCM audio to speech-to-text services, voice agents, or WebSocket pipelines
- ✓ Local wake word detection with bundled Silero VAD
- ✓ Recording audio to file for ASR batch jobs or CI smoke tests
- ✓ Cross-platform deployment (Windows, macOS, Linux) without system audio dependencies
- ✓ Server-side audio in voice agent backends, CLI tools, and edge devices
Works with your existing stack
Decibri outputs raw 16-bit PCM, the standard format expected by speech and audio processing engines. See the integration guides for working examples.
- AssemblyAI (Cloud): Real-time cloud speech-to-text with turn-based transcription
- AWS Transcribe (Cloud): Real-time cloud speech-to-text with Amazon Transcribe streaming
- Azure (Cloud): Real-time cloud speech-to-text with Azure Speech-to-Text
- Deepgram (Cloud): Real-time cloud speech-to-text with Deepgram's Nova-3 model
- Google (Cloud): Real-time cloud speech-to-text with Google Cloud Speech-to-Text
- Mistral (Cloud): Real-time cloud speech-to-text with the Voxtral open-weights model
- OpenAI (Cloud): Real-time cloud speech-to-text with OpenAI's Realtime API
- Sherpa-ONNX (Local): Speech-to-text, keyword spotting, and voice activity detection, all offline
- Whisper.cpp (Local): Local speech-to-text transcription with OpenAI's Whisper model
Decibri outputs raw PCM, the standard format used by Vosk, openWakeWord, and any audio processing engine.
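Some engines want float32 samples rather than raw int16 bytes. The conversion is a division by 32768; a dependency-free sketch using the standard library (NumPy's frombuffer does the same thing faster):

```python
import array

def int16_to_float(pcm_bytes: bytes) -> list[float]:
    """Convert int16 PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h", pcm_bytes)  # 'h' = signed 16-bit
    return [s / 32768.0 for s in samples]

chunk = array.array("h", [0, 16384, -32768, 32767]).tobytes()
print(int16_to_float(chunk))  # [0.0, 0.5, -1.0, 0.999969482421875]
```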
Products powered by Decibri
Tools that use Decibri as their audio capture layer, from voice agents to wake phrase detection.
Voice-powered terminal agent. Speak commands, get answers. Fully offline using whisper.cpp for local transcription and Ollama for local LLM inference. No cloud, no API keys.
Voice activation for code editors. Say "Hey Claude" and Claude opens. Say "Hey Copilot" and Copilot opens. Fully local wake phrase detection using sherpa-onnx. Zero config.
Common questions
What audio format does Decibri output?
16-bit signed integer PCM, little-endian by default. A 32-bit float format is also available. The default is the raw format expected by most speech and wake-word engines, including Vosk, sherpa-onnx, and whisper.cpp.
Do I need to install system audio libraries?
No. Decibri uses cpal for direct OS audio access. There are no system audio libraries to install.
Do I need build tools or a compiler to install?
No. Pre-built binaries ship with the Python package (wheels) and the Node.js package (optional platform packages); the Rust crate compiles with cargo as part of your normal build. No C compilers, no node-gyp, no Visual Studio Build Tools, no Xcode required.
Does Decibri work in the browser?
Yes. Decibri works in both Node.js and the browser from a single package. When bundled for the browser (webpack, Vite, etc.), conditional exports serve the browser implementation automatically. No separate package needed. See the browser documentation.
Can I select a specific microphone?
Yes. Use Decibri.devices() to list available input devices, then pass the device index or a name substring to the constructor options.
Which Node.js versions are supported?
Node.js 18 and above.
What license is Decibri released under?
Apache-2.0.
Which Python versions and platforms are supported?
Python 3.10 and above on Windows, macOS, and Linux. Prebuilt wheels are published for Linux x64, Linux ARM64, macOS Apple Silicon, and Windows x64.
Does Decibri support asyncio?
Yes. Decibri ships AsyncMicrophone, AsyncSpeaker, and async_record_to_file() as proper async APIs: the classes work as async context managers and support async for iteration. Audio capture runs on a separate thread, so the event loop isn't blocked. The recommended pattern is async with await decibri.AsyncMicrophone.open(...) as mic, then async for chunk in mic. The open() factory also dispatches the optional Silero VAD model load off the event loop.
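As a sketch of that pattern, the consumer below works with any async iterator of byte chunks, so it is shown here with a stub source rather than a live microphone (the Decibri call in the docstring follows the pattern from the answer above):

```python
import asyncio

async def consume(mic) -> int:
    """Drain an async iterator of PCM chunks; returns total bytes seen.

    With Decibri this would be driven as:
        async with await decibri.AsyncMicrophone.open(sample_rate=16000) as mic:
            total = await consume(mic)
    """
    total = 0
    async for chunk in mic:
        total += len(chunk)
    return total

# Stub standing in for AsyncMicrophone: three 3200-byte chunks.
async def fake_mic():
    for _ in range(3):
        yield bytes(3200)

print(asyncio.run(consume(fake_mic())))  # 9600
```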
Why is Decibri written in Rust?
Rust gives Decibri direct OS audio access via cpal, a cross-platform Rust audio library. The Rust core compiles to native code on every supported platform, so the same audio engine powers the Python, Node.js, and Rust packages. Memory safety enables real-time audio code without manual buffer management. The result: predictable, low-latency audio with zero system dependencies.
Start streaming audio in minutes
One install. One import. Real-time microphone audio.