The unified audio layer for AI agents and Voice AI applications
Capture real-time microphone audio, play to speakers, or pipe anywhere (voice agents, WebSockets, or files) using Python, Node.js, or Rust. Built-in voice activity detection. Zero system dependencies. Zero setup.
Install with one command:
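Assuming the package is published under the name decibri on each registry (check the registry for the exact name), the install looks like:

```shell
# Python
pip install decibri

# Node.js
npm install decibri

# Rust
cargo add decibri
```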
Then start streaming:
```python
import decibri

with decibri.Microphone(sample_rate=16000) as mic:
    for chunk in mic:
        print(f"Got {len(chunk)} bytes")
        break
```
```javascript
const Decibri = require('decibri');

const mic = new Decibri({ sampleRate: 16000 });
mic.on('data', (chunk) => console.log(`Got ${chunk.length} bytes`));
setTimeout(() => mic.stop(), 5000);
```
```rust
use decibri::capture::{AudioCapture, CaptureConfig};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let capture = AudioCapture::new(CaptureConfig::default())?;
    let stream = capture.start()?;
    while let Ok(Some(chunk)) = stream.next_chunk(None) {
        println!("Got {} bytes", chunk.data.len());
    }
    Ok(())
}
```
Real-time audio capture and playback is harder than it should be
Designed for real-time systems
Built for real-time audio: pre-built binaries for Python, Node.js, and Rust mean no compilers, no system audio libraries, and no setup.
Cross-platform
macOS, Windows, and Linux with pre-built binaries. No build tools required on any platform.
Direct capture
100ms audio chunks by default (1600 frames at 16kHz). cpal captures directly from the OS audio layer with no subprocess, no shell, and no intermediate encoding.
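The chunk sizing above is simple arithmetic: at 16 kHz, 100 ms is 1600 frames, and mono int16 audio is 2 bytes per frame. A quick check:

```python
sample_rate = 16000      # frames (samples) per second
chunk_ms = 100           # default chunk duration
channels = 1             # mono
bytes_per_sample = 2     # int16 output

frames_per_chunk = sample_rate * chunk_ms // 1000
bytes_per_chunk = frames_per_chunk * channels * bytes_per_sample

print(frames_per_chunk)  # 1600
print(bytes_per_chunk)   # 3200
```

So each chunk yielded by the iterator in the quickstart is 3200 bytes at the default settings.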
Stream-native
Python iterators, Node.js Readable streams, and Rust channels. Pipe to files, WebSockets, or voice agents using each language's native idioms.
Zero dependencies
No system audio libraries, no build tools, no install scripts. The Rust core and cpal are compiled into a single package per language.
Type-safe
Bundled type definitions for Python (typing stubs), Node.js (TypeScript .d.ts), and Rust (native types). Full IDE autocomplete and inline documentation.
Configurable
Sample rate, channels, frames per buffer, device selection by index or name, int16 or float32 output, and optional voice activity detection with speech/silence events.
Audio output
Play audio through the system speaker. Decibri's Speaker API works as a standard write target you can pipe capture into for full duplex audio.
ML voice detection
Bundled Silero VAD v5 model for accurate speech detection in noisy environments. Runs locally in Rust via ONNX Runtime with no cloud API needed.
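Decibri's VAD uses the Silero model, not a loudness threshold. Still, to illustrate what a speech/silence decision over raw int16 PCM chunks looks like, here is a naive RMS energy gate (the threshold value is arbitrary, and this is not how Silero works; it assumes native little-endian byte order):

```python
import array
import math

def rms(pcm_bytes: bytes) -> float:
    """Root-mean-square amplitude of int16 PCM ('h' = signed 16-bit)."""
    samples = array.array("h", pcm_bytes)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_speech(pcm_bytes: bytes, threshold: float = 500.0) -> bool:
    # Naive energy gate; a real VAD like Silero models speech, not loudness.
    return rms(pcm_bytes) > threshold

silence = bytes(3200)  # one 100 ms chunk of digital silence
loud = array.array("h", [8000, -8000] * 800).tobytes()
print(is_speech(silence), is_speech(loud))  # False True
```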
Browser support
The Node.js package ships with browser support out of the box. Conditional exports serve an AudioWorklet implementation when bundled for browsers.
Pre-built binaries, zero setup
Pre-compiled native binaries ship inside the package. No build tools, no compilation, no post-install downloads.
Built for real-time voice applications
Record audio to disk
Decibri provides a one-line record_to_file() helper for the simple case, and full streaming control when you need it. Set the sample rate to match your target format. No encoding step, no intermediate buffers.
```python
import decibri

decibri.record_to_file("capture.wav", duration_seconds=10, sample_rate=16000)
```
```javascript
const Decibri = require('decibri');
const fs = require('fs');

const mic = new Decibri({ sampleRate: 16000, channels: 1 });
mic.pipe(fs.createWriteStream('capture.raw'));
```
```rust
use decibri::capture::{AudioCapture, CaptureConfig};
use std::fs::File;
use std::io::Write;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("capture.raw")?;
    let capture = AudioCapture::new(CaptureConfig::default())?;
    let stream = capture.start()?;
    while let Ok(Some(chunk)) = stream.next_chunk(None) {
        file.write_all(&chunk.data)?;
    }
    Ok(())
}
```
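The Node.js and Rust examples above write headerless raw PCM (capture.raw). If you need a .wav file from that data, Python's standard-library wave module can add the header; this sketch assumes 16 kHz mono int16, matching the capture config:

```python
import wave

def raw_pcm_to_wav(raw_path: str, wav_path: str,
                   sample_rate: int = 16000, channels: int = 1) -> None:
    """Wrap headerless int16 PCM in a WAV container."""
    with open(raw_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)        # int16 = 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)

# Example: wrap one second of silence
with open("capture.raw", "wb") as f:
    f.write(bytes(16000 * 2))
raw_pcm_to_wav("capture.raw", "capture.wav")
```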
Audio infrastructure for real-time applications
Decibri sits between your application and the operating system. It captures from microphones, plays to speakers, and runs voice activity detection, all in real-time. Use it when you need predictable, low-latency audio I/O without managing system dependencies or platform differences.
Decibri is built for:
- ✓ Real-time microphone capture in Python, Node.js, or Rust applications
- ✓ Streaming PCM audio to speech-to-text services, voice agents, or WebSocket pipelines
- ✓ Local wake word detection with bundled Silero VAD
- ✓ Recording audio to file for ASR batch jobs or CI smoke tests
- ✓ Cross-platform deployment (Windows, macOS, Linux) without system audio dependencies
- ✓ Server-side audio in voice agent backends, CLI tools, and edge devices
Works with your existing stack
Decibri outputs raw 16-bit PCM, the standard format expected by speech and audio processing engines. See the integration guides for working examples.
- AssemblyAI (Cloud): Real-time cloud speech-to-text with turn-based transcription
- AWS Transcribe (Cloud): Real-time cloud speech-to-text with Amazon Transcribe streaming
- Azure (Cloud): Real-time cloud speech-to-text with Azure Speech-to-Text
- Deepgram (Cloud): Real-time cloud speech-to-text with Deepgram's Nova-3 model
- Google (Cloud): Real-time cloud speech-to-text with Google Cloud Speech-to-Text
- Mistral (Cloud): Real-time cloud speech-to-text with the Voxtral open-weights model
- OpenAI (Cloud): Real-time cloud speech-to-text with OpenAI's Realtime API
- Sherpa-ONNX (Local): Speech-to-text, keyword spotting, and voice activity detection, all offline
- Whisper.cpp (Local): Local speech-to-text transcription with OpenAI's Whisper model
Decibri outputs raw PCM, the standard format used by Vosk, openWakeWord, and any audio processing engine.
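Some engines want float32 samples rather than raw int16 bytes. The conversion is a division by 32768; a dependency-free sketch using the standard library (NumPy's frombuffer does the same thing faster):

```python
import array

def int16_to_float(pcm_bytes: bytes) -> list[float]:
    """Convert int16 PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h", pcm_bytes)  # 'h' = signed 16-bit
    return [s / 32768.0 for s in samples]

chunk = array.array("h", [0, 16384, -32768, 32767]).tobytes()
print(int16_to_float(chunk))  # [0.0, 0.5, -1.0, 0.999969482421875]
```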
Products powered by Decibri
Tools that use Decibri as their audio capture layer, from voice agents to wake phrase detection.
Voice-powered terminal agent. Speak commands, get answers. Fully offline using whisper.cpp for local transcription and Ollama for local LLM inference. No cloud, no API keys.
Voice activation for code editors. Say "Hey Claude" and Claude opens. Say "Hey Copilot" and Copilot opens. Fully local wake phrase detection using sherpa-onnx. Zero config.
Common questions
What audio format does Decibri output?
16-bit signed integer PCM, little-endian by default. A 32-bit float format is also available. The default is the raw format expected by most speech and wake-word engines, including Vosk, sherpa-onnx, and whisper.cpp.
Do I need to install system audio libraries?
No. Decibri uses cpal for direct OS audio access. There are no system audio libraries to install.
Do I need build tools or a compiler to install?
No. Pre-built binaries ship with the Python package (wheels) and the Node.js package (optional platform packages); the Rust crate compiles with cargo as part of your normal build. No C compilers, no node-gyp, no Visual Studio Build Tools, no Xcode required.
Does Decibri work in the browser?
Yes. Decibri works in both Node.js and the browser from a single package. When bundled for the browser (webpack, Vite, etc.), conditional exports serve the browser implementation automatically. No separate package needed. See the browser documentation.
Can I select a specific microphone?
Yes. Use Decibri.devices() to list available input devices, then pass the device index or a name substring to the constructor options.
Which Node.js versions are supported?
Node.js 18 and above.
What license is Decibri released under?
Apache-2.0.
Which Python versions and platforms are supported?
Python 3.10 and above on Windows, macOS, and Linux. Prebuilt wheels are published for Linux x64, Linux ARM64, macOS Apple Silicon, and Windows x64.
Does Decibri support asyncio?
Yes. Decibri ships AsyncMicrophone, AsyncSpeaker, and async_record_to_file() as proper async APIs: the classes work as async context managers and support async for iteration. Audio capture runs on a separate thread, so the event loop isn't blocked. The recommended pattern is async with await decibri.AsyncMicrophone.open(...) as mic, then async for chunk in mic. The open() factory also dispatches the optional Silero VAD model load off the event loop.
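As a sketch of that pattern, the consumer below works with any async iterator of byte chunks, so it is shown here with a stub source rather than a live microphone (the Decibri call in the docstring follows the pattern from the answer above):

```python
import asyncio

async def consume(mic) -> int:
    """Drain an async iterator of PCM chunks; returns total bytes seen.

    With Decibri this would be driven as:
        async with await decibri.AsyncMicrophone.open(sample_rate=16000) as mic:
            total = await consume(mic)
    """
    total = 0
    async for chunk in mic:
        total += len(chunk)
    return total

# Stub standing in for AsyncMicrophone: three 3200-byte chunks.
async def fake_mic():
    for _ in range(3):
        yield bytes(3200)

print(asyncio.run(consume(fake_mic())))  # 9600
```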
Why is Decibri written in Rust?
Rust gives Decibri direct OS audio access via cpal, a cross-platform Rust audio library. The Rust core compiles to native code on every supported platform, so the same audio engine powers the Python, Node.js, and Rust packages. Memory safety enables real-time audio code without manual buffer management. The result: predictable, low-latency audio with zero system dependencies.
Start streaming audio in minutes
One install. One import. Real-time microphone audio.