This documentation helps you get up and running with decibri and integrate it into your voice and audio applications. Decibri is a cross-platform audio capture and output library available as a Python package, a Node.js package, a Rust library, a browser-side AudioWorklet, and a standalone CLI. The Rust core is built on cpal and is distributed with pre-built binaries. One audio engine, five ways to use it.
Decibri ships audio to three kinds of integrations: speech-to-text, voice activity detection, and keyword spotting. Nine STT providers are supported today (seven cloud, two local), plus built-in VAD and a local KWS engine.
Cloud speech-to-text with AssemblyAI's Universal-3 Pro streaming model.
Cloud speech-to-text with AWS Transcribe streaming over HTTP/2.
Cloud speech-to-text with Azure AI Speech using PushAudioInputStream.
Cloud speech-to-text with Deepgram's Nova-3 model over WebSocket.
Cloud speech-to-text with Google Cloud Speech-to-Text. The simplest integration: one pipe() call.
Cloud speech-to-text with Mistral's Voxtral open-weights model (Apache 2.0).
Stream audio to OpenAI's Realtime API for cloud speech-to-text. 24 kHz sample rate, raw WebSocket.
Real-time local transcription with a streaming Zipformer model. No API key, no network.
Local buffered transcription with OpenAI's Whisper model via a native addon. No API key, no network.
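Streaming providers generally consume raw PCM in small fixed-duration chunks, so the sample rate determines how many bytes each chunk carries. The sketch below works through that arithmetic for 16 kHz and 24 kHz audio. It assumes 16-bit mono PCM and a 100 ms chunk duration, neither of which is stated on this page, and the helper names chunk_size_bytes and split_pcm are hypothetical illustrations, not part of decibri.

```python
# Hypothetical helpers for illustration only; none of these names come from decibri.
# Assumption: audio is interleaved 16-bit PCM (2 bytes per sample).


def chunk_size_bytes(sample_rate_hz: int, frame_ms: int,
                     channels: int = 1, bytes_per_sample: int = 2) -> int:
    """Return the byte length of one PCM chunk covering frame_ms milliseconds."""
    samples_per_channel = sample_rate_hz * frame_ms // 1000
    return samples_per_channel * channels * bytes_per_sample


def split_pcm(pcm: bytes, chunk_bytes: int) -> list[bytes]:
    """Split a PCM buffer into fixed-size chunks; the last one may be shorter."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    return [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]


# 100 ms of 16-bit mono audio at two common streaming rates.
size_16k = chunk_size_bytes(16_000, 100)  # 1600 samples * 2 bytes
size_24k = chunk_size_bytes(24_000, 100)  # 2400 samples * 2 bytes

# One second of silence at 24 kHz, split into 100 ms chunks.
one_second_24k = bytes(24_000 * 2)
chunks = split_pcm(one_second_24k, size_24k)

print(size_16k, size_24k, len(chunks))
```

A provider that expects 24 kHz input therefore receives chunks 1.5 times larger than a 16 kHz provider for the same duration, which matters when sizing buffers or WebSocket messages.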
Decibri is exposed through four runtime bindings that share the same audio backend and chunk semantics.
Sync and async Microphone and Speaker, voice activity detection, asyncio support, and an exception hierarchy. Requires Python 3.10 or later.
Rust native addon via napi-rs. Provides a Readable stream and standard Node.js events.
AudioWorklet implementation via conditional exports. Same npm package, same API shape, async permission flow.
Standalone, statically linked binary: decibri-cli. Capture WAV, play WAV, enumerate audio devices.
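To make the "capture WAV" idea concrete, here is a minimal sketch of wrapping a sequence of raw PCM chunks in a WAV container using only Python's standard wave module. It is not how decibri-cli is implemented and does not call any decibri API. The function name pcm_chunks_to_wav is hypothetical, and the 16 kHz, mono, 16-bit format is an assumption chosen for the example.

```python
import io
import wave

# Hypothetical helper for illustration; not part of decibri or decibri-cli.
# Assumption: chunks are 16-bit mono PCM at 16 kHz.


def pcm_chunks_to_wav(chunks, sample_rate_hz: int = 16_000,
                      channels: int = 1, bytes_per_sample: int = 2) -> bytes:
    """Wrap an iterable of raw PCM chunks in a WAV container and return the bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(bytes_per_sample)
        wav.setframerate(sample_rate_hz)
        for chunk in chunks:
            wav.writeframes(chunk)  # append PCM frames; header sizes are fixed up on close
    return buffer.getvalue()


# Stand-in for captured audio: ten 100 ms chunks of silence (3200 bytes each).
silent_chunks = [bytes(3200) for _ in range(10)]
wav_bytes = pcm_chunks_to_wav(silent_chunks)

# Read the container back to confirm the format fields round-trip.
with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
    frame_rate = wav.getframerate()
    channel_count = wav.getnchannels()
    frame_count = wav.getnframes()

print(frame_rate, channel_count, frame_count)
```

In a real capture loop the chunks would come from a microphone source rather than a list of silent buffers; the container logic stays the same.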