Decibri ships a native Python package with synchronous Microphone and Speaker classes plus matching AsyncMicrophone and AsyncSpeaker classes for asyncio. Written in Rust (via PyO3 / abi3) with pre-built wheels for Python 3.10 and newer. For installation and first capture, see Getting started.
Three ways to get audio into and out of Python, in order of increasing control.
Recommended with uv:
Or with pip:
For NumPy ndarray support (see Audio format):
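The commands implied by the lines above, as a sketch (assuming the conventional `uv add` / `pip install` invocations; the `numpy` extra name comes from the Audio format section):

```shell
# Recommended: uv
uv add decibri

# Or plain pip
pip install decibri

# With NumPy ndarray support
pip install "decibri[numpy]"
```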
```python
import decibri

decibri.record_to_file("output.wav", duration_seconds=10)
```
Captures 10 seconds of microphone audio to a 16-bit PCM WAV file at 16 kHz mono. No async, no streaming, no setup.
`with` block:

```python
import decibri

with decibri.Microphone(sample_rate=16000) as mic:
    for chunk in mic:
        print(f"Got {len(chunk)} bytes")
        break
```
Open the system microphone, iterate raw 16-bit PCM chunks, break after the first. Replace break with your processing pipeline.
```python
import asyncio

import decibri

async def main():
    async with await decibri.AsyncMicrophone.open(sample_rate=16000) as mic:
        async for chunk in mic:
            print(f"Got {len(chunk)} bytes")
            break

asyncio.run(main())
```
Same loop, but on the event loop. Use this in voice agents or websocket pipelines.
| Python versions | Platforms |
|---|---|
| 3.10, 3.11, 3.12, 3.13, 3.14 | Linux x64, Linux ARM64, macOS Apple Silicon, Windows x64 |
Pre-built wheels are published for every supported platform. pip install decibri fetches a binary wheel; no Rust toolchain, no C compiler, and no system audio headers are required at install time.
Wheels are built against the CPython stable ABI (abi3) with a 3.10 floor, so a single wheel per platform serves every supported interpreter version. New CPython releases work without a new decibri release as long as the stable ABI is preserved.
Primary capture surface for synchronous code. Construct an instance, enter a with block (or call start() manually), then iterate or call read() for chunks.
```python
decibri.Microphone(
    sample_rate=16000,
    channels=1,
    frames_per_buffer=1600,
    dtype="int16",
    device=None,
    vad=False,
    vad_threshold=None,
    vad_holdoff_ms=300,
    model_path=None,
    as_ndarray=False,
    ort_library_path=None,
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `sample_rate` | `int` | `16000` | Samples per second (1,000 to 384,000 Hz). 16,000 matches Silero VAD and most cloud STT providers; OpenAI Realtime requires 24,000. |
| `channels` | `int` | `1` | Number of input channels (1 to 32). |
| `frames_per_buffer` | `int` | `1600` | Frames per audio callback (64 to 65,536). 1,600 frames at 16 kHz is 100 ms chunks. |
| `dtype` | `"int16"` \| `"float32"` | `"int16"` | Sample encoding format. |
| `device` | `int` \| `str` \| `None` | system default | Device index from `Microphone.input_devices()` or case-insensitive name substring. |
| `vad` | `False` \| `"silero"` \| `"energy"` | `False` | Voice activity detector mode. See Voice activity detection. |
| `vad_threshold` | `float` \| `None` | mode default | Threshold in [0, 1]. Defaults to 0.5 in `"silero"` mode, 0.01 in `"energy"` mode. |
| `vad_holdoff_ms` | `int` | `300` | Milliseconds of sub-threshold audio before `is_speaking` flips back to `False`. |
| `model_path` | `str` \| `Path` \| `None` | bundled | Override path to a Silero VAD ONNX model. Only used when `vad="silero"`; defaults to the model bundled with the wheel. |
| `as_ndarray` | `bool` | `False` | When `True`, `read()` returns a `numpy.ndarray` instead of `bytes`. Requires `pip install decibri[numpy]`. |
| `ort_library_path` | `str` \| `Path` \| `None` | resolver | Override path to the ONNX Runtime dynamic library. Only used when `vad="silero"`. See ONNX Runtime resolution for the four-arm priority order. |
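To sanity-check the defaults above in plain Python: 1,600 frames at 16 kHz is 100 ms of audio, and with one `int16` channel each chunk is 3,200 bytes.

```python
sample_rate = 16000
frames_per_buffer = 1600
channels = 1
bytes_per_sample = 2  # dtype="int16" is two bytes per sample

chunk_ms = frames_per_buffer / sample_rate * 1000
chunk_bytes = frames_per_buffer * channels * bytes_per_sample
print(chunk_ms, chunk_bytes)  # 100.0 3200
```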
The canonical Python pattern. Entering the with block opens the stream and starts capture; exiting stops the stream and resets VAD state, even if an exception propagates out.
```python
import decibri

with decibri.Microphone(sample_rate=16000, channels=1, frames_per_buffer=1600) as mic:
    for chunk in mic:
        process(chunk)
        if done():
            break
```
Calling start() manually is also supported when the context manager does not fit. Pair it with stop() in a try / finally.
`mic.start()`: Open and start the capture stream. Calling `start()` after `stop()` or `close()` is supported and reconstructs the stream cleanly; VAD state resets on each new `start()`. Calling `start()` on an already-running instance raises `AlreadyRunning`.

`mic.stop()`: Stop the capture stream and reset VAD state and the sequence counter. Idempotent; safe to call multiple times.

`mic.close()`: Alias for `stop()`. Provided for ergonomic parity with the asyncio / aiohttp / httpx convention. The two methods are currently equivalent and are intended to remain interchangeable.

`mic.read(timeout_ms=None)`: Read one chunk. Returns the chunk, or `None` if the stream closed. The return type is `bytes` by default, or `numpy.ndarray` when the Microphone was constructed with `as_ndarray=True`. Advances VAD state as a side effect when VAD is enabled.

`mic.read_with_metadata(timeout_ms=None)`: Read one chunk and return it as a frozen `Chunk` with `.data`, `.timestamp`, `.sequence`, `.is_speaking`, and `.vad_score` attributes. Returns `None` on clean stream close. See Value types.

`mic.iter_with_metadata()`: Generator yielding `Chunk` objects until the stream closes cleanly. Use this in place of `for chunk in mic` when you want metadata alongside the audio data.
```python
with decibri.Microphone(vad="silero") as mic:
    for chunk in mic.iter_with_metadata():
        if chunk.is_speaking:
            send_to_stt(chunk.data)
```
`iter(mic)` and `next(mic)`: The Microphone is itself an iterator. `for chunk in mic:` yields the raw data shape (`bytes` or `numpy.ndarray`) and raises `StopIteration` when the stream closes.

`mic.is_open`: `bool` (read-only). Returns `True` while the capture stream is currently running.

`mic.is_speaking`: `bool` (read-only). Returns `True` while VAD considers the user to be speaking, including the holdoff grace period. Always `False` when `vad=False`. Holdoff expiry is checked on every property access, so consumers who pause iteration still observe correct state when they next read.

`mic.vad_score`: `float` in [0, 1], mode-agnostic. In `vad="silero"` mode this is the raw Silero probability for the most recent chunk; in `vad="energy"` mode it is the normalised RMS energy. Always 0.0 when `vad=False`.

`Microphone.input_devices()`: Returns a list of `DeviceInfo` objects describing every input device recognised by the operating system.

```python
for d in decibri.Microphone.input_devices():
    print(d.index, d.name, d.default_sample_rate)
```

`Microphone.version()`: Returns a `VersionInfo` object with the Rust core version, the audio backend version, and the binding wheel version.
All Microphone instances support repr() for debugging; the output includes sample rate, channels, dtype, frames per buffer, device, VAD mode, and open state.
Audio output surface. Construct, enter a with block, write samples, and drain.
```python
decibri.Speaker(
    sample_rate=16000,
    channels=1,
    dtype="int16",
    device=None,
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `sample_rate` | `int` | `16000` | Output sample rate in Hz (1,000 to 384,000). Use 24,000 for OpenAI Realtime playback. |
| `channels` | `int` | `1` | Number of output channels (1 to 32). Multi-channel samples are interleaved on the wire. |
| `dtype` | `"int16"` \| `"float32"` | `"int16"` | Sample dtype. Must match the data passed to `write()`; mismatch raises `TypeError`. |
| `device` | `int` \| `str` \| `None` | system default | Device index from `Speaker.output_devices()` or case-insensitive name substring. |
```python
import decibri

with decibri.Speaker(sample_rate=16000, channels=1) as spk:
    spk.write(audio_bytes)
    spk.drain()
```
`spk.start()`: Open and start the output stream. Re-entry after `stop()` or `close()` is supported.

`spk.stop()`: Stop the output stream.

`spk.close()`: Alias for `stop()`. See `Microphone.close()` for the equivalence note.

`spk.write(samples)`: Write a chunk to the output stream. Accepts `bytes` or a `numpy.ndarray` with dtype matching the configured `dtype`. Multi-channel ndarrays use shape `(N, channels)`. Output streams duck-type the input on each call rather than committing at construction time; mixing bytes and ndarrays across calls is supported. Raises `TypeError` on dtype mismatch or unsupported input.

`spk.drain()`: Block until all queued samples have been played. Useful at the end of a playback sequence so the program does not exit before the speaker buffer empties.

`spk.is_playing`: `bool` (read-only). Returns `True` while the output stream is currently running.

`Speaker.output_devices()`: Returns a list of `OutputDeviceInfo` objects describing every output device recognised by the operating system.
Reach for AsyncMicrophone when capture lives on an event loop: a voice agent that streams chunks to a websocket, a coroutine-based pipeline, or anywhere a sibling task may need to cancel the read in flight. The async classes serialise concurrent calls via a Rust-side Tokio mutex, so sibling-task cancellation is safe in a way that the sync Microphone does not provide.
Parameters match Microphone exactly. See the Microphone section for the full parameter table.
```python
mic = decibri.AsyncMicrophone(sample_rate=16000, vad="silero")
```
AsyncMicrophone.version() is synchronous (no await) because it returns compile-time constants. Every other method on the class is a coroutine. Double-awaiting version() raises a confusing TypeError; call it without await.
`await AsyncMicrophone.open(...)`: Async factory classmethod. The synchronous constructor blocks for roughly 100 to 500 milliseconds when `vad="silero"` because it loads the Silero ONNX model inline. `open()` dispatches that load to `loop.run_in_executor(None, ...)` so the event loop keeps spinning while ORT initialises.

```python
mic = await decibri.AsyncMicrophone.open(vad="silero")
async with mic:
    async for chunk in mic:
        await process(chunk)
```

The idiomatic shape is `async with await AsyncMicrophone.open(...) as mic:`. The `open()` factory is itself a coroutine, so it must be awaited before `async with` takes the resulting instance.
`async with` and `async for`: The async context manager opens the stream on entry and stops it on exit. Iteration via `async for` yields the same data shape as the sync Microphone (`bytes` by default, `numpy.ndarray` when `as_ndarray=True`).

```python
async with await decibri.AsyncMicrophone.open(sample_rate=16000) as mic:
    async for chunk in mic:
        await websocket.send(chunk)
```
`await mic.start()`: Open and start the capture stream. Re-entry after `stop()` or `close()` is supported and resets VAD state.

`await mic.stop()`: Stop the capture stream and reset VAD state and the sequence counter.

`await mic.close()`: Alias for `stop()`. See `Microphone.close()` for the equivalence note.

`await mic.read(timeout_ms=None)`: Read one chunk. Returns the chunk, or `None` if the stream closed. Same return-type rules as the sync `read()`.

`await mic.read_with_metadata(timeout_ms=None)`: Async parallel of `Microphone.read_with_metadata()`. Returns a frozen `Chunk` with metadata, or `None` on clean close.

`mic.aiter_with_metadata()`: Async-generator function yielding `Chunk` objects until the stream closes cleanly. Stops when the bridge returns `None`.

```python
async with await decibri.AsyncMicrophone.open(vad="silero") as mic:
    async for chunk in mic.aiter_with_metadata():
        if chunk.is_speaking:
            await stt.send(chunk.data)
```
`aiter_with_metadata()` is an async-generator function. The correct iteration pattern is `async for chunk in mic.aiter_with_metadata():`; do not await the call itself.
Properties on AsyncMicrophone are synchronous attribute access (no await). They are backed by lock-free atomic mirrors on the Rust bridge, so they report current truth even when the Rust side closes the stream itself (for example, device disconnect).
`mic.is_open`: `bool` (read-only). `True` while the capture stream is running.

`mic.is_speaking`: `bool` (read-only). Same semantics as the sync property: above-threshold detection plus holdoff. Always `False` when `vad=False`.

`mic.vad_score`: `float` in [0, 1]. Same semantics as the sync property.

`await AsyncMicrophone.input_devices()`: Async parallel of `Microphone.input_devices()`. Returns a list of `DeviceInfo`.

`AsyncMicrophone.version()`: Synchronous (see the callout above). Returns a `VersionInfo` for the Rust core, the audio backend, and the binding wheel.
Cancelling an awaited AsyncMicrophone call (via asyncio.CancelledError, asyncio.wait_for, or explicit task.cancel()) raises CancelledError immediately on the Python side. The Rust-side spawn_blocking thread completes on its own schedule and its result is dropped. The bridge state stays consistent for subsequent reads, so sibling-task cancellation while a read() is in flight is safe.
Asyncio mirror of Speaker. Same parameter set, same lifecycle, all methods coroutines.
Parameters match Speaker exactly. See the Speaker section for the parameter table.
```python
spk = decibri.AsyncSpeaker(sample_rate=24000, channels=1)
```

`await AsyncSpeaker.open(...)`: Async factory classmethod, symmetric with `AsyncMicrophone.open`. Speaker does not load ORT, so the event-loop blocking risk is smaller, but the factory is provided for API parity.

Unlike `AsyncMicrophone`, the synchronous `AsyncSpeaker(...)` constructor does no heavy work, so calling it directly inside an async function is fine. `open()` is provided for symmetry; reach for it if you prefer the consistent factory pattern across both classes.

`async with` usage:

```python
async with decibri.AsyncSpeaker(sample_rate=24000) as spk:
    await spk.write(audio_bytes)
    await spk.drain()
```
`await spk.start()`: Open and start the output stream.

`await spk.stop()`: Stop the output stream.

`await spk.close()`: Alias for `stop()`. See `Microphone.close()` for the equivalence note.

`await spk.write(samples)`: Async parallel of `Speaker.write`. Accepts `bytes` or a `numpy.ndarray` with matching dtype.

`await spk.drain()`: Block until all queued samples have been played. Cancelling this await raises `CancelledError` immediately, but the audio continues to play until the output buffer empties on the callback's own schedule. For production code, complete drains before initiating new writes.

`spk.is_playing`: `bool` (read-only). Synchronous property backed by a lock-free atomic mirror on the bridge.

`await AsyncSpeaker.output_devices()`: Async parallel of `Speaker.output_devices()`.
Convenience entry points exposed directly on the decibri module.
`decibri.input_devices()`: Module-level shortcut for `Microphone.input_devices()`. Returns a list of `DeviceInfo`.

```python
for d in decibri.input_devices():
    print(d.index, d.name)
```

`decibri.output_devices()`: Module-level shortcut for `Speaker.output_devices()`. Returns a list of `OutputDeviceInfo`.

`decibri.version()`: Returns a `VersionInfo` for the Rust core, the audio backend, and the binding wheel.

```python
v = decibri.version()
print(v.decibri, v.audio_backend, v.binding)
```

`decibri.record_to_file(path, duration_seconds, sample_rate=16000, channels=1, device=None)`: Synchronous one-shot recorder. Captures `duration_seconds` of microphone audio to a 16-bit PCM WAV file. Wraps `Microphone` plus the standard library `wave` module. Frame-count termination guarantees an accurate duration even on platforms where the buffer hint is ignored by the OS audio subsystem.

```python
decibri.record_to_file("clip.wav", duration_seconds=5.0)
```

`await decibri.async_record_to_file(path, duration_seconds, sample_rate=16000, channels=1, device=None)`: Async parallel of `record_to_file`. Same parameters, same semantics; await it from an asyncio context.

```python
await decibri.async_record_to_file("clip.wav", duration_seconds=5.0)
```
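Since `record_to_file` wraps the standard library `wave` module, the container it writes can be reproduced and verified with `wave` alone. A sketch that writes one second of silence in the same format (16-bit PCM, 16 kHz, mono) instead of captured audio:

```python
import wave

sample_rate = 16000
duration_seconds = 1.0
n_frames = int(sample_rate * duration_seconds)

with wave.open("silence.wav", "wb") as wf:
    wf.setnchannels(1)           # mono
    wf.setsampwidth(2)           # 16-bit PCM: two bytes per sample
    wf.setframerate(sample_rate)
    wf.writeframes(b"\x00\x00" * n_frames)  # silence

# Read the header back to confirm the duration.
with wave.open("silence.wav", "rb") as wf:
    print(wf.getnframes() / wf.getframerate())  # 1.0
```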
Small typed return shapes used across the API.
`Chunk`: Frozen dataclass returned by `read_with_metadata()` and `iter_with_metadata()` on both `Microphone` and `AsyncMicrophone`.
| Property | Type | Description |
|---|---|---|
| `data` | `bytes` or `numpy.ndarray` | Audio chunk. Shape matches the `as_ndarray` constructor flag. |
| `timestamp` | `float` | `time.monotonic()` snapshot at the chunk boundary, in seconds. Useful for relative timing within a session. |
| `sequence` | `int` | Per-session chunk counter starting at 0. Resets on each new `start()`. |
| `is_speaking` | `bool` | VAD state snapshot at the chunk boundary. Always `False` when VAD is disabled. |
| `vad_score` | `float` | VAD score snapshot in [0, 1]. Always 0.0 when VAD is disabled. |
`DeviceInfo`: Returned by `Microphone.input_devices()`, `decibri.input_devices()`, and `await AsyncMicrophone.input_devices()`.
| Property | Type | Description |
|---|---|---|
| `index` | `int` | Device index, usable as the `device` constructor argument. |
| `name` | `str` | Human-readable device name reported by the operating system. |
| `id` | `str` | Stable platform-specific device identifier. |
| `max_input_channels` | `int` | Maximum number of input channels the device supports. |
| `default_sample_rate` | `int` | The device's native or preferred sample rate in Hz. |
| `is_default` | `bool` | Whether this is the current system default input device. |
`OutputDeviceInfo`: Returned by `Speaker.output_devices()`, `decibri.output_devices()`, and `await AsyncSpeaker.output_devices()`.
| Property | Type | Description |
|---|---|---|
| `index` | `int` | Device index, usable as the `device` constructor argument. |
| `name` | `str` | Human-readable device name reported by the operating system. |
| `id` | `str` | Stable platform-specific device identifier. |
| `max_output_channels` | `int` | Maximum number of output channels the device supports. |
| `default_sample_rate` | `int` | The device's native or preferred sample rate in Hz. |
| `is_default` | `bool` | Whether this is the current system default output device. |
`VersionInfo`: Returned by `Microphone.version()`, `AsyncMicrophone.version()`, and `decibri.version()`.
| Property | Type | Description |
|---|---|---|
| `decibri` | `str` | Semver of the underlying Rust core. |
| `audio_backend` | `str` | Audio backend name and version (for example, `"cpal 0.17"`). |
| `binding` | `str` | Semver of the Python binding wheel. |
Decibri ships two VAD modes plus a disabled default. All three are selected with the vad constructor parameter on Microphone and AsyncMicrophone.
| Mode | Description | Default threshold | Threshold range |
|---|---|---|---|
| `False` | VAD disabled. `is_speaking` always `False`, `vad_score` always 0.0. | n/a | n/a |
| `"energy"` | Lightweight RMS-energy threshold computed in pure Python over each chunk. | 0.01 | 0.0 to 1.0 |
| `"silero"` | ML-based detector using the bundled Silero ONNX model, run through ONNX Runtime. | 0.5 | 0.0 to 1.0 |
No separate `onnxruntime` package or system dependency is required for `vad="silero"`; the Silero model and the ORT dylib are bundled with the wheel.
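For intuition, the normalised RMS that `"energy"` mode thresholds against can be approximated in a few lines of plain Python (a simplification for illustration, not the library's exact code):

```python
import math
import struct

def energy_score(chunk: bytes) -> float:
    """Approximate normalised RMS of little-endian int16 PCM, in [0, 1]."""
    n = len(chunk) // 2
    samples = struct.unpack(f"<{n}h", chunk)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return rms / 32768.0

silence = b"\x00\x00" * 160
loud = struct.pack("<4h", 32767, -32768, 32767, -32768)
print(energy_score(silence) < 0.01 < energy_score(loud))  # True
```

With the default threshold of 0.01, silence scores well below the bar while a full-scale square wave scores near 1.0.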
`is_speaking` state machine: Decibri runs a pure-Python state machine on top of the raw VAD probability. Above-threshold chunks set the speaking state and cancel any pending silence timer. Below-threshold chunks while already speaking start a silence timer; the timer expires after `vad_holdoff_ms` of elapsed real time, at which point `is_speaking` flips back to `False`. Timer expiry is checked on every property access via `time.monotonic()`, so consumers who pause iteration still observe correct state on the next read.
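The debounce described above can be sketched in plain Python (a hypothetical simplification; the library's actual state machine is internal to the wheel):

```python
import time

class HoldoffVad:
    """Debounce raw VAD scores: speaking turns off only after
    holdoff_ms of continuous sub-threshold audio."""

    def __init__(self, threshold=0.5, holdoff_ms=300):
        self.threshold = threshold
        self.holdoff_s = holdoff_ms / 1000
        self._speaking = False
        self._silence_since = None  # monotonic timestamp, or None

    def update(self, score):
        if score >= self.threshold:
            self._speaking = True
            self._silence_since = None  # cancel any pending silence timer
        elif self._speaking and self._silence_since is None:
            self._silence_since = time.monotonic()  # start the silence timer

    @property
    def is_speaking(self):
        # Expiry is checked on every access, mirroring the documented behaviour.
        if (self._speaking and self._silence_since is not None
                and time.monotonic() - self._silence_since >= self.holdoff_s):
            self._speaking = False
            self._silence_since = None
        return self._speaking

vad = HoldoffVad(holdoff_ms=50)
vad.update(0.9)
print(vad.is_speaking)  # True
vad.update(0.1)         # below threshold: silence timer starts
time.sleep(0.06)
print(vad.is_speaking)  # False (holdoff expired)
```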
`is_speaking` vs `vad_score`: `is_speaking` is the debounced state-machine output: above the threshold plus the holdoff grace. `vad_score` is the raw per-chunk view, identical to the Silero probability in Silero mode and the normalised RMS in energy mode. Use `is_speaking` for gating downstream work; use `vad_score` when you need the underlying signal (for example, to threshold differently per chunk or to log probability distributions).

`model_path`: The bundled Silero model is the published Silero v5 checkpoint. To use a different Silero ONNX variant, pass an absolute path as the `model_path` constructor parameter. The path is only consulted when `vad="silero"`; energy mode and `vad=False` ignore it.

```python
mic = decibri.Microphone(vad="silero", model_path="/opt/models/silero_vad_v4.onnx")
```
Decibri supports two on-the-wire sample formats, selected with the dtype constructor parameter. Both apply to Microphone, Speaker, and their async variants.
`int16` (default): Each sample is one 16-bit signed little-endian integer, two bytes per sample. Range: -32,768 to 32,767. This is the format expected by most cloud STT providers and the wire format used by the `record_to_file` helpers.

```python
with decibri.Microphone(dtype="int16") as mic:
    chunk = mic.read()  # bytes; len(chunk) == frames * channels * 2
```

`float32`: Each sample is one 32-bit IEEE 754 little-endian float, four bytes per sample. Range: approximately -1.0 to 1.0. Use this when your downstream pipeline expects normalised floats and you would otherwise convert from int16.

```python
with decibri.Microphone(dtype="float32") as mic:
    chunk = mic.read()  # bytes; len(chunk) == frames * channels * 4
```
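Converting between the two wire formats downstream is a scale by 32,768; a stdlib-only sketch using `struct`:

```python
import struct

# Three int16 samples as they arrive on the wire (little-endian).
raw = struct.pack("<3h", 0, 16384, -32768)

# Normalise to floats in [-1.0, 1.0), the float32 convention.
floats = [s / 32768.0 for s in struct.unpack("<3h", raw)]
print(floats)  # [0.0, 0.5, -1.0]
```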
Some platforms ignore the `frames_per_buffer` hint and deliver chunks sized to the OS device period instead. Frame-count loops still observe an accurate total duration (see `record_to_file`); chunk-count loops can record more or less than requested. Prefer frame-count termination when an exact duration matters.
Set as_ndarray=True on the Microphone constructor to receive numpy.ndarray instead of bytes from read(). The array's dtype matches the configured dtype (np.int16 or np.float32); the shape is 1-D (N,) for mono and 2-D (N, channels) for multi-channel (interleaved).
```python
import decibri
import numpy as np

with decibri.Microphone(sample_rate=16000, dtype="float32", as_ndarray=True) as mic:
    chunk = mic.read()
    assert isinstance(chunk, np.ndarray)
    assert chunk.dtype == np.float32
```
Speaker.write() duck-types on each call: pass bytes or pass an ndarray with matching dtype. Mixing both within a single output session is supported.
The NumPy extra is opt-in to keep the default install lightweight; install it with `pip install decibri[numpy]`.
as_ndarray=True requires the NumPy extra. Reading from a Microphone constructed with as_ndarray=True on an install without the extra raises ImportError with the message numpy is not installed. Install with: pip install decibri[numpy].
`py.typed`: The decibri wheel ships a `py.typed` marker file per PEP 561, with hand-written `.pyi` stubs covering the internal Rust extension module and the full exception hierarchy. The package is mypy strict-clean. IDEs and type checkers will autocomplete every public name and narrow return types correctly.

```python
import decibri
import numpy as np

mic = decibri.Microphone(as_ndarray=True)
chunk = mic.read()
# Type checker narrows `chunk` to numpy.ndarray | None when as_ndarray=True
# and to bytes | None otherwise.
```
The package depends on typing-extensions at runtime to support the 3.10 abi3 floor (typing.Self is only available in 3.11 and newer).
Decibri raises typed exceptions instead of generic RuntimeError or Exception. The root of the hierarchy is DecibriError. Three intermediate parents (DeviceError, OrtError, OrtPathError) group related instance classes so callers can catch by category instead of by individual class. Every exception remains catchable as DecibriError.
| Exception class | Parent | Common cause |
|---|---|---|
| `SampleRateOutOfRange` | `DecibriError` | Constructor `sample_rate` outside the supported range. |
| `ChannelsOutOfRange` | `DecibriError` | Constructor `channels` outside the supported range. |
| `FramesPerBufferOutOfRange` | `DecibriError` | Constructor `frames_per_buffer` outside the supported range. |
| `InvalidFormat` | `DecibriError` | Constructor `dtype` not `"int16"` or `"float32"`. |
| `AlreadyRunning` | `DecibriError` | `start()` called on an instance that is already capturing. |
| `StreamOpenFailed` | `DecibriError` | The audio stream failed to open. |
| `StreamStartFailed` | `DecibriError` | The audio stream opened but failed to start. |
| `PermissionDenied` | `DecibriError` | The operating system denied microphone access. Message includes platform-specific guidance. |
| `CaptureStreamClosed` | `DecibriError` | Read attempted on a closed capture stream (often a mid-stream device disconnect). |
| `OutputStreamClosed` | `DecibriError` | Write attempted on a closed output stream. |
| `VadSampleRateUnsupported` | `DecibriError` | VAD enabled with a sample rate the VAD model cannot accept. |
| `VadThresholdOutOfRange` | `DecibriError` | Constructor `vad_threshold` outside [0, 1]. |
| `ForkAfterOrtInit` | `DecibriError` | Linux only. The current process inherited an ORT session from its parent across `fork()`. See Multiprocessing and asyncio caveats. |
| `DeviceNotFound` | `DeviceError` | The named input device does not match any device on the system. |
| `OutputDeviceNotFound` | `DeviceError` | The named output device does not match any device on the system. |
| `MultipleDevicesMatch` | `DeviceError` | The device name substring matches more than one device; use a more specific substring or the integer index. |
| `DeviceIndexOutOfRange` | `DeviceError` | The integer device index is out of range for the host audio API. |
| `NoMicrophoneFound` | `DeviceError` | The system reports zero input devices. |
| `NoOutputDeviceFound` | `DeviceError` | The system reports zero output devices. |
| `NotAnInputDevice` | `DeviceError` | The matched device exists but is not capable of input. |
| `DeviceEnumerationFailed` | `DeviceError` | The audio backend failed to enumerate devices. |
| `OrtInitFailed` | `OrtError` | ONNX Runtime initialisation itself failed (no specific path was supplied). |
| `OrtSessionBuildFailed` | `OrtError` | Building an ORT inference session failed. |
| `OrtThreadsConfigFailed` | `OrtError` | Configuring ORT thread pools failed. |
| `VadModelLoadFailed` | `OrtError` | Loading the Silero VAD ONNX model failed. Has a `.path` attribute. |
| `OrtInferenceFailed` | `OrtError` | ORT inference produced an error at runtime. |
| `OrtTensorCreateFailed` | `OrtError` | Creating an ORT input tensor failed. |
| `OrtTensorExtractFailed` | `OrtError` | Extracting values from an ORT output tensor failed. |
| `OrtLoadFailed` | `OrtPathError` | The supplied ORT dylib path passed the filesystem pre-check but ORT rejected it. Has a `.path` attribute. |
| `OrtPathInvalid` | `OrtPathError` | The supplied ORT dylib path failed the pre-check before ORT saw it. Has `.path` and `.reason` attributes. |
Three classes carry additional attributes beyond the standard exception message:
- `VadModelLoadFailed.path`: the model path that failed to load.
- `OrtLoadFailed.path`: the ORT dylib path that failed to load.
- `OrtPathInvalid.path`, `OrtPathInvalid.reason`: the rejected path and a short reason string.

Catch any decibri error:

```python
try:
    with decibri.Microphone(sample_rate=16000) as mic:
        chunk = mic.read()
except decibri.DecibriError as e:
    print(f"Decibri error: {e}")
```
Catch device-selection failures specifically, then fall back to the system default:
```python
try:
    mic = decibri.Microphone(device="USB Audio")
except decibri.DeviceError as e:
    print(f"Device problem: {e}")
    mic = decibri.Microphone()
```
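The device-selection rules that produce these errors (integer index, otherwise case-insensitive name substring, ambiguous matches rejected) can be sketched in plain Python; the function and the stand-in exceptions below are hypothetical, mapped to decibri's classes only in the comments:

```python
def match_device(device_names, query):
    """Sketch of the documented selection rule:
    int -> index lookup; str -> case-insensitive substring match."""
    if isinstance(query, int):
        if not 0 <= query < len(device_names):
            raise IndexError("device index out of range")  # ~ DeviceIndexOutOfRange
        return device_names[query]
    hits = [name for name in device_names if query.lower() in name.lower()]
    if not hits:
        raise LookupError("no device matches")             # ~ DeviceNotFound
    if len(hits) > 1:
        raise LookupError("ambiguous device substring")    # ~ MultipleDevicesMatch
    return hits[0]

names = ["MacBook Pro Microphone", "USB Audio Device", "USB Audio CODEC"]
print(match_device(names, "macbook"))  # MacBook Pro Microphone
```

Note that "USB Audio" would match two of the names above, which is exactly the ambiguity `MultipleDevicesMatch` guards against.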
Python's default fork start method on Linux duplicates the parent's memory into the child, but ONNX Runtime's internal state is not safe to share across forked processes. A Silero-enabled Microphone initialised in the parent and then used in a forked child either produces incorrect inference results or segfaults; decibri detects the PID mismatch at the start of every Silero inference call and raises ForkAfterOrtInit instead.
The fix is to either set the spawn start method before constructing any worker, or to construct the Microphone inside each child process after the fork.
```python
import multiprocessing

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn")
    # ... rest of program
```
When combining `vad="silero"` with multiprocessing, call `multiprocessing.set_start_method("spawn")` before spawning workers. The default fork start method shares ORT state across processes unsafely; decibri detects the mismatch and raises `ForkAfterOrtInit`. macOS already defaults to spawn; Windows always uses spawn. The setting only matters on Linux.
The synchronous AsyncMicrophone(...) constructor blocks for roughly 100 to 500 milliseconds when vad="silero" because it loads the Silero ONNX model inline. In an async context this blocks the event loop for the duration of the load, which can cause dropped websocket frames, late timer callbacks, and UI jitter in voice agents. Use await AsyncMicrophone.open(...) instead; the factory dispatches the synchronous construction to loop.run_in_executor(None, ...) so the event loop keeps running.
For more multiprocessing recipes, see the Python multiprocessing guide in the repository.
When vad="silero" is requested, decibri needs to load the ONNX Runtime dynamic library. The path is resolved in this order, first match wins:
1. The `ort_library_path` constructor argument, if supplied.
2. The `DECIBRI_ORT_DYLIB_PATH` environment variable, for per-deployment overrides without code changes.
3. The `ORT_DYLIB_PATH` environment variable, the upstream ort crate's standard convention, respected so existing bare-ort deployments keep working.
4. The dylib bundled inside the wheel at `decibri/_ort/`. This is the default `pip install decibri` experience.

If none of the above resolve to a real file, ORT's default loader runs. That loader itself respects `ORT_DYLIB_PATH` if set after decibri import, so a late environment change still works as a last-resort fallback.
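The first-match-wins order can be sketched as a plain function (an illustrative sketch only; the helper name and the `bundled` stand-in are made up, while the two environment variable names come from the list above):

```python
import os
from pathlib import Path

def resolve_ort_dylib(ort_library_path=None, bundled=None, env=None):
    """Sketch of the four-arm, first-match-wins resolution order.
    `bundled` stands in for the dylib shipped at decibri/_ort/."""
    env = os.environ if env is None else env
    candidates = [
        ort_library_path,                   # 1. constructor argument
        env.get("DECIBRI_ORT_DYLIB_PATH"),  # 2. decibri-specific env var
        env.get("ORT_DYLIB_PATH"),          # 3. upstream ort crate convention
        bundled,                            # 4. dylib bundled in the wheel
    ]
    for candidate in candidates:
        if candidate and Path(candidate).is_file():
            return Path(candidate)
    return None  # fall through to ORT's own default loader

# A later arm wins only when every earlier arm is unset or points nowhere.
existing = os.__file__  # any file that certainly exists
print(resolve_ort_dylib(env={"ORT_DYLIB_PATH": existing}) == Path(existing))  # True
```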
The resolver runs only when `vad="silero"`. The other VAD modes (`"energy"`, `False`) never touch ORT, so the resolver and bundled-dylib lookup are skipped entirely.
The first Microphone constructed with vad="silero" incurs roughly 100 to 500 milliseconds of cold load on most platforms. The cost is amortised across the rest of the process: subsequent Microphones (sync or async) reuse the same loaded ORT.
The first Microphone that loads Silero determines the dylib for the whole process. Subsequent Microphone constructions inherit that initialisation regardless of their own ort_library_path argument. To switch dylibs, restart the process.