Decibri ships a native Python package with synchronous Microphone and Speaker classes plus matching AsyncMicrophone and AsyncSpeaker classes for asyncio. Written in Rust (via PyO3 / abi3) with pre-built wheels for Python 3.10 and newer. For installation and first capture, see Getting started.
Three ways to get audio into and out of Python, in order of increasing control.
Recommended with uv:
Or with pip:
For NumPy ndarray support (see Audio format):
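The commands implied by the lines above, as a sketch (assuming the conventional `uv add` / `pip install` invocations; the `numpy` extra name comes from the Audio format section):

```shell
# Recommended: uv
uv add decibri

# Or plain pip
pip install decibri

# With NumPy ndarray support
pip install "decibri[numpy]"
```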
```python
import decibri

decibri.record_to_file("output.wav", duration_seconds=10)
```
Captures 10 seconds of microphone audio to a 16-bit PCM WAV file at 16 kHz mono. No async, no streaming, no setup.
`with` block:

```python
import decibri

with decibri.Microphone(sample_rate=16000) as mic:
    for chunk in mic:
        print(f"Got {len(chunk)} bytes")
        break
```
Open the system microphone, iterate raw 16-bit PCM chunks, break after the first. Replace break with your processing pipeline.
```python
import asyncio

import decibri

async def main():
    async with await decibri.AsyncMicrophone.open(sample_rate=16000) as mic:
        async for chunk in mic:
            print(f"Got {len(chunk)} bytes")
            break

asyncio.run(main())
```
Same loop, but on the event loop. Use this in voice agents or websocket pipelines.
| Python versions | Platforms |
|---|---|
| 3.10, 3.11, 3.12, 3.13, 3.14 | Linux x64, Linux ARM64, macOS Apple Silicon, Windows x64 |
Pre-built wheels are published for every supported platform. pip install decibri fetches a binary wheel; no Rust toolchain, no C compiler, and no system audio headers are required at install time.
Wheels are built against the CPython stable ABI (abi3) with a 3.10 floor, so a single wheel per platform serves every supported interpreter version. New CPython releases work without a new decibri release as long as the stable ABI is preserved.
Primary capture surface for synchronous code. Construct an instance, enter a with block (or call start() manually), then iterate or call read() for chunks.
```python
decibri.Microphone(
    sample_rate=16000,
    channels=1,
    frames_per_buffer=1600,
    dtype="int16",
    device=None,
    vad=False,
    vad_threshold=None,
    vad_holdoff_ms=300,
    model_path=None,
    as_ndarray=False,
    ort_library_path=None,
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `sample_rate` | `int` | `16000` | Samples per second (1,000 to 384,000 Hz). 16,000 matches Silero VAD and most cloud STT providers; OpenAI Realtime requires 24,000. |
| `channels` | `int` | `1` | Number of input channels (1 to 32). |
| `frames_per_buffer` | `int` | `1600` | Frames per audio callback (64 to 65,536). 1,600 frames at 16 kHz is 100 ms chunks. |
| `dtype` | `"int16"` \| `"float32"` | `"int16"` | Sample encoding format. |
| `device` | `int` \| `str` \| `None` | system default | Device index from `Microphone.input_devices()` or case-insensitive name substring. |
| `vad` | `False` \| `"silero"` \| `"energy"` | `False` | Voice activity detector mode. See Voice activity detection. |
| `vad_threshold` | `float` \| `None` | mode default | Threshold in [0, 1]. Defaults to 0.5 in `"silero"` mode, 0.01 in `"energy"` mode. |
| `vad_holdoff_ms` | `int` | `300` | Milliseconds of sub-threshold audio before `is_speaking` flips back to `False`. |
| `model_path` | `str` \| `Path` \| `None` | bundled | Override path to a Silero VAD ONNX model. Only used when `vad="silero"`; defaults to the model bundled with the wheel. |
| `as_ndarray` | `bool` | `False` | When `True`, `read()` returns a `numpy.ndarray` instead of `bytes`. Requires `pip install decibri[numpy]`. |
| `ort_library_path` | `str` \| `Path` \| `None` | resolver | Override path to the ONNX Runtime dynamic library. Only used when `vad="silero"`. See ONNX Runtime resolution for the four-arm priority order. |
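To sanity-check the defaults above in plain Python: 1,600 frames at 16 kHz is 100 ms of audio, and with one `int16` channel each chunk is 3,200 bytes.

```python
sample_rate = 16000
frames_per_buffer = 1600
channels = 1
bytes_per_sample = 2  # dtype="int16" is two bytes per sample

chunk_ms = frames_per_buffer / sample_rate * 1000
chunk_bytes = frames_per_buffer * channels * bytes_per_sample
print(chunk_ms, chunk_bytes)  # 100.0 3200
```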
The canonical Python pattern. Entering the with block opens the stream and starts capture; exiting stops the stream and resets VAD state, even if an exception propagates out.
```python
import decibri

with decibri.Microphone(sample_rate=16000, channels=1, frames_per_buffer=1600) as mic:
    for chunk in mic:
        process(chunk)
        if done():
            break
```
Calling start() manually is also supported when the context manager does not fit. Pair it with stop() in a try / finally.
`mic.start()`: Open and start the capture stream. Calling `start()` after `stop()` or `close()` is supported and reconstructs the stream cleanly; VAD state resets on each new `start()`. Calling `start()` on an already-running instance raises `AlreadyRunning`.

`mic.stop()`: Stop the capture stream and reset VAD state and the sequence counter. Idempotent; safe to call multiple times.

`mic.close()`: Alias for `stop()`. Provided for ergonomic parity with the asyncio / aiohttp / httpx convention. The two methods are currently equivalent and are intended to remain interchangeable.

`mic.read(timeout_ms=None)`: Read one chunk. Returns the chunk, or `None` if the stream closed. The return type is `bytes` by default, or `numpy.ndarray` when the Microphone was constructed with `as_ndarray=True`. Advances VAD state as a side effect when VAD is enabled.

`mic.read_with_metadata(timeout_ms=None)`: Read one chunk and return it as a frozen `Chunk` with `.data`, `.timestamp`, `.sequence`, `.is_speaking`, and `.vad_score` attributes. Returns `None` on clean stream close. See Value types.

`mic.iter_with_metadata()`: Generator yielding `Chunk` objects until the stream closes cleanly. Use this in place of `for chunk in mic` when you want metadata alongside the audio data.
```python
with decibri.Microphone(vad="silero") as mic:
    for chunk in mic.iter_with_metadata():
        if chunk.is_speaking:
            send_to_stt(chunk.data)
```
`iter(mic)` and `next(mic)`: The Microphone is itself an iterator. `for chunk in mic:` yields the raw data shape (`bytes` or `numpy.ndarray`) and raises `StopIteration` when the stream closes.

`mic.is_open`: `bool` (read-only). Returns `True` while the capture stream is currently running.

`mic.is_speaking`: `bool` (read-only). Returns `True` while VAD considers the user to be speaking, including the holdoff grace period. Always `False` when `vad=False`. Holdoff expiry is checked on every property access, so consumers who pause iteration still observe correct state when they next read.

`mic.vad_score`: `float` in [0, 1], mode-agnostic. In `vad="silero"` mode this is the raw Silero probability for the most recent chunk; in `vad="energy"` mode it is the normalised RMS energy. Always 0.0 when `vad=False`.

`Microphone.input_devices()`: Returns a list of `DeviceInfo` objects describing every input device recognised by the operating system.

```python
for d in decibri.Microphone.input_devices():
    print(d.index, d.name, d.default_sample_rate)
```

`Microphone.version()`: Returns a `VersionInfo` object with the Rust core version, the audio backend version, and the binding wheel version.
All Microphone instances support repr() for debugging; the output includes sample rate, channels, dtype, frames per buffer, device, VAD mode, and open state.
Audio output surface. Construct, enter a with block, write samples, and drain.
```python
decibri.Speaker(
    sample_rate=16000,
    channels=1,
    dtype="int16",
    device=None,
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `sample_rate` | `int` | `16000` | Output sample rate in Hz (1,000 to 384,000). Use 24,000 for OpenAI Realtime playback. |
| `channels` | `int` | `1` | Number of output channels (1 to 32). Multi-channel samples are interleaved on the wire. |
| `dtype` | `"int16"` \| `"float32"` | `"int16"` | Sample dtype. Must match the data passed to `write()`; mismatch raises `TypeError`. |
| `device` | `int` \| `str` \| `None` | system default | Device index from `Speaker.output_devices()` or case-insensitive name substring. |
```python
import decibri

with decibri.Speaker(sample_rate=16000, channels=1) as spk:
    spk.write(audio_bytes)
    spk.drain()
```
`spk.start()`: Open and start the output stream. Re-entry after `stop()` or `close()` is supported.

`spk.stop()`: Stop the output stream.

`spk.close()`: Alias for `stop()`. See `Microphone.close()` for the equivalence note.

`spk.write(samples)`: Write a chunk to the output stream. Accepts `bytes` or a `numpy.ndarray` with dtype matching the configured `dtype`. Multi-channel ndarrays use shape `(N, channels)`. Output streams duck-type the input on each call rather than committing at construction time; mixing bytes and ndarrays across calls is supported. Raises `TypeError` on dtype mismatch or unsupported input.

`spk.drain()`: Block until all queued samples have been played. Useful at the end of a playback sequence so the program does not exit before the speaker buffer empties.

`spk.is_playing`: `bool` (read-only). Returns `True` while the output stream is currently running.

`Speaker.output_devices()`: Returns a list of `OutputDeviceInfo` objects describing every output device recognised by the operating system.
Reach for AsyncMicrophone when capture lives on an event loop: a voice agent that streams chunks to a websocket, a coroutine-based pipeline, or anywhere a sibling task may need to cancel the read in flight. The async classes serialise concurrent calls via a Rust-side Tokio mutex, so sibling-task cancellation is safe in a way that the sync Microphone does not provide.
Parameters match Microphone exactly. See the Microphone section for the full parameter table.
```python
mic = decibri.AsyncMicrophone(sample_rate=16000, vad="silero")
```
AsyncMicrophone.version() is synchronous (no await) because it returns compile-time constants. Every other method on the class is a coroutine. Double-awaiting version() raises a confusing TypeError; call it without await.
`await AsyncMicrophone.open(...)`: Async factory classmethod. The synchronous constructor blocks for roughly 100 to 500 milliseconds when `vad="silero"` because it loads the Silero ONNX model inline. `open()` dispatches that load to `loop.run_in_executor(None, ...)` so the event loop keeps spinning while ORT initialises.

```python
mic = await decibri.AsyncMicrophone.open(vad="silero")
async with mic:
    async for chunk in mic:
        await process(chunk)
```

The idiomatic shape is `async with await AsyncMicrophone.open(...) as mic:`. The `open()` factory is itself a coroutine, so it must be awaited before `async with` takes the resulting instance.
`async with` and `async for`: The async context manager opens the stream on entry and stops it on exit. Iteration via `async for` yields the same data shape as the sync Microphone (`bytes` by default, `numpy.ndarray` when `as_ndarray=True`).

```python
async with await decibri.AsyncMicrophone.open(sample_rate=16000) as mic:
    async for chunk in mic:
        await websocket.send(chunk)
```
`await mic.start()`: Open and start the capture stream. Re-entry after `stop()` or `close()` is supported and resets VAD state.

`await mic.stop()`: Stop the capture stream and reset VAD state and the sequence counter.

`await mic.close()`: Alias for `stop()`. See `Microphone.close()` for the equivalence note.

`await mic.read(timeout_ms=None)`: Read one chunk. Returns the chunk, or `None` if the stream closed. Same return-type rules as the sync `read()`.

`await mic.read_with_metadata(timeout_ms=None)`: Async parallel of `Microphone.read_with_metadata()`. Returns a frozen `Chunk` with metadata, or `None` on clean close.

`mic.aiter_with_metadata()`: Async-generator function yielding `Chunk` objects until the stream closes cleanly. Stops when the bridge returns `None`.

```python
async with await decibri.AsyncMicrophone.open(vad="silero") as mic:
    async for chunk in mic.aiter_with_metadata():
        if chunk.is_speaking:
            await stt.send(chunk.data)
```
`aiter_with_metadata()` is an async-generator function. The correct iteration pattern is `async for chunk in mic.aiter_with_metadata():`; do not await the call itself.
Properties on AsyncMicrophone are synchronous attribute access (no await). They are backed by lock-free atomic mirrors on the Rust bridge, so they report current truth even when the Rust side closes the stream itself (for example, device disconnect).
`mic.is_open`: `bool` (read-only). `True` while the capture stream is running.

`mic.is_speaking`: `bool` (read-only). Same semantics as the sync property: above-threshold detection plus holdoff. Always `False` when `vad=False`.

`mic.vad_score`: `float` in [0, 1]. Same semantics as the sync property.

`await AsyncMicrophone.input_devices()`: Async parallel of `Microphone.input_devices()`. Returns a list of `DeviceInfo`.

`AsyncMicrophone.version()`: Synchronous (see the callout above). Returns a `VersionInfo` for the Rust core, the audio backend, and the binding wheel.
Cancelling an awaited AsyncMicrophone call (via asyncio.CancelledError, asyncio.wait_for, or explicit task.cancel()) raises CancelledError immediately on the Python side. The Rust-side spawn_blocking thread completes on its own schedule and its result is dropped. The bridge state stays consistent for subsequent reads, so sibling-task cancellation while a read() is in flight is safe.
Asyncio mirror of Speaker. Same parameter set, same lifecycle, all methods coroutines.
Parameters match Speaker exactly. See the Speaker section for the parameter table.
```python
spk = decibri.AsyncSpeaker(sample_rate=24000, channels=1)
```

`await AsyncSpeaker.open(...)`: Async factory classmethod, symmetric with `AsyncMicrophone.open`. Speaker does not load ORT, so the event-loop blocking risk is smaller, but the factory is provided for API parity.

Unlike `AsyncMicrophone`, the synchronous `AsyncSpeaker(...)` constructor does no heavy work, so calling it directly inside an async function is fine. `open()` is provided for symmetry; reach for it if you prefer the consistent factory pattern across both classes.

`async with` usage:

```python
async with decibri.AsyncSpeaker(sample_rate=24000) as spk:
    await spk.write(audio_bytes)
    await spk.drain()
```
`await spk.start()`: Open and start the output stream.

`await spk.stop()`: Stop the output stream.

`await spk.close()`: Alias for `stop()`. See `Microphone.close()` for the equivalence note.

`await spk.write(samples)`: Async parallel of `Speaker.write`. Accepts `bytes` or a `numpy.ndarray` with matching dtype.

`await spk.drain()`: Block until all queued samples have been played. Cancelling this await raises `CancelledError` immediately, but the audio continues to play until the output buffer empties on the callback's own schedule. For production code, complete drains before initiating new writes.

`spk.is_playing`: `bool` (read-only). Synchronous property backed by a lock-free atomic mirror on the bridge.

`await AsyncSpeaker.output_devices()`: Async parallel of `Speaker.output_devices()`.
Convenience entry points exposed directly on the decibri module.
`decibri.input_devices()`: Module-level shortcut for `Microphone.input_devices()`. Returns a list of `DeviceInfo`.

```python
for d in decibri.input_devices():
    print(d.index, d.name)
```

`decibri.output_devices()`: Module-level shortcut for `Speaker.output_devices()`. Returns a list of `OutputDeviceInfo`.

`decibri.version()`: Returns a `VersionInfo` for the Rust core, the audio backend, and the binding wheel.

```python
v = decibri.version()
print(v.decibri, v.audio_backend, v.binding)
```

`decibri.record_to_file(path, duration_seconds, sample_rate=16000, channels=1, device=None)`: Synchronous one-shot recorder. Captures `duration_seconds` of microphone audio to a 16-bit PCM WAV file. Wraps `Microphone` plus the standard library `wave` module. Frame-count termination guarantees an accurate duration even on platforms where the buffer hint is ignored by the OS audio subsystem.

```python
decibri.record_to_file("clip.wav", duration_seconds=5.0)
```

`await decibri.async_record_to_file(path, duration_seconds, sample_rate=16000, channels=1, device=None)`: Async parallel of `record_to_file`. Same parameters, same semantics; await it from an asyncio context.

```python
await decibri.async_record_to_file("clip.wav", duration_seconds=5.0)
```
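Since `record_to_file` wraps the standard library `wave` module, the container it writes can be reproduced and verified with `wave` alone. A sketch that writes one second of silence in the same format (16-bit PCM, 16 kHz, mono) instead of captured audio:

```python
import wave

sample_rate = 16000
duration_seconds = 1.0
n_frames = int(sample_rate * duration_seconds)

with wave.open("silence.wav", "wb") as wf:
    wf.setnchannels(1)           # mono
    wf.setsampwidth(2)           # 16-bit PCM: two bytes per sample
    wf.setframerate(sample_rate)
    wf.writeframes(b"\x00\x00" * n_frames)  # silence

# Read the header back to confirm the duration.
with wave.open("silence.wav", "rb") as wf:
    print(wf.getnframes() / wf.getframerate())  # 1.0
```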
Small typed return shapes used across the API.
`Chunk`: Frozen dataclass returned by `read_with_metadata()` and `iter_with_metadata()` on both `Microphone` and `AsyncMicrophone`.
| Property | Type | Description |
|---|---|---|
| `data` | `bytes` or `numpy.ndarray` | Audio chunk. Shape matches the `as_ndarray` constructor flag. |
| `timestamp` | `float` | `time.monotonic()` snapshot at the chunk boundary, in seconds. Useful for relative timing within a session. |
| `sequence` | `int` | Per-session chunk counter starting at 0. Resets on each new `start()`. |
| `is_speaking` | `bool` | VAD state snapshot at the chunk boundary. Always `False` when VAD is disabled. |
| `vad_score` | `float` | VAD score snapshot in [0, 1]. Always 0.0 when VAD is disabled. |
`DeviceInfo`: Returned by `Microphone.input_devices()`, `decibri.input_devices()`, and `await AsyncMicrophone.input_devices()`.
| Property | Type | Description |
|---|---|---|
| `index` | `int` | Device index, usable as the `device` constructor argument. |
| `name` | `str` | Human-readable device name reported by the operating system. |
| `id` | `str` | Stable platform-specific device identifier. |
| `max_input_channels` | `int` | Maximum number of input channels the device supports. |
| `default_sample_rate` | `int` | The device's native or preferred sample rate in Hz. |
| `is_default` | `bool` | Whether this is the current system default input device. |
`OutputDeviceInfo`: Returned by `Speaker.output_devices()`, `decibri.output_devices()`, and `await AsyncSpeaker.output_devices()`.
| Property | Type | Description |
|---|---|---|
| `index` | `int` | Device index, usable as the `device` constructor argument. |
| `name` | `str` | Human-readable device name reported by the operating system. |
| `id` | `str` | Stable platform-specific device identifier. |
| `max_output_channels` | `int` | Maximum number of output channels the device supports. |
| `default_sample_rate` | `int` | The device's native or preferred sample rate in Hz. |
| `is_default` | `bool` | Whether this is the current system default output device. |
`VersionInfo`: Returned by `Microphone.version()`, `AsyncMicrophone.version()`, and `decibri.version()`.
| Property | Type | Description |
|---|---|---|
| `decibri` | `str` | Semver of the underlying Rust core. |
| `audio_backend` | `str` | Audio backend name and version (for example, `"cpal 0.17"`). |
| `binding` | `str` | Semver of the Python binding wheel. |
Decibri ships two VAD modes plus a disabled default. All three are selected with the vad constructor parameter on Microphone and AsyncMicrophone.
| Mode | Description | Default threshold | Threshold range |
|---|---|---|---|
| `False` | VAD disabled. `is_speaking` always `False`, `vad_score` always 0.0. | n/a | n/a |
| `"energy"` | Lightweight RMS-energy threshold computed in pure Python over each chunk. | 0.01 | 0.0 to 1.0 |
| `"silero"` | ML-based detector using the bundled Silero ONNX model, run through ONNX Runtime. | 0.5 | 0.0 to 1.0 |
No separate `onnxruntime` package or system dependency is required for `vad="silero"`; the Silero model and the ORT dylib are bundled with the wheel.
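For intuition, the normalised RMS that `"energy"` mode thresholds against can be approximated in a few lines of plain Python (a simplification for illustration, not the library's exact code):

```python
import math
import struct

def energy_score(chunk: bytes) -> float:
    """Approximate normalised RMS of little-endian int16 PCM, in [0, 1]."""
    n = len(chunk) // 2
    samples = struct.unpack(f"<{n}h", chunk)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return rms / 32768.0

silence = b"\x00\x00" * 160
loud = struct.pack("<4h", 32767, -32768, 32767, -32768)
print(energy_score(silence) < 0.01 < energy_score(loud))  # True
```

With the default threshold of 0.01, silence scores well below the bar while a full-scale square wave scores near 1.0.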
`is_speaking` state machine: Decibri runs a pure-Python state machine on top of the raw VAD probability. Above-threshold chunks set the speaking state and cancel any pending silence timer. Below-threshold chunks while already speaking start a silence timer; the timer expires after `vad_holdoff_ms` of elapsed real time, at which point `is_speaking` flips back to `False`. Timer expiry is checked on every property access via `time.monotonic()`, so consumers who pause iteration still observe correct state on the next read.
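The debounce described above can be sketched in plain Python (a hypothetical simplification; the library's actual state machine is internal to the wheel):

```python
import time

class HoldoffVad:
    """Debounce raw VAD scores: speaking turns off only after
    holdoff_ms of continuous sub-threshold audio."""

    def __init__(self, threshold=0.5, holdoff_ms=300):
        self.threshold = threshold
        self.holdoff_s = holdoff_ms / 1000
        self._speaking = False
        self._silence_since = None  # monotonic timestamp, or None

    def update(self, score):
        if score >= self.threshold:
            self._speaking = True
            self._silence_since = None  # cancel any pending silence timer
        elif self._speaking and self._silence_since is None:
            self._silence_since = time.monotonic()  # start the silence timer

    @property
    def is_speaking(self):
        # Expiry is checked on every access, mirroring the documented behaviour.
        if (self._speaking and self._silence_since is not None
                and time.monotonic() - self._silence_since >= self.holdoff_s):
            self._speaking = False
            self._silence_since = None
        return self._speaking

vad = HoldoffVad(holdoff_ms=50)
vad.update(0.9)
print(vad.is_speaking)  # True
vad.update(0.1)         # below threshold: silence timer starts
time.sleep(0.06)
print(vad.is_speaking)  # False (holdoff expired)
```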
`is_speaking` vs `vad_score`: `is_speaking` is the debounced state-machine output: above the threshold plus the holdoff grace. `vad_score` is the raw per-chunk view, identical to the Silero probability in Silero mode and the normalised RMS in energy mode. Use `is_speaking` for gating downstream work; use `vad_score` when you need the underlying signal (for example, to threshold differently per chunk or to log probability distributions).

`model_path`: The bundled Silero model is the published Silero v5 checkpoint. To use a different Silero ONNX variant, pass an absolute path as the `model_path` constructor parameter. The path is only consulted when `vad="silero"`; energy mode and `vad=False` ignore it.

```python
mic = decibri.Microphone(vad="silero", model_path="/opt/models/silero_vad_v4.onnx")
```
Decibri supports two on-the-wire sample formats, selected with the dtype constructor parameter. Both apply to Microphone, Speaker, and their async variants.
`int16` (default): Each sample is one 16-bit signed little-endian integer, two bytes per sample. Range: -32,768 to 32,767. This is the format expected by most cloud STT providers and the wire format used by the `record_to_file` helpers.

```python
with decibri.Microphone(dtype="int16") as mic:
    chunk = mic.read()  # bytes; len(chunk) == frames * channels * 2
```

`float32`: Each sample is one 32-bit IEEE 754 little-endian float, four bytes per sample. Range: approximately -1.0 to 1.0. Use this when your downstream pipeline expects normalised floats and you would otherwise convert from int16.

```python
with decibri.Microphone(dtype="float32") as mic:
    chunk = mic.read()  # bytes; len(chunk) == frames * channels * 4
```
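Converting between the two wire formats downstream is a scale by 32,768; a stdlib-only sketch using `struct`:

```python
import struct

# Three int16 samples as they arrive on the wire (little-endian).
raw = struct.pack("<3h", 0, 16384, -32768)

# Normalise to floats in [-1.0, 1.0), the float32 convention.
floats = [s / 32768.0 for s in struct.unpack("<3h", raw)]
print(floats)  # [0.0, 0.5, -1.0]
```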
Some platforms ignore the `frames_per_buffer` hint and deliver chunks sized to the OS device period instead. Frame-count loops still observe an accurate total duration (see `record_to_file`); chunk-count loops can record more or less than requested. Prefer frame-count termination when an exact duration matters.
Set as_ndarray=True on the Microphone constructor to receive numpy.ndarray instead of bytes from read(). The array's dtype matches the configured dtype (np.int16 or np.float32); the shape is 1-D (N,) for mono and 2-D (N, channels) for multi-channel (interleaved).
```python
import decibri
import numpy as np

with decibri.Microphone(sample_rate=16000, dtype="float32", as_ndarray=True) as mic:
    chunk = mic.read()
    assert isinstance(chunk, np.ndarray)
    assert chunk.dtype == np.float32
```
Speaker.write() duck-types on each call: pass bytes or pass an ndarray with matching dtype. Mixing both within a single output session is supported.
The NumPy extra is opt-in to keep the default install lightweight; install it with `pip install decibri[numpy]`.
as_ndarray=True requires the NumPy extra. Reading from a Microphone constructed with as_ndarray=True on an install without the extra raises ImportError with the message numpy is not installed. Install with: pip install decibri[numpy].
`py.typed`: The decibri wheel ships a `py.typed` marker file per PEP 561, with hand-written `.pyi` stubs covering the internal Rust extension module and the full exception hierarchy. The package is mypy strict-clean. IDEs and type checkers will autocomplete every public name and narrow return types correctly.

```python
import decibri
import numpy as np

mic = decibri.Microphone(as_ndarray=True)
chunk = mic.read()
# Type checker narrows `chunk` to numpy.ndarray | None when as_ndarray=True
# and to bytes | None otherwise.
```
The package depends on typing-extensions at runtime to support the 3.10 abi3 floor (typing.Self is only available in 3.11 and newer).
Decibri raises typed exceptions instead of generic RuntimeError or Exception. The root of the hierarchy is DecibriError. Three intermediate parents (DeviceError, OrtError, OrtPathError) group related instance classes so callers can catch by category instead of by individual class. Every exception remains catchable as DecibriError.
| Exception class | Parent | Common cause |
|---|---|---|
| `SampleRateOutOfRange` | `DecibriError` | Constructor `sample_rate` outside the supported range. |
| `ChannelsOutOfRange` | `DecibriError` | Constructor `channels` outside the supported range. |
| `FramesPerBufferOutOfRange` | `DecibriError` | Constructor `frames_per_buffer` outside the supported range. |
| `InvalidFormat` | `DecibriError` | Constructor `dtype` not `"int16"` or `"float32"`. |
| `AlreadyRunning` | `DecibriError` | `start()` called on an instance that is already capturing. |
| `StreamOpenFailed` | `DecibriError` | The audio stream failed to open. |
| `StreamStartFailed` | `DecibriError` | The audio stream opened but failed to start. |
| `PermissionDenied` | `DecibriError` | The operating system denied microphone access. Message includes platform-specific guidance. |
| `CaptureStreamClosed` | `DecibriError` | Read attempted on a closed capture stream (often a mid-stream device disconnect). |
| `OutputStreamClosed` | `DecibriError` | Write attempted on a closed output stream. |
| `VadSampleRateUnsupported` | `DecibriError` | VAD enabled with a sample rate the VAD model cannot accept. |
| `VadThresholdOutOfRange` | `DecibriError` | Constructor `vad_threshold` outside [0, 1]. |
| `ForkAfterOrtInit` | `DecibriError` | Linux only. The current process inherited an ORT session from its parent across `fork()`. See Multiprocessing and asyncio caveats. |
| `DeviceNotFound` | `DeviceError` | The named input device does not match any device on the system. |
| `OutputDeviceNotFound` | `DeviceError` | The named output device does not match any device on the system. |
| `MultipleDevicesMatch` | `DeviceError` | The device name substring matches more than one device; use a more specific substring or the integer index. |
| `DeviceIndexOutOfRange` | `DeviceError` | The integer device index is out of range for the host audio API. |
| `NoMicrophoneFound` | `DeviceError` | The system reports zero input devices. |
| `NoOutputDeviceFound` | `DeviceError` | The system reports zero output devices. |
| `NotAnInputDevice` | `DeviceError` | The matched device exists but is not capable of input. |
| `DeviceEnumerationFailed` | `DeviceError` | The audio backend failed to enumerate devices. |
| `OrtInitFailed` | `OrtError` | ONNX Runtime initialisation itself failed (no specific path was supplied). |
| `OrtSessionBuildFailed` | `OrtError` | Building an ORT inference session failed. |
| `OrtThreadsConfigFailed` | `OrtError` | Configuring ORT thread pools failed. |
| `VadModelLoadFailed` | `OrtError` | Loading the Silero VAD ONNX model failed. Has a `.path` attribute. |
| `OrtInferenceFailed` | `OrtError` | ORT inference produced an error at runtime. |
| `OrtTensorCreateFailed` | `OrtError` | Creating an ORT input tensor failed. |
| `OrtTensorExtractFailed` | `OrtError` | Extracting values from an ORT output tensor failed. |
| `OrtLoadFailed` | `OrtPathError` | The supplied ORT dylib path passed the filesystem pre-check but ORT rejected it. Has a `.path` attribute. |
| `OrtPathInvalid` | `OrtPathError` | The supplied ORT dylib path failed the pre-check before ORT saw it. Has `.path` and `.reason` attributes. |
Three classes carry additional attributes beyond the standard exception message:
- `VadModelLoadFailed.path`: the model path that failed to load.
- `OrtLoadFailed.path`: the ORT dylib path that failed to load.
- `OrtPathInvalid.path`, `OrtPathInvalid.reason`: the rejected path and a short reason string.

Catch any decibri error:

```python
try:
    with decibri.Microphone(sample_rate=16000) as mic:
        chunk = mic.read()
except decibri.DecibriError as e:
    print(f"Decibri error: {e}")
```
Catch device-selection failures specifically, then fall back to the system default:
```python
try:
    mic = decibri.Microphone(device="USB Audio")
except decibri.DeviceError as e:
    print(f"Device problem: {e}")
    mic = decibri.Microphone()
```
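The device-selection rules that produce these errors (integer index, otherwise case-insensitive name substring, ambiguous matches rejected) can be sketched in plain Python; the function and the stand-in exceptions below are hypothetical, mapped to decibri's classes only in the comments:

```python
def match_device(device_names, query):
    """Sketch of the documented selection rule:
    int -> index lookup; str -> case-insensitive substring match."""
    if isinstance(query, int):
        if not 0 <= query < len(device_names):
            raise IndexError("device index out of range")  # ~ DeviceIndexOutOfRange
        return device_names[query]
    hits = [name for name in device_names if query.lower() in name.lower()]
    if not hits:
        raise LookupError("no device matches")             # ~ DeviceNotFound
    if len(hits) > 1:
        raise LookupError("ambiguous device substring")    # ~ MultipleDevicesMatch
    return hits[0]

names = ["MacBook Pro Microphone", "USB Audio Device", "USB Audio CODEC"]
print(match_device(names, "macbook"))  # MacBook Pro Microphone
```

Note that "USB Audio" would match two of the names above, which is exactly the ambiguity `MultipleDevicesMatch` guards against.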
Python's default fork start method on Linux duplicates the parent's memory into the child, but ONNX Runtime's internal state is not safe to share across forked processes. A Silero-enabled Microphone initialised in the parent and then used in a forked child either produces incorrect inference results or segfaults; decibri detects the PID mismatch at the start of every Silero inference call and raises ForkAfterOrtInit instead.
The fix is to either set the spawn start method before constructing any worker, or to construct the Microphone inside each child process after the fork.
```python
import multiprocessing

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn")
    # ... rest of program
```
When combining `vad="silero"` with multiprocessing, call `multiprocessing.set_start_method("spawn")` before spawning workers. The default fork start method shares ORT state across processes unsafely; decibri detects the mismatch and raises `ForkAfterOrtInit`. macOS already defaults to spawn; Windows always uses spawn. The setting only matters on Linux.
The synchronous AsyncMicrophone(...) constructor blocks for roughly 100 to 500 milliseconds when vad="silero" because it loads the Silero ONNX model inline. In an async context this blocks the event loop for the duration of the load, which can cause dropped websocket frames, late timer callbacks, and UI jitter in voice agents. Use await AsyncMicrophone.open(...) instead; the factory dispatches the synchronous construction to loop.run_in_executor(None, ...) so the event loop keeps running.
For more multiprocessing recipes, see the Python multiprocessing guide in the repository.
When vad="silero" is requested, decibri needs to load the ONNX Runtime dynamic library. The path is resolved in this order, first match wins:
1. The `ort_library_path` constructor argument, if supplied.
2. The `DECIBRI_ORT_DYLIB_PATH` environment variable, for per-deployment overrides without code changes.
3. The `ORT_DYLIB_PATH` environment variable, the upstream ort crate's standard convention, respected so existing bare-ort deployments keep working.
4. The dylib bundled inside the wheel at `decibri/_ort/`. This is the default `pip install decibri` experience.

If none of the above resolve to a real file, ORT's default loader runs. That loader itself respects `ORT_DYLIB_PATH` if set after decibri import, so a late environment change still works as a last-resort fallback.
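The first-match-wins order can be sketched as a plain function (an illustrative sketch only; the helper name and the `bundled` stand-in are made up, while the two environment variable names come from the list above):

```python
import os
from pathlib import Path

def resolve_ort_dylib(ort_library_path=None, bundled=None, env=None):
    """Sketch of the four-arm, first-match-wins resolution order.
    `bundled` stands in for the dylib shipped at decibri/_ort/."""
    env = os.environ if env is None else env
    candidates = [
        ort_library_path,                   # 1. constructor argument
        env.get("DECIBRI_ORT_DYLIB_PATH"),  # 2. decibri-specific env var
        env.get("ORT_DYLIB_PATH"),          # 3. upstream ort crate convention
        bundled,                            # 4. dylib bundled in the wheel
    ]
    for candidate in candidates:
        if candidate and Path(candidate).is_file():
            return Path(candidate)
    return None  # fall through to ORT's own default loader

# A later arm wins only when every earlier arm is unset or points nowhere.
existing = os.__file__  # any file that certainly exists
print(resolve_ort_dylib(env={"ORT_DYLIB_PATH": existing}) == Path(existing))  # True
```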
The resolver runs only when `vad="silero"`. The other VAD modes (`"energy"`, `False`) never touch ORT, so the resolver and bundled-dylib lookup are skipped entirely.
The first Microphone constructed with vad="silero" incurs roughly 100 to 500 milliseconds of cold load on most platforms. The cost is amortised across the rest of the process: subsequent Microphones (sync or async) reuse the same loaded ORT.
The first Microphone that loads Silero determines the dylib for the whole process. Subsequent Microphone constructions inherit that initialisation regardless of their own ort_library_path argument. To switch dylibs, restart the process.