Python API

Decibri ships a native Python package with synchronous Microphone and Speaker classes plus matching AsyncMicrophone and AsyncSpeaker classes for asyncio. Written in Rust (via PyO3 / abi3) with pre-built wheels for Python 3.10 and newer. For installation and first capture, see Getting started.

Quickstart

Three ways to capture audio in Python, in order of increasing control.

Install

Recommended with uv:

$ uv pip install decibri

Or with pip:

$ pip install decibri

For NumPy ndarray support (see Audio format):

$ pip install decibri[numpy]

Record to a file in one line

import decibri

decibri.record_to_file("output.wav", duration_seconds=10)

Captures 10 seconds of microphone audio to a 16-bit PCM WAV file at 16 kHz mono. No async, no streaming, no setup.

Stream chunks with a with block

import decibri

with decibri.Microphone(sample_rate=16000) as mic:
    for chunk in mic:
        print(f"Got {len(chunk)} bytes")
        break

Open the system microphone, iterate raw 16-bit PCM chunks, break after the first. Replace break with your processing pipeline.

Async parallel

import asyncio
import decibri

async def main():
    async with await decibri.AsyncMicrophone.open(sample_rate=16000) as mic:
        async for chunk in mic:
            print(f"Got {len(chunk)} bytes")
            break

asyncio.run(main())

Same loop, but on the event loop. Use this in voice agents or websocket pipelines.

Compatibility

| Python versions | Platforms |
| --- | --- |
| 3.10, 3.11, 3.12, 3.13, 3.14 | Linux x64, Linux ARM64, macOS Apple Silicon, Windows x64 |

Pre-built wheels are published for every supported platform. pip install decibri fetches a binary wheel; no Rust toolchain, no C compiler, and no system audio headers are required at install time.

Wheels are built against the CPython stable ABI (abi3) with a 3.10 floor, so a single wheel per platform serves every supported interpreter version. New CPython releases work without a new decibri release as long as the stable ABI is preserved.

Microphone (sync)

Primary capture surface for synchronous code. Construct an instance, enter a with block (or call start() manually), then iterate or call read() for chunks.

Constructor

decibri.Microphone(
    sample_rate=16000,
    channels=1,
    frames_per_buffer=1600,
    dtype="int16",
    device=None,
    vad=False,
    vad_threshold=None,
    vad_holdoff_ms=300,
    model_path=None,
    as_ndarray=False,
    ort_library_path=None,
)
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sample_rate | int | 16000 | Samples per second (1,000 to 384,000 Hz). 16,000 matches Silero VAD and most cloud STT providers; OpenAI Realtime requires 24,000. |
| channels | int | 1 | Number of input channels (1 to 32). |
| frames_per_buffer | int | 1600 | Frames per audio callback. 1,600 at 16 kHz is 100 ms chunks (64 to 65,536). |
| dtype | "int16" \| "float32" | "int16" | Sample encoding format. |
| device | int \| str \| None | system default | Device index from Microphone.input_devices() or case-insensitive name substring. |
| vad | False \| "silero" \| "energy" | False | Voice activity detector mode. See Voice activity detection. |
| vad_threshold | float \| None | mode default | Threshold in [0, 1]. Defaults to 0.5 in "silero" mode, 0.01 in "energy" mode. |
| vad_holdoff_ms | int | 300 | Milliseconds of sub-threshold audio before is_speaking flips back to False. |
| model_path | str \| Path \| None | bundled | Override path to a Silero VAD ONNX model. Only used when vad="silero"; defaults to the model bundled with the wheel. |
| as_ndarray | bool | False | When True, read() returns a numpy.ndarray instead of bytes. Requires pip install decibri[numpy]. |
| ort_library_path | str \| Path \| None | resolver | Override path to the ONNX Runtime dynamic library. Only used when vad="silero". See ONNX Runtime resolution for the four-arm priority order. |

Context manager usage

The canonical Python pattern. Entering the with block opens the stream and starts capture; exiting stops the stream and resets VAD state, even if an exception propagates out.

import decibri

with decibri.Microphone(sample_rate=16000, channels=1, frames_per_buffer=1600) as mic:
    for chunk in mic:
        process(chunk)
        if done():
            break

Calling start() manually is also supported when the context manager does not fit. Pair it with stop() in a try / finally.

Methods

mic.start()

Open and start the capture stream. Calling start() after stop() or close() is supported and reconstructs the stream cleanly; VAD state resets on each new start(). Calling start() on an already-running instance raises AlreadyRunning.

mic.stop()

Stop the capture stream and reset VAD state and the sequence counter. Idempotent; safe to call multiple times.

mic.close()

Alias for stop(). Provided for ergonomic parity with the asyncio / aiohttp / httpx convention. The two methods are currently equivalent and are intended to remain interchangeable.

mic.read(timeout_ms=None)

Read one chunk. Returns the chunk, or None if the stream closed. Return type is bytes by default, or numpy.ndarray when the Microphone was constructed with as_ndarray=True. Advances VAD state as a side effect when VAD is enabled.

mic.read_with_metadata(timeout_ms=None)

Read one chunk and return it as a frozen Chunk with .data, .timestamp, .sequence, .is_speaking, and .vad_score attributes. Returns None on clean stream close. See Value types.

mic.iter_with_metadata()

Generator yielding Chunk objects until the stream closes cleanly. Use this in place of for chunk in mic when you want metadata alongside the audio data.

with decibri.Microphone(vad="silero") as mic:
    for chunk in mic.iter_with_metadata():
        if chunk.is_speaking:
            send_to_stt(chunk.data)

iter(mic) and next(mic)

The Microphone is itself an iterator. for chunk in mic: yields raw chunks (bytes by default, numpy.ndarray when as_ndarray=True) and raises StopIteration when the stream closes.

Properties

mic.is_open

bool (read-only). Returns True while the capture stream is currently running.

mic.is_speaking

bool (read-only). Returns True while VAD considers the user to be speaking, including the holdoff grace period. Always False when vad=False. Holdoff expiry is checked on every property access, so consumers who pause iteration still observe correct state when they next read.

mic.vad_score

float in [0, 1], mode-agnostic. In vad="silero" mode this is the raw Silero probability for the most recent chunk; in vad="energy" mode it is the normalised RMS energy. Always 0.0 when vad=False.

Static methods

Microphone.input_devices()

Returns a list of DeviceInfo objects describing every input device recognised by the operating system.

for d in decibri.Microphone.input_devices():
    print(d.index, d.name, d.default_sample_rate)

Microphone.version()

Returns a VersionInfo object with the Rust core version, the audio backend version, and the binding wheel version.

All Microphone instances support repr() for debugging; the output includes sample rate, channels, dtype, frames per buffer, device, VAD mode, and open state.

Speaker (sync)

Audio output surface. Construct, enter a with block, write samples, and drain.

Constructor

decibri.Speaker(
    sample_rate=16000,
    channels=1,
    dtype="int16",
    device=None,
)
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sample_rate | int | 16000 | Output sample rate in Hz (1,000 to 384,000). Use 24,000 for OpenAI Realtime playback. |
| channels | int | 1 | Number of output channels (1 to 32). Multi-channel samples are interleaved on the wire. |
| dtype | "int16" \| "float32" | "int16" | Sample dtype. Must match the data passed to write(); mismatch raises TypeError. |
| device | int \| str \| None | system default | Device index from Speaker.output_devices() or case-insensitive name substring. |

Context manager usage

import decibri

with decibri.Speaker(sample_rate=16000, channels=1) as spk:
    spk.write(audio_bytes)
    spk.drain()

Methods

spk.start()

Open and start the output stream. Re-entry after stop() or close() is supported.

spk.stop()

Stop the output stream.

spk.close()

Alias for stop(). See Microphone.close() for the equivalence note.

spk.write(samples)

Write a chunk to the output stream. Accepts bytes or a numpy.ndarray with dtype matching the configured dtype. Multi-channel ndarrays use shape (N, channels). Output streams duck-type the input on each call rather than committing at construction time; mixing bytes and ndarrays across calls is supported. Raises TypeError on dtype mismatch or unsupported input.

spk.drain()

Block until all queued samples have been played. Useful at the end of a playback sequence so the program does not exit before the speaker buffer empties.

Properties

spk.is_playing

bool (read-only). Returns True while the output stream is currently running.

Static methods

Speaker.output_devices()

Returns a list of OutputDeviceInfo objects describing every output device recognised by the operating system.

AsyncMicrophone

When to use async

Reach for AsyncMicrophone when capture lives on an event loop: a voice agent that streams chunks to a websocket, a coroutine-based pipeline, or anywhere a sibling task may need to cancel the read in flight. The async classes serialise concurrent calls via a Rust-side Tokio mutex, so sibling-task cancellation is safe in a way that the sync Microphone does not provide.

Constructor

Parameters match Microphone exactly. See the Microphone section for the full parameter table.

mic = decibri.AsyncMicrophone(sample_rate=16000, vad="silero")
Note: AsyncMicrophone.version() is synchronous (no await) because it returns compile-time constants. Every other method on the class is a coroutine. Double-awaiting version() raises a confusing TypeError; call it without await.

await AsyncMicrophone.open(...)

Async factory classmethod. The synchronous constructor blocks for roughly 100 to 500 milliseconds when vad="silero" because it loads the Silero ONNX model inline. open() dispatches that load to loop.run_in_executor(None, ...) so the event loop keeps spinning while ORT initialises.

mic = await decibri.AsyncMicrophone.open(vad="silero")
async with mic:
    async for chunk in mic:
        await process(chunk)
Note: The canonical pattern is async with await AsyncMicrophone.open(...) as mic:. The open() factory is itself a coroutine, so it must be awaited before async with takes the resulting instance.

async with and async for

The async context manager opens the stream on entry and stops it on exit. Iteration via async for yields the same data shape as the sync Microphone (bytes by default, numpy.ndarray when as_ndarray=True).

async with await decibri.AsyncMicrophone.open(sample_rate=16000) as mic:
    async for chunk in mic:
        await websocket.send(chunk)

Methods

await mic.start()

Open and start the capture stream. Re-entry after stop() or close() is supported and resets VAD state.

await mic.stop()

Stop the capture stream and reset VAD state and the sequence counter.

await mic.close()

Alias for stop(). See Microphone.close() for the equivalence note.

await mic.read(timeout_ms=None)

Read one chunk. Returns the chunk, or None if the stream closed. Same return-type rules as the sync read().

await mic.read_with_metadata(timeout_ms=None)

Async parallel of Microphone.read_with_metadata(). Returns a frozen Chunk with metadata, or None on clean close.

mic.aiter_with_metadata()

Async-generator function yielding Chunk objects until the stream closes cleanly. Stops when the bridge returns None.

async with await decibri.AsyncMicrophone.open(vad="silero") as mic:
    async for chunk in mic.aiter_with_metadata():
        if chunk.is_speaking:
            await stt.send(chunk.data)
Note: aiter_with_metadata() is an async-generator function. The correct iteration pattern is async for chunk in mic.aiter_with_metadata():; do not await the call itself.

Properties

Properties on AsyncMicrophone are synchronous attribute access (no await). They are backed by lock-free atomic mirrors on the Rust bridge, so they report current truth even when the Rust side closes the stream itself (for example, device disconnect).

mic.is_open

bool (read-only). True while the capture stream is running.

mic.is_speaking

bool (read-only). Same semantics as the sync property: above-threshold detection plus holdoff. Always False when vad=False.

mic.vad_score

float in [0, 1]. Same semantics as the sync property.

Static methods

await AsyncMicrophone.input_devices()

Async parallel of Microphone.input_devices(). Returns a list of DeviceInfo.

AsyncMicrophone.version()

Synchronous (see the callout above). Returns a VersionInfo for the Rust core, the audio backend, and the binding wheel.

Cancellation semantics

Cancelling an awaited AsyncMicrophone call (via asyncio.CancelledError, asyncio.wait_for, or explicit task.cancel()) raises CancelledError immediately on the Python side. The Rust-side spawn_blocking thread completes on its own schedule and its result is dropped. The bridge state stays consistent for subsequent reads, so sibling-task cancellation while a read() is in flight is safe.
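The sibling-task cancellation behaviour can be illustrated with plain asyncio, using a slow coroutine as a stand-in for an in-flight await mic.read(); the names here are illustrative, not decibri's internals:

```python
import asyncio

async def slow_read():
    # Stand-in for an awaited read() that gets cancelled mid-flight.
    await asyncio.sleep(10)
    return b"chunk"

async def main():
    task = asyncio.create_task(slow_read())
    await asyncio.sleep(0)   # let the task start its await
    task.cancel()            # sibling-task cancellation
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"
    return "completed"

print(asyncio.run(main()))  # cancelled
```

CancelledError surfaces immediately on the awaiting side; with decibri, the Rust-side work finishes on its own schedule and its result is dropped, so the next read() sees consistent state.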

AsyncSpeaker

Asyncio mirror of Speaker. Same parameter set, same lifecycle, all methods coroutines.

Constructor

Parameters match Speaker exactly. See the Speaker section for the parameter table.

spk = decibri.AsyncSpeaker(sample_rate=24000, channels=1)

await AsyncSpeaker.open(...)

Async factory classmethod, symmetric with AsyncMicrophone.open. Speaker does not load ORT, so the event-loop blocking risk is smaller, but the factory is provided for API parity.

Note: Unlike AsyncMicrophone, the synchronous AsyncSpeaker(...) constructor does no heavy work, so calling it directly inside an async function is fine. open() is provided for symmetry; reach for it if you prefer the consistent factory pattern across both classes.

async with usage

async with decibri.AsyncSpeaker(sample_rate=24000) as spk:
    await spk.write(audio_bytes)
    await spk.drain()

Methods

await spk.start()

Open and start the output stream.

await spk.stop()

Stop the output stream.

await spk.close()

Alias for stop(). See Microphone.close() for the equivalence note.

await spk.write(samples)

Async parallel of Speaker.write. Accepts bytes or a numpy.ndarray with matching dtype.

await spk.drain()

Block until all queued samples have been played. Cancelling this await raises CancelledError immediately, but the audio continues to play until the output buffer empties on the callback's own schedule. For production code, complete drains before initiating new writes.

Properties

spk.is_playing

bool (read-only). Synchronous property backed by a lock-free atomic mirror on the bridge.

Static methods

await AsyncSpeaker.output_devices()

Async parallel of Speaker.output_devices().

Module-level helpers

Convenience entry points exposed directly on the decibri module.

decibri.input_devices()

Module-level shortcut for Microphone.input_devices(). Returns a list of DeviceInfo.

for d in decibri.input_devices():
    print(d.index, d.name)

decibri.output_devices()

Module-level shortcut for Speaker.output_devices(). Returns a list of OutputDeviceInfo.

decibri.version()

Returns a VersionInfo for the Rust core, the audio backend, and the binding wheel.

v = decibri.version()
print(v.decibri, v.audio_backend, v.binding)

decibri.record_to_file(path, duration_seconds, sample_rate=16000, channels=1, device=None)

Synchronous one-shot recorder. Captures duration_seconds of microphone audio to a 16-bit PCM WAV file. Wraps Microphone plus the standard library wave module. Frame-count termination guarantees an accurate duration even on platforms where the buffer hint is ignored by the OS audio subsystem.

decibri.record_to_file("clip.wav", duration_seconds=5.0)
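The output file format can be reproduced with the standard library wave module alone; this sketch writes one second of synthetic silence in place of captured microphone chunks:

```python
import wave

def write_wav(path, pcm_bytes, sample_rate=16000, channels=1):
    # The same container record_to_file produces: 16-bit PCM WAV.
    with wave.open(path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)          # 16-bit = 2 bytes per sample
        w.setframerate(sample_rate)
        w.writeframes(pcm_bytes)

# One second at 16 kHz mono: 16,000 frames * 2 bytes.
write_wav("silence.wav", b"\x00" * 16000 * 2)
```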

await decibri.async_record_to_file(path, duration_seconds, sample_rate=16000, channels=1, device=None)

Async parallel of record_to_file. Same parameters, same semantics; await it from an asyncio context.

await decibri.async_record_to_file("clip.wav", duration_seconds=5.0)

Value types

Small typed return shapes used across the API.

Chunk

Frozen dataclass returned by read_with_metadata() and iter_with_metadata() on both Microphone and AsyncMicrophone.

| Property | Type | Description |
| --- | --- | --- |
| data | bytes or numpy.ndarray | Audio chunk. Shape matches the as_ndarray constructor flag. |
| timestamp | float | time.monotonic() snapshot at the chunk boundary, in seconds. Useful for relative timing within a session. |
| sequence | int | Per-session chunk counter starting at 0. Resets on each new start(). |
| is_speaking | bool | VAD state snapshot at the chunk boundary. Always False when VAD is disabled. |
| vad_score | float | VAD score snapshot in [0, 1]. Always 0.0 when VAD is disabled. |
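The timestamp and sequence fields make relative timing measurable; a sketch using a stand-in dataclass that mirrors the documented fields (the real Chunk is provided by the package):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:  # stand-in mirroring the documented attributes
    data: bytes
    timestamp: float
    sequence: int
    is_speaking: bool
    vad_score: float

chunks = [
    Chunk(b"\x00" * 3200, 0.00, 0, False, 0.1),
    Chunk(b"\x00" * 3200, 0.10, 1, True, 0.9),
    Chunk(b"\x00" * 3200, 0.20, 2, True, 0.8),
]
# Inter-chunk gap from timestamps; roughly 100 ms at 16 kHz with 1,600-frame buffers.
gaps = [round(b.timestamp - a.timestamp, 3) for a, b in zip(chunks, chunks[1:])]
print(gaps)  # [0.1, 0.1]
```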

DeviceInfo

Returned by Microphone.input_devices(), decibri.input_devices(), and await AsyncMicrophone.input_devices().

| Property | Type | Description |
| --- | --- | --- |
| index | int | Device index, usable as the device constructor argument. |
| name | str | Human-readable device name reported by the operating system. |
| id | str | Stable platform-specific device identifier. |
| max_input_channels | int | Maximum number of input channels the device supports. |
| default_sample_rate | int | The device's native or preferred sample rate in Hz. |
| is_default | bool | Whether this is the current system default input device. |

OutputDeviceInfo

Returned by Speaker.output_devices(), decibri.output_devices(), and await AsyncSpeaker.output_devices().

| Property | Type | Description |
| --- | --- | --- |
| index | int | Device index, usable as the device constructor argument. |
| name | str | Human-readable device name reported by the operating system. |
| id | str | Stable platform-specific device identifier. |
| max_output_channels | int | Maximum number of output channels the device supports. |
| default_sample_rate | int | The device's native or preferred sample rate in Hz. |
| is_default | bool | Whether this is the current system default output device. |

VersionInfo

Returned by Microphone.version(), AsyncMicrophone.version(), and decibri.version().

| Property | Type | Description |
| --- | --- | --- |
| decibri | str | Semver of the underlying Rust core. |
| audio_backend | str | Audio backend name and version (for example, "cpal 0.17"). |
| binding | str | Semver of the Python binding wheel. |

Voice activity detection

Decibri ships two VAD modes plus a disabled default. All three are selected with the vad constructor parameter on Microphone and AsyncMicrophone.

| Mode | Description | Default threshold | Threshold range |
| --- | --- | --- | --- |
| False | VAD disabled. is_speaking always False, vad_score always 0.0. | n/a | n/a |
| "energy" | Lightweight RMS-energy threshold computed in pure Python over each chunk. | 0.01 | 0.0 to 1.0 |
| "silero" | ML-based detector using the bundled Silero ONNX model, run through ONNX Runtime. | 0.5 | 0.0 to 1.0 |
Note: The bundled Silero VAD ONNX model and ONNX Runtime dylib ship inside the wheel. No separate download, no API keys, and no onnxruntime system dependency are required for vad="silero".
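The "energy" mode is simple enough to sketch directly. This assumes plain RMS over an int16 chunk normalised by full scale; decibri's exact normalisation is internal and may differ:

```python
import struct

def energy_score(chunk: bytes) -> float:
    # Normalised RMS over one int16 chunk (assumed formula for "energy" mode).
    samples = struct.unpack(f"<{len(chunk) // 2}h", chunk)
    rms = (sum(s * s for s in samples) / len(samples)) ** 0.5
    return rms / 32768.0   # map into [0, 1]

silence = b"\x00\x00" * 160
loud = struct.pack("<4h", 16384, -16384, 16384, -16384)
print(energy_score(silence), energy_score(loud))  # 0.0 0.5
```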

Holdoff and the is_speaking state machine

Decibri runs a pure-Python state machine on top of the raw VAD probability. Above-threshold chunks set the speaking state and cancel any pending silence timer. Below-threshold chunks while already speaking start a silence timer; the timer expires after vad_holdoff_ms elapsed real time, at which point is_speaking flips back to False. Timer expiry is checked on every property access via time.monotonic(), so consumers who pause iteration still observe correct state on the next read.
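The state machine described above can be sketched in a few lines. This is an illustrative reconstruction from the documented behaviour, not decibri's implementation; the injectable clock exists only to make the sketch deterministic:

```python
import time

class Holdoff:
    """Debounce a raw VAD score into is_speaking with a holdoff (illustrative)."""
    def __init__(self, threshold=0.5, holdoff_ms=300, clock=time.monotonic):
        self.threshold = threshold
        self.holdoff_s = holdoff_ms / 1000.0
        self.clock = clock
        self._speaking = False
        self._below_since = None   # pending silence timer, None when not running

    def update(self, score: float) -> None:
        if score >= self.threshold:
            self._speaking = True
            self._below_since = None          # cancel pending silence timer
        elif self._speaking and self._below_since is None:
            self._below_since = self.clock()  # start silence timer

    @property
    def is_speaking(self) -> bool:
        # Expiry is checked on every access, matching the documented property.
        if self._below_since is not None and self.clock() - self._below_since >= self.holdoff_s:
            self._speaking = False
            self._below_since = None
        return self._speaking
```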

is_speaking vs vad_score

is_speaking is the debounced state machine output: above the threshold plus the holdoff grace. vad_score is the raw per-chunk view, identical to the Silero probability in Silero mode and the normalised RMS in energy mode. Use is_speaking for gating downstream work; use vad_score when you need the underlying signal (for example, to threshold differently per chunk or to log probability distributions).

Custom Silero model with model_path

The bundled Silero model is the published Silero v5 checkpoint. To use a different Silero ONNX variant, pass an absolute path on the model_path constructor parameter. The path is only consulted when vad="silero"; energy mode and vad=False ignore it.

mic = decibri.Microphone(vad="silero", model_path="/opt/models/silero_vad_v4.onnx")

Audio format

Decibri supports two on-the-wire sample formats, selected with the dtype constructor parameter. Both apply to Microphone, Speaker, and their async variants.

int16 (default)

Each two bytes represents one 16-bit signed integer sample, little-endian. Range: -32,768 to 32,767. Two bytes per sample. This is the format expected by most cloud STT providers and the wire format used by the record_to_file helpers.

with decibri.Microphone(dtype="int16") as mic:
    chunk = mic.read()  # bytes; len(chunk) == frames * channels * 2

float32

Each four bytes represents one 32-bit IEEE 754 float sample, little-endian. Range: approximately -1.0 to 1.0. Four bytes per sample. Use this when your downstream pipeline expects normalised floats and you would otherwise convert from int16.

with decibri.Microphone(dtype="float32") as mic:
    chunk = mic.read()  # bytes; len(chunk) == frames * channels * 4
Note: On Windows WASAPI, the audio backend often ignores the frames_per_buffer hint and delivers chunks sized to the OS device period instead. Frame-count loops still observe accurate total duration (see record_to_file); chunk-count loops can record more or less than requested. Prefer frame-count termination when an exact duration matters.
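Since chunk sizes vary by platform, it is safer to derive duration from byte length than from chunk count; a small helper built from the byte-length formulas above:

```python
def chunk_duration_ms(n_bytes, sample_rate=16000, channels=1, bytes_per_sample=2):
    # Duration represented by one raw chunk: bytes -> frames -> milliseconds.
    frames = n_bytes / (channels * bytes_per_sample)
    return 1000.0 * frames / sample_rate

print(chunk_duration_ms(3200))                # int16 mono at 16 kHz -> 100.0
print(chunk_duration_ms(19200, 24000, 1, 4))  # float32 mono at 24 kHz -> 200.0
```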

NumPy integration

Set as_ndarray=True on the Microphone constructor to receive numpy.ndarray instead of bytes from read(). The array's dtype matches the configured dtype (np.int16 or np.float32); the shape is 1-D (N,) for mono and 2-D (N, channels) for multi-channel (interleaved).

import decibri
import numpy as np

with decibri.Microphone(sample_rate=16000, dtype="float32", as_ndarray=True) as mic:
    chunk = mic.read()
    assert isinstance(chunk, np.ndarray)
    assert chunk.dtype == np.float32

Speaker.write() duck-types on each call: pass bytes or pass an ndarray with matching dtype. Mixing both within a single output session is supported.
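When a pipeline wants normalised floats but the stream was captured as int16 bytes, the conversion is a one-liner with NumPy (requires the numpy extra; the helper name is our own):

```python
import numpy as np

def int16_bytes_to_float32(chunk: bytes) -> np.ndarray:
    # Reinterpret raw int16 bytes and normalise to roughly [-1.0, 1.0].
    ints = np.frombuffer(chunk, dtype=np.int16)
    return ints.astype(np.float32) / 32768.0

pcm = np.array([0, 16384, -32768], dtype=np.int16).tobytes()
out = int16_bytes_to_float32(pcm)  # 0.0, 0.5, -1.0
```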

The NumPy extra is opt-in to keep the default install lightweight:

$ pip install decibri[numpy]
Note: as_ndarray=True requires the NumPy extra. Reading from a Microphone constructed with as_ndarray=True on an install without the extra raises ImportError with the message numpy is not installed. Install with: pip install decibri[numpy].

Type hints and py.typed

The decibri wheel ships a py.typed marker file per PEP 561, with hand-written .pyi stubs covering the internal Rust extension module and the full exception hierarchy. The package is mypy strict-clean. IDEs and type checkers will autocomplete every public name and narrow return types correctly.

import decibri
import numpy as np

mic = decibri.Microphone(as_ndarray=True)
chunk = mic.read()
# Type checker narrows `chunk` to numpy.ndarray | None when as_ndarray=True
# and to bytes | None otherwise.

The package depends on typing-extensions at runtime to support the 3.10 abi3 floor (typing.Self is only available in 3.11 and newer).

Exception handling

Decibri raises typed exceptions instead of generic RuntimeError or Exception. The root of the hierarchy is DecibriError. Three intermediate parents (DeviceError, OrtError, OrtPathError) group related instance classes so callers can catch by category instead of by individual class. Every exception remains catchable as DecibriError.

Hierarchy

| Exception class | Parent | Common cause |
| --- | --- | --- |
| SampleRateOutOfRange | DecibriError | Constructor sample_rate outside the supported range. |
| ChannelsOutOfRange | DecibriError | Constructor channels outside the supported range. |
| FramesPerBufferOutOfRange | DecibriError | Constructor frames_per_buffer outside the supported range. |
| InvalidFormat | DecibriError | Constructor dtype not "int16" or "float32". |
| AlreadyRunning | DecibriError | start() called on an instance that is already capturing. |
| StreamOpenFailed | DecibriError | The audio stream failed to open. |
| StreamStartFailed | DecibriError | The audio stream opened but failed to start. |
| PermissionDenied | DecibriError | The operating system denied microphone access. Message includes platform-specific guidance. |
| CaptureStreamClosed | DecibriError | Read attempted on a closed capture stream (often a mid-stream device disconnect). |
| OutputStreamClosed | DecibriError | Write attempted on a closed output stream. |
| VadSampleRateUnsupported | DecibriError | VAD enabled with a sample rate the VAD model cannot accept. |
| VadThresholdOutOfRange | DecibriError | Constructor vad_threshold outside [0, 1]. |
| ForkAfterOrtInit | DecibriError | Linux only. The current process inherited an ORT session from its parent across fork(). See Multiprocessing and asyncio caveats. |
| DeviceNotFound | DeviceError | The named input device does not match any device on the system. |
| OutputDeviceNotFound | DeviceError | The named output device does not match any device on the system. |
| MultipleDevicesMatch | DeviceError | The device name substring matches more than one device; use a more specific substring or the integer index. |
| DeviceIndexOutOfRange | DeviceError | The integer device index is out of range for the host audio API. |
| NoMicrophoneFound | DeviceError | The system reports zero input devices. |
| NoOutputDeviceFound | DeviceError | The system reports zero output devices. |
| NotAnInputDevice | DeviceError | The matched device exists but is not capable of input. |
| DeviceEnumerationFailed | DeviceError | The audio backend failed to enumerate devices. |
| OrtInitFailed | OrtError | ONNX Runtime initialisation itself failed (no specific path was supplied). |
| OrtSessionBuildFailed | OrtError | Building an ORT inference session failed. |
| OrtThreadsConfigFailed | OrtError | Configuring ORT thread pools failed. |
| VadModelLoadFailed | OrtError | Loading the Silero VAD ONNX model failed. Has a .path attribute. |
| OrtInferenceFailed | OrtError | ORT inference produced an error at runtime. |
| OrtTensorCreateFailed | OrtError | Creating an ORT input tensor failed. |
| OrtTensorExtractFailed | OrtError | Extracting values from an ORT output tensor failed. |
| OrtLoadFailed | OrtPathError | The supplied ORT dylib path passed the filesystem pre-check but ORT rejected it. Has a .path attribute. |
| OrtPathInvalid | OrtPathError | The supplied ORT dylib path failed the pre-check before ORT saw it. Has .path and .reason attributes. |

Exceptions with extra attributes

Three classes carry additional attributes beyond the standard exception message:

  - VadModelLoadFailed — .path, the Silero model path that failed to load.
  - OrtLoadFailed — .path, the supplied ORT dylib path that ORT rejected.
  - OrtPathInvalid — .path and .reason, the supplied path and why the pre-check rejected it.

Catch patterns

Catch any decibri error:

try:
    with decibri.Microphone(sample_rate=16000) as mic:
        chunk = mic.read()
except decibri.DecibriError as e:
    print(f"Decibri error: {e}")

Catch device-selection failures specifically, then fall back to the system default:

try:
    mic = decibri.Microphone(device="USB Audio")
except decibri.DeviceError as e:
    print(f"Device problem: {e}")
    mic = decibri.Microphone()
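The device-matching rules behind these errors can be sketched in plain Python. This is a reconstruction from the documented behaviour (integer index, else case-insensitive name substring, with ambiguity rejected); decibri's internals may differ:

```python
def resolve_device(device, devices):
    """Resolve an int index or a name substring against (index, name) pairs."""
    if device is None:
        return None                      # system default
    if isinstance(device, int):
        if not 0 <= device < len(devices):
            raise LookupError("DeviceIndexOutOfRange")
        return device
    matches = [i for i, name in devices if device.lower() in name.lower()]
    if not matches:
        raise LookupError("DeviceNotFound")
    if len(matches) > 1:
        raise LookupError("MultipleDevicesMatch")
    return matches[0]

devices = [(0, "MacBook Pro Microphone"), (1, "USB Audio Device"), (2, "USB Audio CODEC")]
print(resolve_device("macbook", devices))  # 0
```

Note that "USB Audio" would match two devices here, which is exactly the ambiguity MultipleDevicesMatch guards against.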

Multiprocessing and asyncio caveats

Linux fork-safety with Silero

Python's default fork start method on Linux duplicates the parent's memory into the child, but ONNX Runtime's internal state is not safe to share across forked processes. A Silero-enabled Microphone initialised in the parent and then used in a forked child either produces incorrect inference results or segfaults; decibri detects the PID mismatch at the start of every Silero inference call and raises ForkAfterOrtInit instead.

The fix is to either set the spawn start method before constructing any worker, or to construct the Microphone inside each child process after the fork.

import multiprocessing

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn")
    # ... rest of program
Note: On Linux, when combining vad="silero" with multiprocessing, call multiprocessing.set_start_method("spawn") before spawning workers. The default fork start method shares ORT state across processes unsafely; decibri detects the mismatch and raises ForkAfterOrtInit. macOS already defaults to spawn; Windows always uses spawn. The setting only matters on Linux.

Async event-loop blocking on ORT init

The synchronous AsyncMicrophone(...) constructor blocks for roughly 100 to 500 milliseconds when vad="silero" because it loads the Silero ONNX model inline. In an async context this blocks the event loop for the duration of the load, which can cause dropped websocket frames, late timer callbacks, and UI jitter in voice agents. Use await AsyncMicrophone.open(...) instead; the factory dispatches the synchronous construction to loop.run_in_executor(None, ...) so the event loop keeps running.

For more multiprocessing recipes, see the Python multiprocessing guide in the repository.

ONNX Runtime resolution

When vad="silero" is requested, decibri needs to load the ONNX Runtime dynamic library. The path is resolved in this order, first match wins:

  1. The ort_library_path constructor argument, if supplied.
  2. The DECIBRI_ORT_DYLIB_PATH environment variable, for per-deployment overrides without code changes.
  3. The ORT_DYLIB_PATH environment variable, the upstream ort crate's standard convention, respected so existing bare-ort deployments keep working.
  4. The dylib bundled inside the wheel under decibri/_ort/. This is the default pip install decibri experience.

If none of the above resolve to a real file, ORT's default loader runs. That loader itself respects ORT_DYLIB_PATH if set after decibri import, so a late environment change still works as a last-resort fallback.
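The four-arm priority order can be sketched as a plain function. This is illustrative only: the real resolver also checks that each candidate exists on disk, and the bundled filename shown here is an assumption:

```python
import os

def resolve_ort_dylib(ort_library_path=None, bundled="decibri/_ort/libonnxruntime.so"):
    # Documented priority order, first match wins.
    for candidate in (
        ort_library_path,                          # 1. constructor argument
        os.environ.get("DECIBRI_ORT_DYLIB_PATH"),  # 2. decibri-specific override
        os.environ.get("ORT_DYLIB_PATH"),          # 3. upstream ort convention
        bundled,                                   # 4. dylib bundled in the wheel
    ):
        if candidate:
            return candidate
    return None  # fall through to ORT's default loader

os.environ.pop("DECIBRI_ORT_DYLIB_PATH", None)
os.environ.pop("ORT_DYLIB_PATH", None)
print(resolve_ort_dylib("/opt/ort/libonnxruntime.so"))  # constructor argument wins
```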

When ORT is loaded

Only when vad="silero". The other VAD modes ("energy", False) never touch ORT, so the resolver and bundled-dylib lookup are skipped entirely.

First-load cost

The first Microphone constructed with vad="silero" incurs roughly 100 to 500 milliseconds of cold load on most platforms. The cost is amortised across the rest of the process: subsequent Microphones (sync or async) reuse the same loaded ORT.

Process-global initialisation

The first Microphone that loads Silero determines the dylib for the whole process. Subsequent Microphone constructions inherit that initialisation regardless of their own ort_library_path argument. To switch dylibs, restart the process.