aic_filter

ai-coustics AIC SDK audio filter for Pipecat.

This module provides an audio filter implementation using ai-coustics’ AIC SDK to enhance audio streams in real time. It mirrors the structure of other filters like the Koala filter and integrates with Pipecat’s input transport pipeline.

Classes:

AICFilter – Audio filter for aic-sdk (uses the ‘aic_sdk’ module).
AICModelManager – Singleton manager for read-only AIC Model instances.

class pipecat.audio.filters.aic_filter.AICModelManager

Bases: object

Singleton manager for read-only AIC Model instances with reference counting.

Caches Model instances by path or (model_id + download_dir). Multiple AICFilter instances using the same model share one Model; the manager acquires on first use and releases when the last reference is dropped.

async classmethod acquire(*, model_path: Path | None = None, model_id: str | None = None, model_download_dir: Path | None = None) → tuple[Model, str]

Get or load a Model and increment its reference count.

Call this when starting a filter. Store the returned key and pass it to release() when stopping the filter.

Parameters:

model_path – Path to a local .aicmodel file. If set, model_id is ignored.
model_id – Model identifier to download from CDN.
model_download_dir – Directory for downloading models. Required if model_id is used.

Returns:

Tuple of (shared Model instance, cache key for release).

Raises:

ValueError – If neither model_path nor (model_id + model_download_dir) is provided, or if model_id is set without model_download_dir.

classmethod release(key: str) → None

Release a reference to a cached model.

Call this when stopping a filter, with the key returned from acquire(). When the last reference is released, the model is removed from the cache.

Parameters:

key – Cache key returned by acquire().
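The acquire/release contract described above can be illustrated with a small reference-counted cache. This is a minimal sketch, not the actual AICModelManager implementation: the class name RefCountedCache, the cache key format, and the placeholder object standing in for a loaded Model are all assumptions made for illustration.

```python
import asyncio
from pathlib import Path
from typing import Optional


class RefCountedCache:
    """Hypothetical stand-in: one shared object per key, with a use count."""

    _entries: dict = {}  # key -> [shared object, reference count]

    @classmethod
    async def acquire(
        cls,
        *,
        model_path: Optional[Path] = None,
        model_id: Optional[str] = None,
        model_download_dir: Optional[Path] = None,
    ):
        # A local path takes precedence; model_id is ignored when it is set.
        if model_path is not None:
            key = f"path:{model_path}"  # assumed key format
        elif model_id is not None and model_download_dir is not None:
            key = f"id:{model_id}:{model_download_dir}"  # assumed key format
        else:
            raise ValueError("provide model_path, or model_id with model_download_dir")

        entry = cls._entries.get(key)
        if entry is None:
            # Placeholder for loading or downloading the real model.
            entry = [object(), 0]
            cls._entries[key] = entry
        entry[1] += 1
        return entry[0], key

    @classmethod
    def release(cls, key: str) -> None:
        entry = cls._entries.get(key)
        if entry is None:
            return
        entry[1] -= 1
        if entry[1] <= 0:
            # Last reference dropped: remove the object from the cache.
            del cls._entries[key]


async def demo():
    first, key = await RefCountedCache.acquire(model_path=Path("model.aicmodel"))
    second, same_key = await RefCountedCache.acquire(model_path=Path("model.aicmodel"))
    return first is second, key == same_key
```

Two acquisitions with the same path share one object, and the entry disappears only after both keys have been released. A real manager would probably also need to guard loading against concurrent callers, which this sketch leaves out.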

class pipecat.audio.filters.aic_filter.AICFilter(*, license_key: str, model_id: str | None = None, model_path: Path | None = None, model_download_dir: Path | None = None, enhancement_level: float | None = None)

Bases: BaseAudioFilter

Audio filter using ai-coustics’ AIC SDK for real-time enhancement.

Buffers incoming audio to the model’s preferred block size and processes frames using float32 samples normalized to the range -1 to +1.

__init__(*, license_key: str, model_id: str | None = None, model_path: Path | None = None, model_download_dir: Path | None = None, enhancement_level: float | None = None) → None

Initialize the AIC filter.

Parameters:

license_key – ai-coustics license key for authentication.
model_id – Model identifier to download from CDN. Required if model_path is not provided. See https://artifacts.ai-coustics.io/ for available models.
model_path – Optional path to a local .aicmodel file. If provided, model_id is ignored and no download occurs.
model_download_dir – Directory for downloading models as a Path object. Defaults to a cache directory in user’s home folder.
enhancement_level – Optional overall enhancement strength (0.0..1.0). If None, the model default is used.

Raises:

ValueError – If neither model_id nor model_path is provided, or if enhancement_level is out of range.
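As an illustration of the constructor arguments, the following sketch builds a filter from environment settings. The environment variable names, the download directory, and the enhancement level are assumptions chosen for the example; the model identifier is deliberately read from the environment because valid identifiers must come from the ai-coustics artifacts page.

```python
import os
from pathlib import Path


def build_aic_filter():
    """Construct an AICFilter from environment settings (illustrative only)."""
    # Imported lazily so this sketch can be loaded without pipecat installed.
    from pipecat.audio.filters.aic_filter import AICFilter

    return AICFilter(
        license_key=os.environ["AIC_LICENSE_KEY"],  # hypothetical variable name
        model_id=os.environ["AIC_MODEL_ID"],  # hypothetical variable name
        model_download_dir=Path.home() / ".cache" / "aic-models",  # assumed location
        enhancement_level=0.8,  # must lie within 0.0..1.0; None uses the model default
    )
```

To use a local model instead, pass model_path and omit model_id; per the parameter notes above, no download occurs in that case.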

get_vad_context()

Return the VAD context once the processor exists.

Returns:

The VadContext instance bound to the underlying processor.

Raises:

RuntimeError – If the processor has not been initialized.

create_vad_analyzer(*, speech_hold_duration: float | None = None, minimum_speech_duration: float | None = None, sensitivity: float | None = None)

Return an analyzer that will lazily instantiate the AIC VAD when ready.

AIC VAD parameters:

speech_hold_duration:
How long the VAD keeps reporting speech after speech ends (in seconds). Range: 0.0 to 100 times the model window length. Default (SDK): 0.05s
minimum_speech_duration:
Minimum duration of speech required before VAD reports speech detected (in seconds). Range: 0.0 to 1.0, Default (SDK): 0.0s
sensitivity:
Energy threshold sensitivity. Energy threshold = 10 ** (-sensitivity). Range: 1.0 to 15.0, Default (SDK): 6.0

Parameters:

speech_hold_duration – Optional speech hold duration to configure on the VAD. If None, SDK default (0.05s) is used.
minimum_speech_duration – Optional minimum speech duration before VAD reports speech detected. If None, SDK default (0.0s) is used.
sensitivity – Optional sensitivity (energy threshold) to configure on the VAD. Range: 1.0 to 15.0. If None, SDK default (6.0) is used.

Returns:

A lazily-initialized AICVADAnalyzer that will bind to the VAD context once the filter’s processor has been created (after start(sample_rate)).
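The sensitivity parameter maps to an energy threshold through the formula given above, threshold = 10 ** (-sensitivity). The short sketch below works through that arithmetic; the helper name energy_threshold is hypothetical and is not part of the SDK or Pipecat.

```python
def energy_threshold(sensitivity: float) -> float:
    """Convert a VAD sensitivity value to its energy threshold."""
    # The documented range for sensitivity is 1.0 to 15.0.
    if not 1.0 <= sensitivity <= 15.0:
        raise ValueError("sensitivity must be within 1.0 to 15.0")
    return 10.0 ** (-sensitivity)


if __name__ == "__main__":
    for value in (1.0, 6.0, 15.0):
        print(f"sensitivity={value}: threshold={energy_threshold(value):.0e}")
```

With the SDK default of 6.0 the threshold is 10 ** -6. Raising the sensitivity lowers the threshold, so quieter input counts as speech.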

async start(sample_rate: int)

Initialize the filter with the transport’s sample rate.

Parameters:

sample_rate – The sample rate of the input transport in Hz.

Returns:

None

async stop()

Clean up the AIC processor when stopping.

Returns:

None
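The start/filter/stop lifecycle can be sketched as follows. In a pipeline the input transport presumably drives these calls itself, so this is only an illustration of the call order. PassthroughFilter is a hypothetical stand-in with the same method shape, used so the example runs without the AIC SDK; enhance_once is likewise a hypothetical helper.

```python
import asyncio


class PassthroughFilter:
    """Stand-in with the same start/filter/stop shape; it does no enhancement."""

    def __init__(self) -> None:
        self.running = False

    async def start(self, sample_rate: int) -> None:
        self.running = True

    async def stop(self) -> None:
        self.running = False

    async def filter(self, audio: bytes) -> bytes:
        return audio


async def enhance_once(audio_filter, pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Start the filter, process one buffer, and always stop it afterwards."""
    await audio_filter.start(sample_rate)
    try:
        return await audio_filter.filter(pcm)
    finally:
        # stop() runs even if filter() raises, so resources are released.
        await audio_filter.stop()


if __name__ == "__main__":
    stub = PassthroughFilter()
    result = asyncio.run(enhance_once(stub, b"\x00\x00" * 160))
    print(len(result))
```

Wrapping the call in try/finally keeps stop() paired with start(), which matters for a filter that holds a shared model reference.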

async process_frame(frame: FilterControlFrame)

Process control frames to enable/disable filtering.

Parameters:

frame – The control frame containing filter commands.

Returns:

None

async filter(audio: bytes) → bytes

Apply AIC enhancement to audio data.

Buffers incoming audio and processes it in chunks that match the AIC model’s required block length. Returns enhanced audio data.

Parameters:

audio – Raw audio data as bytes (int16 PCM).

Returns:

Enhanced audio data as bytes (int16 PCM).
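The buffering and sample conversion described above can be sketched with the standard library alone. This is an assumption-laden illustration, not the filter's actual code: the BlockBuffer class is hypothetical, audio is assumed to be mono int16 in native byte order, and samples that do not fill a whole block are assumed to be held until the next call.

```python
from array import array


class BlockBuffer:
    """Hypothetical helper: accumulate int16 PCM and emit whole blocks only."""

    def __init__(self, block_samples: int, process) -> None:
        self._block_bytes = block_samples * 2  # 2 bytes per int16 mono sample
        self._process = process  # callable: list of floats -> list of floats
        self._pending = bytearray()

    def filter(self, audio: bytes) -> bytes:
        self._pending.extend(audio)
        out = bytearray()
        while len(self._pending) >= self._block_bytes:
            chunk = bytes(self._pending[: self._block_bytes])
            del self._pending[: self._block_bytes]

            samples = array("h")
            samples.frombytes(chunk)
            # Normalize int16 to floats in the range -1 to +1.
            floats = [s / 32768.0 for s in samples]
            enhanced = self._process(floats)
            # Scale back to int16, clamping to the valid range.
            ints = array(
                "h",
                (max(-32768, min(32767, round(x * 32768.0))) for x in enhanced),
            )
            out.extend(ints.tobytes())
        return bytes(out)


if __name__ == "__main__":
    buffer = BlockBuffer(block_samples=4, process=lambda block: block)
    pcm = array("h", [0, 1000, -1000, 32767, -32768, 5]).tobytes()
    print(len(buffer.filter(pcm)))  # only one full block of 4 samples is emitted
```

With an identity process function the emitted bytes equal the input bytes, because dividing and multiplying by 32768 is exact for int16 values. A call may return fewer bytes than it received, or none, while samples wait for a full block.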