aic_filter
ai-coustics AIC SDK audio filter for Pipecat.
This module provides an audio filter implementation using ai-coustics’ AIC SDK to enhance audio streams in real time. It mirrors the structure of other filters like the Koala filter and integrates with Pipecat’s input transport pipeline.
- Classes:
AICFilter: For aic-sdk (uses ‘aic_sdk’ module)
AICModelManager: Singleton manager for read-only AIC Model instances.
- class pipecat.audio.filters.aic_filter.AICModelManager[source]
Bases:
object
Singleton manager for read-only AIC Model instances with reference counting.
Caches Model instances by path or by (model_id + download_dir). Multiple AICFilter instances that use the same model share one Model; the manager loads the Model on first use and releases it when the last reference is dropped.
- async classmethod acquire(*, model_path: Path | None = None, model_id: str | None = None, model_download_dir: Path | None = None) -> tuple[Model, str][source]
Get or load a Model and increment its reference count.
Call this when starting a filter. Store the returned key and pass it to release() when stopping the filter.
- Parameters:
model_path – Path to a local .aicmodel file. If set, model_id is ignored.
model_id – Model identifier to download from CDN.
model_download_dir – Directory for downloading models. Required if model_id is used.
- Returns:
Tuple of (shared Model instance, cache key for release).
- Raises:
ValueError – If neither model_path nor (model_id + model_download_dir) is provided, or if model_id is set without model_download_dir.
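The acquire/release contract described above can be illustrated with a minimal reference-counting sketch. This is not the AICModelManager implementation: RefCountedCache, cache_key, the key format, and the loader callable are all hypothetical names introduced for illustration, and the sketch omits locking because its loader never awaits.

```python
import asyncio
from pathlib import Path


def cache_key(model_path=None, model_id=None, model_download_dir=None) -> str:
    """Build a cache key (hypothetical format) following the documented rules."""
    if model_path is not None:
        return f"path:{Path(model_path)}"          # model_id is ignored if a path is set
    if model_id is not None and model_download_dir is not None:
        return f"id:{model_id}@{Path(model_download_dir)}"
    raise ValueError("provide model_path, or model_id together with model_download_dir")


class RefCountedCache:
    """Hypothetical stand-in for a manager that shares loaded objects by key."""

    def __init__(self, loader):
        self._loader = loader      # callable that loads an object for a key
        self._entries = {}         # key -> [object, reference count]

    async def acquire(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            entry = [self._loader(key), 0]         # load once, on first use
            self._entries[key] = entry
        entry[1] += 1
        return entry[0], key

    async def release(self, key: str) -> None:
        entry = self._entries.get(key)
        if entry is None:
            return
        entry[1] -= 1
        if entry[1] <= 0:
            del self._entries[key]                 # last reference dropped


async def demo() -> int:
    loads = []

    def loader(key):
        loads.append(key)
        return object()                            # placeholder for a loaded model

    cache = RefCountedCache(loader)
    key = cache_key(model_path=Path("model.aicmodel"))
    first, _ = await cache.acquire(key)
    second, _ = await cache.acquire(key)
    assert first is second                         # both callers share one instance
    await cache.release(key)
    await cache.release(key)
    return len(loads)                              # number of times the loader ran
```

In this sketch two acquisitions of the same key trigger a single load, which mirrors the sharing behavior the manager documents for filters that use the same model.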
- class pipecat.audio.filters.aic_filter.AICFilter(*, license_key: str, model_id: str | None = None, model_path: Path | None = None, model_download_dir: Path | None = None, enhancement_level: float | None = None)[source]
Bases:
BaseAudioFilter
Audio filter using ai-coustics’ AIC SDK for real-time enhancement.
Buffers incoming audio to the model’s preferred block size and processes frames using float32 samples normalized to the range -1 to +1.
- __init__(*, license_key: str, model_id: str | None = None, model_path: Path | None = None, model_download_dir: Path | None = None, enhancement_level: float | None = None) -> None[source]
Initialize the AIC filter.
- Parameters:
license_key – ai-coustics license key for authentication.
model_id – Model identifier to download from CDN. Required if model_path is not provided. See https://artifacts.ai-coustics.io/ for available models.
model_path – Optional path to a local .aicmodel file. If provided, model_id is ignored and no download occurs.
model_download_dir – Directory for downloaded models, as a Path object. Defaults to a cache directory in the user’s home folder.
enhancement_level – Optional overall enhancement strength (0.0..1.0). If None, the model default is used.
- Raises:
ValueError – If neither model_id nor model_path is provided, or if enhancement_level is out of range.
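The constructor rules above (a model source is required, model_path takes precedence over model_id, and enhancement_level must lie in 0.0..1.0) can be summarized in a small validation sketch. The function check_filter_args and the DEFAULT_DOWNLOAD_DIR value are hypothetical; the text does not name the actual default cache directory, and this is not the filter's own code.

```python
from pathlib import Path
from typing import Optional

# Hypothetical default; the real cache directory name is not given in the text.
DEFAULT_DOWNLOAD_DIR = Path.home() / ".cache" / "example-aic-models"


def check_filter_args(
    *,
    model_id: Optional[str] = None,
    model_path: Optional[Path] = None,
    model_download_dir: Optional[Path] = None,
    enhancement_level: Optional[float] = None,
) -> dict:
    """Validate arguments the way the parameter descriptions suggest."""
    if model_id is None and model_path is None:
        raise ValueError("either model_id or model_path is required")
    if enhancement_level is not None and not 0.0 <= enhancement_level <= 1.0:
        raise ValueError("enhancement_level must be within 0.0..1.0")
    if model_path is not None:
        # A local file wins: model_id is ignored and nothing is downloaded.
        return {
            "source": "local",
            "model_path": Path(model_path),
            "enhancement_level": enhancement_level,
        }
    return {
        "source": "download",
        "model_id": model_id,
        "download_dir": Path(model_download_dir or DEFAULT_DOWNLOAD_DIR),
        "enhancement_level": enhancement_level,   # None means "use the model default"
    }
```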
- get_vad_context()[source]
Return the VAD context once the processor exists.
- Returns:
The VadContext instance bound to the underlying processor.
- Raises:
RuntimeError – If the processor has not been initialized.
- create_vad_analyzer(*, speech_hold_duration: float | None = None, minimum_speech_duration: float | None = None, sensitivity: float | None = None)[source]
Return an analyzer that will lazily instantiate the AIC VAD when ready.
- AIC VAD parameters:
- speech_hold_duration:
How long the VAD continues to report speech after speech ends (in seconds). Range: 0.0 to 100x the model window length. Default (SDK): 0.05s
- minimum_speech_duration:
Minimum duration of speech required before VAD reports speech detected (in seconds). Range: 0.0 to 1.0, Default (SDK): 0.0s
- sensitivity:
Energy threshold sensitivity. Energy threshold = 10 ** (-sensitivity). Range: 1.0 to 15.0, Default (SDK): 6.0
- Parameters:
speech_hold_duration – Optional speech hold duration to configure on the VAD. If None, SDK default (0.05s) is used.
minimum_speech_duration – Optional minimum speech duration before VAD reports speech detected. If None, SDK default (0.0s) is used.
sensitivity – Optional sensitivity (energy threshold) to configure on the VAD. Range: 1.0 to 15.0. If None, SDK default (6.0) is used.
- Returns:
A lazily-initialized AICVADAnalyzer that will bind to the VAD context once the filter’s processor has been created (after start(sample_rate)).
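The sensitivity formula given above, energy threshold = 10 ** (-sensitivity), can be worked through directly. The helper energy_threshold below is a hypothetical name used only to show the arithmetic and the documented range check; it is not part of the SDK.

```python
def energy_threshold(sensitivity: float = 6.0) -> float:
    """Convert a VAD sensitivity into the energy threshold 10 ** (-sensitivity)."""
    if not 1.0 <= sensitivity <= 15.0:
        raise ValueError("sensitivity must be within 1.0..15.0")
    return 10.0 ** (-sensitivity)


if __name__ == "__main__":
    # Higher sensitivity means a lower energy threshold, so quieter
    # signals can cross it.
    for value in (1.0, 6.0, 15.0):
        print(f"sensitivity={value:>4} -> threshold={energy_threshold(value):.1e}")
```

Because the mapping is exponential, raising sensitivity by 1.0 lowers the threshold by a factor of ten.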
- async start(sample_rate: int)[source]
Initialize the filter with the transport’s sample rate.
- Parameters:
sample_rate – The sample rate of the input transport in Hz.
- Returns:
None
- async process_frame(frame: FilterControlFrame)[source]
Process control frames to enable/disable filtering.
- Parameters:
frame – The control frame containing filter commands.
- Returns:
None
- async filter(audio: bytes) -> bytes[source]
Apply AIC enhancement to audio data.
Buffers incoming audio and processes it in chunks that match the AIC model’s required block length. Returns enhanced audio data.
- Parameters:
audio – Raw audio data as bytes (int16 PCM).
- Returns:
Enhanced audio data as bytes (int16 PCM).
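The buffering and sample conversion described for filter() can be sketched with the standard library alone. Several assumptions apply: BlockBuffer, pcm16_to_float, float_to_pcm16, and filter_chunk are hypothetical helpers, the audio is treated as mono little-endian int16, Python floats stand in for float32, the enhance callable is a placeholder for the model, and returning empty bytes while a block is still filling is a guess rather than documented behavior.

```python
import sys
from array import array


class BlockBuffer:
    """Accumulate int16 PCM bytes and hand back complete fixed-size blocks."""

    def __init__(self, block_samples: int):
        self._block_bytes = block_samples * 2      # 2 bytes per int16 sample
        self._pending = bytearray()

    def push(self, audio: bytes) -> list:
        self._pending.extend(audio)
        blocks = []
        while len(self._pending) >= self._block_bytes:
            blocks.append(bytes(self._pending[: self._block_bytes]))
            del self._pending[: self._block_bytes]
        return blocks                              # leftover bytes stay buffered


def pcm16_to_float(block: bytes) -> list:
    """Decode little-endian int16 PCM into floats in the range -1 to +1."""
    samples = array("h")
    samples.frombytes(block)
    if sys.byteorder == "big":
        samples.byteswap()
    return [s / 32768.0 for s in samples]


def float_to_pcm16(samples) -> bytes:
    """Encode floats back to little-endian int16 PCM, clamping out-of-range values."""
    out = array("h", (max(-32768, min(32767, int(round(x * 32768.0)))) for x in samples))
    if sys.byteorder == "big":
        out.byteswap()
    return out.tobytes()


def identity(samples):
    return samples                                 # placeholder for the enhancement model


def filter_chunk(buffer: BlockBuffer, audio: bytes, enhance=identity) -> bytes:
    """Process every complete block currently available; may return b""."""
    out = bytearray()
    for block in buffer.push(audio):
        out += float_to_pcm16(enhance(pcm16_to_float(block)))
    return bytes(out)
```

With a block size of 4 samples, a 3-sample chunk yields no output in this sketch, and the next chunk releases the first full block while keeping any remainder buffered.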