krisp_viva_vad
Krisp Voice Activity Detection (VAD) implementation for Pipecat.
This module provides a VAD analyzer based on the Krisp VIVA SDK, which can detect voice activity in audio streams with high accuracy. Supports 8kHz, 16kHz, 32kHz, 44.1kHz and 48kHz sample rates.
- class pipecat.audio.vad.krisp_viva_vad.KrispVivaVadAnalyzer(*, model_path: str | None = None, frame_duration: int = 10, sample_rate: int | None = None, params: VADParams | None = None)
Bases: VADAnalyzer
Voice Activity Detection analyzer using the Krisp VIVA SDK.
- __init__(*, model_path: str | None = None, frame_duration: int = 10, sample_rate: int | None = None, params: VADParams | None = None)
Initialize the Krisp VIVA VAD analyzer.
- Parameters:
model_path – Path to the Krisp model file (.kef extension). If None, uses KRISP_VIVA_VAD_MODEL_PATH environment variable.
frame_duration – Frame duration in milliseconds (default: 10ms).
sample_rate – Audio sample rate (must be 8000, 16000, 32000, 44100 or 48000 Hz). If None, the rate can be set later with set_sample_rate().
params – VAD parameters for detection configuration.
- Raises:
ValueError – If model_path is not provided and KRISP_VIVA_VAD_MODEL_PATH is not set.
Exception – If model file doesn’t have .kef extension.
FileNotFoundError – If model file doesn’t exist.
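The model path checks described above can be sketched as follows. This is a minimal illustration, not the library's code: the helper name resolve_model_path is hypothetical, and the order of the checks is an assumption based only on the documented exceptions.

```python
import os
from pathlib import Path
from typing import Optional

ENV_VAR = "KRISP_VIVA_VAD_MODEL_PATH"  # environment variable named in the docs


def resolve_model_path(model_path: Optional[str] = None) -> str:
    """Hypothetical helper that mirrors the documented checks (assumed order)."""
    # Fall back to the environment variable when no explicit path is given.
    path = model_path or os.environ.get(ENV_VAR)
    if not path:
        raise ValueError(f"model_path was not provided and {ENV_VAR} is not set")
    # The docs list a plain Exception for a wrong file extension.
    if Path(path).suffix != ".kef":
        raise Exception(f"Model file must have a .kef extension: {path}")
    if not Path(path).is_file():
        raise FileNotFoundError(f"Model file not found: {path}")
    return path
```

Callers who want to fail early, before any SDK session is created, could run a check like this on their configuration at startup.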
- set_sample_rate(sample_rate: int)
Set the sample rate for audio processing.
- Parameters:
sample_rate – Audio sample rate (must be 8000, 16000, 32000 or 48000 Hz).
- Raises:
ValueError – If sample rate is not 8000, 16000, 32000 or 48000 Hz.
RuntimeError – If VAD session creation fails.
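A caller-side guard for the ValueError case might look like the sketch below. The function name check_sample_rate and the constant are hypothetical, and the set of accepted rates is copied from the set_sample_rate description above rather than from the SDK itself.

```python
# Rates listed for set_sample_rate in this reference (assumption: the list is exhaustive).
SET_SAMPLE_RATE_VALUES = (8000, 16000, 32000, 48000)


def check_sample_rate(sample_rate: int) -> int:
    """Hypothetical pre-check that raises ValueError for an unlisted rate."""
    if sample_rate not in SET_SAMPLE_RATE_VALUES:
        allowed = ", ".join(str(r) for r in SET_SAMPLE_RATE_VALUES)
        raise ValueError(f"Unsupported sample rate {sample_rate}; expected one of: {allowed}")
    return sample_rate
```

Validating the rate before calling set_sample_rate keeps configuration errors separate from the RuntimeError raised when session creation fails.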
- num_frames_required() → int
Get the number of audio frames required for analysis.
- Returns:
Number of frames (samples) needed for VAD processing based on current sample rate and frame duration.
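As a rough illustration of how the return value relates to sample rate and frame duration, the sketch below assumes the count is simply the number of samples in one frame. The formula and the function name samples_per_frame are assumptions, not taken from the library source.

```python
def samples_per_frame(sample_rate: int, frame_duration_ms: int = 10) -> int:
    """Assumed relationship: samples = sample_rate (Hz) * frame duration (ms) / 1000."""
    return sample_rate * frame_duration_ms // 1000


# With the default 10 ms frame duration:
print(samples_per_frame(16000))  # 160
print(samples_per_frame(48000))  # 480
```

Under this assumption, audio buffers passed to the analyzer would be sized in multiples of this value.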