silero
Silero Voice Activity Detection (VAD) implementation for Pipecat.
This module provides a VAD analyzer based on the Silero VAD ONNX model, which can detect voice activity in audio streams with high accuracy. It supports 8kHz and 16kHz sample rates.
- class pipecat.audio.vad.silero.SileroOnnxModel(path, force_onnx_cpu=True)[source]
Bases: object
ONNX runtime wrapper for the Silero VAD model.
Provides voice activity detection using the pre-trained Silero VAD model with ONNX runtime for efficient inference. Handles model state management and input validation for audio processing.
- class pipecat.audio.vad.silero.SileroVADAnalyzer(*, sample_rate: int | None = None, params: VADParams | None = None)[source]
Bases: VADAnalyzer
Voice Activity Detection analyzer using the Silero VAD model.
Implements VAD analysis using the pre-trained Silero ONNX model for accurate voice activity detection. Supports 8kHz and 16kHz sample rates with automatic model state management and periodic resets.
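The "periodic resets" mentioned above can be pictured with a small standalone sketch. It does not reproduce Pipecat's internals: the class name `PeriodicStateReset`, the flat-list state, and the 5-second default interval are all assumptions chosen for illustration. The idea is that the model's recurrent state is zeroed once a fixed amount of audio has been processed.

```python
class PeriodicStateReset:
    """Hypothetical helper: zero a recurrent state after a fixed span of audio.

    This is an illustrative sketch, not Pipecat's implementation.
    """

    def __init__(self, sample_rate: int, reset_interval_secs: float = 5.0,
                 state_size: int = 4):
        self.sample_rate = sample_rate
        self.reset_interval_secs = reset_interval_secs  # assumed value
        self.state = [0.0] * state_size  # stand-in for the model's hidden state
        self._samples_since_reset = 0

    def reset(self) -> None:
        """Clear the state and restart the sample counter."""
        self.state = [0.0] * len(self.state)
        self._samples_since_reset = 0

    def advance(self, num_samples: int) -> bool:
        """Account for one processed frame; return True if a reset happened."""
        self._samples_since_reset += num_samples
        if self._samples_since_reset >= self.reset_interval_secs * self.sample_rate:
            self.reset()
            return True
        return False
```

Counting samples as integers rather than accumulating seconds keeps the reset point free of floating-point drift.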
- __init__(*, sample_rate: int | None = None, params: VADParams | None = None)[source]
Initialize the Silero VAD analyzer.
- Parameters:
sample_rate – Audio sample rate (8000 or 16000 Hz). If None, the rate is set later (see set_sample_rate).
params – VAD parameters for detection thresholds and timing.
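A minimal sketch of constructing the analyzer with the keyword-only arguments listed above. The import path for `VADParams` and its `stop_secs` field are assumptions not stated on this page, and `build_analyzer` is a hypothetical helper name.

```python
from typing import Optional


def build_analyzer(sample_rate: Optional[int] = None):
    """Hypothetical helper that creates a SileroVADAnalyzer.

    Imports are kept inside the function so this sketch can be loaded
    even where Pipecat (with its Silero dependencies) is not installed.
    """
    from pipecat.audio.vad.silero import SileroVADAnalyzer
    # Assumption: VADParams lives in this module and accepts `stop_secs`.
    from pipecat.audio.vad.vad_analyzer import VADParams

    params = VADParams(stop_secs=0.5)
    # Both arguments are keyword-only, per the signature above.
    return SileroVADAnalyzer(sample_rate=sample_rate, params=params)
```

Calling `build_analyzer()` leaves the sample rate unset, and a later `set_sample_rate(16000)` call on the returned object would supply it. Calling `build_analyzer(16000)` fixes the rate at construction time.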
- set_sample_rate(sample_rate: int)[source]
Set the sample rate for audio processing.
- Parameters:
sample_rate – Audio sample rate (must be 8000 or 16000 Hz).
- Raises:
ValueError – If sample rate is not 8000 or 16000 Hz.
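The documented contract (accept 8000 or 16000 Hz, raise ValueError otherwise) can be illustrated with a standalone check. This is a sketch of the behavior described above, not the library's code. `SUPPORTED_SAMPLE_RATES` and `validate_sample_rate` are hypothetical names.

```python
SUPPORTED_SAMPLE_RATES = (8000, 16000)  # rates listed in the documentation above


def validate_sample_rate(sample_rate: int) -> int:
    """Return the rate unchanged if supported, otherwise raise ValueError."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(
            f"Unsupported sample rate {sample_rate} Hz; "
            f"expected one of {SUPPORTED_SAMPLE_RATES}"
        )
    return sample_rate
```

Audio captured at another rate, such as 44100 Hz, would need resampling to a supported rate before being passed on.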