vad_processor
Voice Activity Detection processor for detecting speech in audio streams.
This module provides a VADProcessor that wraps a VADController to process audio frames and push VAD-related frames into the pipeline.
- class pipecat.processors.audio.vad_processor.VADProcessor(*, vad_analyzer: VADAnalyzer, speech_activity_period: float = 0.2, audio_idle_timeout: float = 1.0, **kwargs)
Bases: FrameProcessor
Processes audio frames through voice activity detection.
This processor wraps a VADController to detect speech in audio streams and push VAD frames into the pipeline:
- VADUserStartedSpeakingFrame: Pushed when speech begins.
- VADUserStoppedSpeakingFrame: Pushed when speech ends.
- UserSpeakingFrame: Pushed periodically while speech is detected.
Example:
vad_processor = VADProcessor(vad_analyzer=SileroVADAnalyzer())
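The relationship between the three frame types can be illustrated with a small toy model. This is only a sketch and not pipecat's implementation: the function name `vad_events`, the `(timestamp, is_speech)` input format, and the choice to emit a UserSpeakingFrame immediately when speech starts are all assumptions made for illustration. The real VADAnalyzer decides speech from audio, which is reduced here to a precomputed boolean.

```python
from typing import Iterable, List, Tuple


def vad_events(
    samples: Iterable[Tuple[float, bool]],
    speech_activity_period: float = 0.2,
) -> List[Tuple[float, str]]:
    """Toy model: map (timestamp, is_speech) decisions to VAD frame names.

    Hypothetical helper for illustration only; not part of pipecat.
    """
    events: List[Tuple[float, str]] = []
    speaking = False
    last_activity = None  # time of the last UserSpeakingFrame in this utterance

    for now, is_speech in samples:
        if is_speech and not speaking:
            # Transition quiet -> speaking.
            speaking = True
            last_activity = None
            events.append((now, "VADUserStartedSpeakingFrame"))
        elif not is_speech and speaking:
            # Transition speaking -> quiet.
            speaking = False
            events.append((now, "VADUserStoppedSpeakingFrame"))

        # While speaking, emit activity frames no more often than the period.
        # Assumption: the first one is emitted right when speech starts.
        if speaking and (
            last_activity is None or now - last_activity >= speech_activity_period
        ):
            last_activity = now
            events.append((now, "UserSpeakingFrame"))

    return events
```

With a 0.5 second period and speech detected at 0.25, 0.5, and 0.75 seconds, this model emits one start event, two activity events (the sample at 0.5 is throttled), and one stop event.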
- __init__(*, vad_analyzer: VADAnalyzer, speech_activity_period: float = 0.2, audio_idle_timeout: float = 1.0, **kwargs)
Initialize the VAD processor.
- Parameters:
vad_analyzer – The VADAnalyzer instance for processing audio.
speech_activity_period – Minimum interval in seconds between UserSpeakingFrame pushes. Defaults to 0.2.
audio_idle_timeout – Timeout in seconds after which a speech stop is forced if no audio frames are received while in the SPEAKING state. Set to 0 to disable. Defaults to 1.0.
**kwargs – Additional arguments passed to parent class.
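The documented semantics of audio_idle_timeout (force a stop when audio goes missing while speaking, with 0 disabling the check) can be sketched with explicit timestamps. The class `IdleStopWatchdog` and its methods are hypothetical names invented for this example; how pipecat actually schedules the check internally is not described here and is not assumed.

```python
from typing import Optional


class IdleStopWatchdog:
    """Hypothetical helper: decides when missing audio should end speech.

    Illustrates the documented audio_idle_timeout behavior only; it is
    not pipecat's implementation.
    """

    def __init__(self, audio_idle_timeout: float = 1.0) -> None:
        self.audio_idle_timeout = audio_idle_timeout
        self.speaking = False
        self._last_audio: Optional[float] = None

    def on_audio(self, now: float, is_speech: bool) -> None:
        # Record the arrival time of every audio frame and the current state.
        self._last_audio = now
        self.speaking = is_speech

    def should_force_stop(self, now: float) -> bool:
        # A timeout of 0 disables the forced stop entirely.
        if self.audio_idle_timeout <= 0:
            return False
        # Only relevant while speech is considered in progress.
        if not self.speaking or self._last_audio is None:
            return False
        return now - self._last_audio >= self.audio_idle_timeout
```

Driving the check from explicit timestamps rather than a live timer keeps the sketch deterministic. A real processor would more likely run a timer or background task alongside frame processing.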
- async process_frame(frame: Frame, direction: FrameDirection)
Process a frame through VAD and forward it.
- Parameters:
frame – The frame to process.
direction – The direction of frame flow in the pipeline.