base_smart_turn
Smart turn analyzer base class using ML models for end-of-turn detection.
This module provides the base implementation for smart turn analyzers that use machine learning models to determine when a user has finished speaking, going beyond simple silence-based detection.
- class pipecat.audio.turn.smart_turn.base_smart_turn.SmartTurnParams(*, stop_secs: float = 3, pre_speech_ms: float = 500, max_duration_secs: float = 8)
Bases: BaseTurnParams
Configuration parameters for smart turn analysis.
- Parameters:
stop_secs – Maximum silence duration in seconds before ending the turn.
pre_speech_ms – Milliseconds of audio to include before speech starts.
max_duration_secs – Maximum duration in seconds for audio segments.
- stop_secs: float
- pre_speech_ms: float
- max_duration_secs: float
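Note that the three parameters do not share a unit: pre_speech_ms is in milliseconds, while stop_secs and max_duration_secs are in seconds. The following is a minimal sketch of converting them to sample counts. `SmartTurnParamsSketch` is a hypothetical stand-in that only mirrors the documented defaults (it is not the pipecat class), and the 16 kHz sample rate and the `samples_for` helper are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmartTurnParamsSketch:
    """Hypothetical stand-in mirroring the documented defaults."""

    stop_secs: float = 3
    pre_speech_ms: float = 500
    max_duration_secs: float = 8


def samples_for(duration_secs: float, sample_rate: int) -> int:
    """Convert a duration in seconds to a whole number of samples."""
    return int(duration_secs * sample_rate)


params = SmartTurnParamsSketch()
sample_rate = 16000  # assumed sample rate, for illustration only

# pre_speech_ms is in milliseconds, so convert to seconds first.
pre_speech_samples = samples_for(params.pre_speech_ms / 1000, sample_rate)
# max_duration_secs is already in seconds.
max_segment_samples = samples_for(params.max_duration_secs, sample_rate)
```

With the defaults above, half a second of pre-speech audio and eight seconds of segment audio are the quantities being converted; how the library itself applies them is not shown here.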
- exception pipecat.audio.turn.smart_turn.base_smart_turn.SmartTurnTimeoutException
Bases: Exception
Exception raised when smart turn analysis times out.
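The page does not say where or how this exception is raised. As a rough sketch of the general pattern, a slow prediction can be bounded with `asyncio.wait_for` and the resulting timeout translated into a domain-specific exception. `SmartTurnTimeoutSketch`, `slow_prediction`, and `predict_with_timeout` are hypothetical names used only for this illustration and are not part of pipecat.

```python
import asyncio


class SmartTurnTimeoutSketch(Exception):
    """Hypothetical stand-in for the documented timeout exception."""


async def slow_prediction() -> bool:
    # Placeholder for a model call that takes longer than allowed.
    await asyncio.sleep(0.2)
    return True


async def predict_with_timeout(timeout_secs: float) -> bool:
    # Bound the prediction and re-raise a timeout as the stand-in exception.
    try:
        return await asyncio.wait_for(slow_prediction(), timeout=timeout_secs)
    except asyncio.TimeoutError as exc:
        raise SmartTurnTimeoutSketch(
            f"prediction exceeded {timeout_secs}s"
        ) from exc


async def main() -> str:
    try:
        await predict_with_timeout(0.01)
        return "completed"
    except SmartTurnTimeoutSketch:
        return "timed out"


outcome = asyncio.run(main())
```

A caller would typically decide in the `except` branch whether to treat the turn as still in progress or to fall back to silence-based detection; that choice is left open here.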
- class pipecat.audio.turn.smart_turn.base_smart_turn.BaseSmartTurn(*, sample_rate: int | None = None, params: SmartTurnParams | None = None)
Bases: BaseTurnAnalyzer
Base class for smart turn analyzers using ML models.
Provides common functionality for smart turn detection including audio buffering, speech tracking, and ML model integration. Subclasses must implement the specific model prediction logic.
- __init__(*, sample_rate: int | None = None, params: SmartTurnParams | None = None)
Initialize the smart turn analyzer.
- Parameters:
sample_rate – Optional sample rate for audio processing.
params – Configuration parameters for turn analysis behavior.
- property speech_triggered: bool
Check if speech has been detected and triggered analysis.
- Returns:
True if speech has been detected and turn analysis is active.
- property params: SmartTurnParams
Get the current smart turn parameters.
- Returns:
Current smart turn configuration parameters.
- append_audio(buffer: bytes, is_speech: bool) -> EndOfTurnState
Append audio data for turn analysis.
- Parameters:
buffer – Raw audio data bytes to append for analysis.
is_speech – Whether the audio buffer contains detected speech.
- Returns:
Current end-of-turn state after processing the audio.
- async analyze_end_of_turn() -> tuple[EndOfTurnState, MetricsData | None]
Analyze the current audio state to determine whether the turn has ended.
- Returns:
Tuple containing the end-of-turn state and optional metrics data from the ML model analysis.
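To make the method shapes above concrete, here is a toy analyzer that follows the same call pattern: synchronous `append_audio` per chunk, plus an awaitable `analyze_end_of_turn` that returns a state and optional metrics. It is a sketch under stated assumptions, not the library's implementation. The `EndOfTurnState` enum is a local stand-in with assumed member names, `ToyTurnAnalyzer` is hypothetical, audio is assumed to be 16-bit mono PCM, and the model step is a placeholder.

```python
import asyncio
from enum import Enum
from typing import Optional, Tuple


class EndOfTurnState(Enum):
    # Local stand-in for the library's enum; member names are assumed.
    COMPLETE = 1
    INCOMPLETE = 2


class ToyTurnAnalyzer:
    """Hypothetical analyzer that only mimics the documented method shapes."""

    def __init__(self, *, sample_rate: int = 16000, stop_secs: float = 3.0):
        self._sample_rate = sample_rate
        self._stop_secs = stop_secs
        self._audio = bytearray()
        self._speech_triggered = False
        self._silence_secs = 0.0

    @property
    def speech_triggered(self) -> bool:
        return self._speech_triggered

    def append_audio(self, buffer: bytes, is_speech: bool) -> EndOfTurnState:
        # Assumes 16-bit mono PCM: two bytes per sample.
        chunk_secs = len(buffer) / 2 / self._sample_rate
        if is_speech:
            self._speech_triggered = True
            self._silence_secs = 0.0
        elif self._speech_triggered:
            self._silence_secs += chunk_secs
        if self._speech_triggered:
            self._audio.extend(buffer)
            # Silence fallback: end the turn once stop_secs of silence pass.
            if self._silence_secs >= self._stop_secs:
                self._reset()
                return EndOfTurnState.COMPLETE
        return EndOfTurnState.INCOMPLETE

    async def analyze_end_of_turn(self) -> Tuple[EndOfTurnState, Optional[dict]]:
        # A real subclass would run its model here; this placeholder
        # never predicts completion and returns no metrics.
        return EndOfTurnState.INCOMPLETE, None

    def _reset(self) -> None:
        self._audio.clear()
        self._speech_triggered = False
        self._silence_secs = 0.0


analyzer = ToyTurnAnalyzer(sample_rate=16000, stop_secs=1.0)
half_second = bytes(16000)  # 8000 zeroed 16-bit samples = 0.5 s at 16 kHz
states = [
    analyzer.append_audio(half_second, is_speech=True),
    analyzer.append_audio(half_second, is_speech=False),
    analyzer.append_audio(half_second, is_speech=False),
]
result = asyncio.run(analyzer.analyze_end_of_turn())
```

In this toy version the third chunk completes the turn because one full second of silence has accumulated after speech; a model-backed subclass would instead make the decision inside `analyze_end_of_turn`.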