user_stop
- class pipecat.turns.user_stop.BaseUserTurnStopStrategy(*, enable_user_speaking_frames: bool = True, **kwargs)[source]
Bases: BaseObject
Base class for strategies that determine when the user stops speaking.
Subclasses should implement logic to detect when the user stops speaking. This could be based on analyzing incoming frames (such as transcriptions), conversation state, or other heuristics.
Events triggered by strategies:
on_push_frame: Indicates the strategy wants to push a frame.
on_user_turn_stopped: Signals that the user stopped speaking.
- __init__(*, enable_user_speaking_frames: bool = True, **kwargs)[source]
Initialize the base user turn stop strategy.
- Parameters:
enable_user_speaking_frames – If True, the aggregator will emit frames indicating when the user stops speaking. This is enabled by default, but you may want to disable it if another component (e.g., an STT service) is already generating these frames.
**kwargs – Additional keyword arguments.
- property task_manager: BaseTaskManager
Returns the configured task manager.
- async setup(task_manager: BaseTaskManager)[source]
Initialize the strategy with the given task manager.
- Parameters:
task_manager – The task manager to be associated with this instance.
- async process_frame(frame: Frame) → ProcessFrameResult | None[source]
Process an incoming frame to decide whether the user stopped speaking.
Subclasses should override this to implement logic that decides whether the user has stopped speaking.
- Parameters:
frame – The frame to be analyzed.
- Returns:
A ProcessFrameResult indicating the outcome, or None (treated as CONTINUE for backward compatibility).
- async push_frame(frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM)[source]
Emit on_push_frame to push a frame using the user aggregator.
- Parameters:
frame – The frame to be pushed.
direction – The direction in which the frame should be pushed.
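The subclassing pattern described for this base class can be sketched as follows. This is a minimal illustration, not pipecat code: StandInBaseStrategy, TranscriptionFrame, ProcessFrameResult, KeywordStopStrategy, and normalize are hypothetical stand-ins defined here so the example runs without pipecat installed. The no-argument call to trigger_user_turn_stopped() is an assumption; only the method name appears in this reference.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum


class ProcessFrameResult(Enum):
    # Stand-in result type; CONTINUE is the only member named in this reference.
    CONTINUE = "continue"


@dataclass
class TranscriptionFrame:
    # Stand-in frame carrying only the text this sketch needs.
    text: str


class StandInBaseStrategy:
    """Hypothetical stand-in for BaseUserTurnStopStrategy (not the real class)."""

    def __init__(self, *, enable_user_speaking_frames: bool = True, **kwargs):
        self.enable_user_speaking_frames = enable_user_speaking_frames
        self.stopped_events = []

    async def trigger_user_turn_stopped(self):
        # The real class emits on_user_turn_stopped; here we only record the call.
        self.stopped_events.append(self.enable_user_speaking_frames)

    async def process_frame(self, frame):
        # Base behaviour in this sketch: no decision, so return None.
        return None


class KeywordStopStrategy(StandInBaseStrategy):
    """Ends the user turn when a transcription ends with a chosen keyword."""

    def __init__(self, *, keyword: str = "over", **kwargs):
        super().__init__(**kwargs)
        self._keyword = keyword

    async def process_frame(self, frame):
        if isinstance(frame, TranscriptionFrame):
            words = frame.text.lower().rstrip(".!? ").split()
            if words and words[-1] == self._keyword:
                await self.trigger_user_turn_stopped()
        # Let any other stop strategies see the frame as well.
        return ProcessFrameResult.CONTINUE


def normalize(result):
    # A caller can treat None as CONTINUE, matching the documented fallback.
    return ProcessFrameResult.CONTINUE if result is None else result
```

A strategy built this way would be driven by feeding it frames one at a time, for example `asyncio.run(strategy.process_frame(TranscriptionFrame("That is all, over.")))`.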
- class pipecat.turns.user_stop.ExternalUserTurnStopStrategy(*, timeout: float = 0.5, **kwargs)[source]
Bases: BaseUserTurnStopStrategy
User turn stop strategy controlled by an external processor.
This strategy does not determine on its own when a user turn ends. It relies on a different processor in the pipeline, which is responsible for emitting UserStoppedSpeakingFrame frames.
- __init__(*, timeout: float = 0.5, **kwargs)[source]
Initialize the external user turn stop strategy.
- Parameters:
timeout – A short delay used internally to handle consecutive or slightly delayed transcriptions.
**kwargs – Additional keyword arguments.
- async setup(task_manager: BaseTaskManager)[source]
Initialize the strategy with the given task manager.
- Parameters:
task_manager – The task manager to be associated with this instance.
- async process_frame(frame: Frame) → ProcessFrameResult[source]
Process an incoming frame to update strategy state.
Updates internal transcription text and VAD state. The end of the user turn is triggered when appropriate, based on the collected frames.
- Parameters:
frame – The frame to be analyzed.
- Returns:
Always returns CONTINUE so subsequent stop strategies are evaluated.
- class pipecat.turns.user_stop.SpeechTimeoutUserTurnStopStrategy(*, user_speech_timeout: float = 0.6, **kwargs)[source]
Bases: BaseUserTurnStopStrategy
User turn stop strategy using two independent timers after VAD stop.
After the user stops speaking (detected by VAD), this strategy runs two independent timers. The user turn stop is triggered only when both have finished and at least one transcript has been received:
user_speech_timeout: Policy floor — the window in which the user may resume speaking after a pause. Always runs to completion.
stt_timeout: Safety net for STT latency — the P99 time for the STT service to return a final transcript after VAD stop, adjusted by the VAD stop_secs. Short-circuited when the STT service emits a finalized transcript (TranscriptionFrame.finalized=True), since finalization means STT has nothing more to send.
Fallback: when a transcript arrives without a VAD stop event, the user_speech_timeout timer measures inactivity since the last transcript (rearmed on each transcript). stt_timeout has no meaning here since it is defined relative to VAD stop, and STT has already emitted a transcript — so the stt wait is marked done immediately.
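The two-timer rule above can be modelled with plain asyncio. This is a rough sketch under stated assumptions, not pipecat's implementation: TwoTimerGate, on_transcript, wait_for_turn_stop, and demo are hypothetical names, the timeout values are arbitrary, and the fallback path (a transcript arriving without a VAD stop) is not modelled.

```python
import asyncio


class TwoTimerGate:
    """Illustrative model of the two-timer rule (hypothetical, not pipecat code)."""

    def __init__(self, user_speech_timeout: float, stt_timeout: float):
        self._user_speech_timeout = user_speech_timeout
        self._stt_timeout = stt_timeout
        self._stt_done = asyncio.Event()
        self._has_transcript = asyncio.Event()

    def on_transcript(self, finalized: bool = False):
        self._has_transcript.set()
        if finalized:
            # A finalized transcript short-circuits the STT wait.
            self._stt_done.set()

    async def _stt_wait(self):
        try:
            await asyncio.wait_for(self._stt_done.wait(), self._stt_timeout)
        except asyncio.TimeoutError:
            pass  # The STT timer expired without a finalized transcript.

    async def wait_for_turn_stop(self):
        """Call at VAD stop; returns once both waits finish and a transcript exists."""
        # user_speech_timeout always runs to completion; the STT wait may end early.
        await asyncio.gather(
            asyncio.sleep(self._user_speech_timeout),
            self._stt_wait(),
        )
        await self._has_transcript.wait()


async def demo(finalized: bool) -> float:
    # Build the gate inside the running loop, then deliver one transcript shortly after.
    gate = TwoTimerGate(user_speech_timeout=0.05, stt_timeout=0.3)
    loop = asyncio.get_running_loop()
    start = loop.time()
    loop.call_later(0.01, gate.on_transcript, finalized)
    await gate.wait_for_turn_stop()
    return loop.time() - start
```

With a finalized transcript the elapsed time is governed by user_speech_timeout alone; without one, the longer stt_timeout dominates. Running `asyncio.run(demo(finalized=True))` and `asyncio.run(demo(finalized=False))` shows the difference.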
- __init__(*, user_speech_timeout: float = 0.6, **kwargs)[source]
Initialize the speech timeout-based user turn stop strategy.
- Parameters:
user_speech_timeout – Time to wait for the user to potentially say more after they pause speaking. Defaults to 0.6 seconds.
**kwargs – Additional keyword arguments.
- async setup(task_manager: BaseTaskManager)[source]
Initialize the strategy with the given task manager.
- Parameters:
task_manager – The task manager to be associated with this instance.
- async process_frame(frame: Frame) → ProcessFrameResult[source]
Process an incoming frame to update strategy state.
Updates internal transcription text and VAD state. The end of the user turn is triggered when appropriate, based on the collected frames.
- Parameters:
frame – The frame to be analyzed.
- Returns:
Always returns CONTINUE so subsequent stop strategies are evaluated.
- class pipecat.turns.user_stop.UserTurnStoppedParams(enable_user_speaking_frames: bool)[source]
Bases: object
Parameters emitted when a user turn stops.
These parameters are passed to the on_user_turn_stopped event and tell the user aggregator how the end of the user turn should be handled.
- Parameters:
enable_user_speaking_frames – Whether the user aggregator should emit frames indicating user speaking state (e.g., user stopped speaking). This is typically enabled by default, but may be disabled when another component (such as an STT service) is already responsible for generating user speaking frames.
- enable_user_speaking_frames: bool
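How a handler might act on this flag can be sketched as follows. The dataclass below is a stand-in that mirrors the single documented field, and frames_for_turn_stop is a hypothetical helper; the real on_user_turn_stopped handler signature is not shown in this reference, so the sketch returns frame names as strings instead of pushing real frames.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserTurnStoppedParams:
    # Stand-in mirroring the single documented field.
    enable_user_speaking_frames: bool


def frames_for_turn_stop(params: UserTurnStoppedParams) -> List[str]:
    """Return the names of speaking-state frames a handler would emit."""
    if params.enable_user_speaking_frames:
        return ["UserStoppedSpeakingFrame"]
    # Another component (for example an STT service) already emits them.
    return []
```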
- class pipecat.turns.user_stop.TurnAnalyzerUserTurnStopStrategy(*, turn_analyzer: BaseTurnAnalyzer, **kwargs)[source]
Bases: BaseUserTurnStopStrategy
User turn stop strategy that uses a turn detection model to determine if the user is done speaking.
This strategy feeds audio, VAD, and transcription frames to a turn detection model (BaseTurnAnalyzer) that predicts when the user has finished their turn. Once the model indicates the turn is complete, the strategy waits for a final transcription before triggering the end of the user’s turn.
For services that support finalization (TranscriptionFrame.finalized=True), the turn can be triggered immediately once the finalized transcript is received. Otherwise, an STT timeout (adjusted by VAD stop_secs) is used as a fallback.
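The decision described above can be written as a small pure function. This is a sketch only: TurnSnapshot and should_trigger_turn_stop are hypothetical names, not part of pipecat. The text does not say what happens if the STT timeout expires before any transcript arrives; this sketch assumes the timeout alone is enough to trigger, which may differ from the real behaviour.

```python
from dataclasses import dataclass


@dataclass
class TurnSnapshot:
    # Hypothetical view of the strategy state at one moment in time.
    turn_complete: bool          # the turn detection model says the user is done
    transcript_finalized: bool   # a TranscriptionFrame with finalized=True arrived
    stt_timeout_expired: bool    # the fallback STT timer ran out


def should_trigger_turn_stop(s: TurnSnapshot) -> bool:
    """Return True when the end of the user turn may be triggered."""
    if not s.turn_complete:
        # Nothing happens until the model predicts the turn is complete.
        return False
    # Finalization allows an immediate trigger; otherwise wait for the timeout.
    return s.transcript_finalized or s.stt_timeout_expired
```

Keeping the rule in a pure function separates the timing machinery from the decision itself, which makes each combination of inputs easy to check in isolation.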
- __init__(*, turn_analyzer: BaseTurnAnalyzer, **kwargs)[source]
Initialize the user turn stop strategy.
- Parameters:
turn_analyzer – The turn analyzer instance used to detect the end of the user turn.
**kwargs – Additional keyword arguments.
- async setup(task_manager: BaseTaskManager)[source]
Initialize the strategy with the given task manager.
- Parameters:
task_manager – The task manager to be associated with this instance.
- async process_frame(frame: Frame) → ProcessFrameResult[source]
Process an incoming frame to update the turn analyzer and strategy state.
- Parameters:
frame – The frame to be analyzed.
- Returns:
Always returns CONTINUE so subsequent stop strategies are evaluated.
Submodules
- base_user_turn_stop_strategy
  - UserTurnStoppedParams
  - BaseUserTurnStopStrategy
    - BaseUserTurnStopStrategy.__init__()
    - BaseUserTurnStopStrategy.task_manager
    - BaseUserTurnStopStrategy.setup()
    - BaseUserTurnStopStrategy.cleanup()
    - BaseUserTurnStopStrategy.reset()
    - BaseUserTurnStopStrategy.process_frame()
    - BaseUserTurnStopStrategy.push_frame()
    - BaseUserTurnStopStrategy.broadcast_frame()
    - BaseUserTurnStopStrategy.trigger_user_turn_stopped()
- external_user_turn_stop_strategy
- speech_timeout_user_turn_stop_strategy
- turn_analyzer_user_turn_stop_strategy