krisp_viva_turn
Krisp turn analyzer for end-of-turn detection using Krisp VIVA SDK.
This module provides a turn analyzer implementation using Krisp’s turn detection v3 (Tt) API to determine when a user has finished speaking in a conversation. The Tt API accepts an external VAD flag alongside audio frames, allowing the model to leverage voice activity information for more accurate turn detection.
Note: This analyzer uses a different model than KrispVivaFilter. The model path can be specified via the KRISP_VIVA_TURN_MODEL_PATH environment variable or passed directly to the constructor.
- class pipecat.audio.turn.krisp_viva_turn.KrispTurnParams(*, threshold: float = 0.5, frame_duration_ms: int = 20)[source]
Bases: BaseTurnParams
Configuration parameters for Krisp turn analysis.
- Parameters:
threshold – Probability threshold for turn completion (0.0 to 1.0). Higher values require more confidence before the turn is marked as complete.
frame_duration_ms – Frame duration in milliseconds for turn detection. Supported values: 10, 15, 20, 30, 32.
- threshold: float
- frame_duration_ms: int
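To make the frame_duration_ms values concrete, the sketch below converts a frame duration into a buffer size in bytes. It assumes 16-bit mono PCM, which this page does not state, and the helper name frame_size_bytes is hypothetical, not part of pipecat or the Krisp SDK.

```python
# Illustrative helper only; not part of pipecat or the Krisp SDK.
# Assumption: audio is 16-bit (2 bytes per sample) mono PCM.

SUPPORTED_FRAME_MS = (10, 15, 20, 30, 32)  # values listed for frame_duration_ms


def frame_size_bytes(
    sample_rate: int,
    frame_duration_ms: int,
    sample_width: int = 2,
    channels: int = 1,
) -> int:
    """Return the number of bytes in one frame of the given duration."""
    if frame_duration_ms not in SUPPORTED_FRAME_MS:
        raise ValueError(f"unsupported frame duration: {frame_duration_ms} ms")
    samples, remainder = divmod(sample_rate * frame_duration_ms, 1000)
    if remainder:
        # e.g. 44100 Hz with 15 ms frames does not give a whole sample count
        raise ValueError("frame duration is not a whole number of samples")
    return samples * sample_width * channels


print(frame_size_bytes(16000, 20))  # 320 samples * 2 bytes = 640
```

Under these assumptions, the default 20 ms frame at 16 kHz is 320 samples, or 640 bytes.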
- class pipecat.audio.turn.krisp_viva_turn.KrispVivaTurn(*, model_path: str | None = None, sample_rate: int | None = None, params: KrispTurnParams | None = None, api_key: str = '')[source]
Bases: BaseTurnAnalyzer
Turn analyzer using Krisp VIVA SDK for end-of-turn detection.
Uses Krisp’s turn detection v3 (Tt) API to determine when a user has finished speaking. The Tt API receives an external VAD flag with each audio frame, which the is_speech parameter of append_audio provides. This analyzer requires a valid Krisp model file to operate.
- __init__(*, model_path: str | None = None, sample_rate: int | None = None, params: KrispTurnParams | None = None, api_key: str = '') None[source]
Initialize the Krisp turn analyzer.
- Parameters:
model_path – Path to the Krisp turn detection model file (.kef extension). If None, uses KRISP_VIVA_TURN_MODEL_PATH environment variable.
sample_rate – Optional initial sample rate for audio processing. If provided, this will be used as the fixed sample rate.
params – Configuration parameters for turn analysis behavior.
api_key – Krisp SDK API key. If empty, falls back to the KRISP_VIVA_API_KEY environment variable.
- Raises:
ValueError – If model_path is not provided and KRISP_VIVA_TURN_MODEL_PATH is not set.
Exception – If model file doesn’t have .kef extension.
FileNotFoundError – If model file doesn’t exist.
RuntimeError – If Krisp SDK initialization fails.
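The model path rules documented above can be sketched in plain Python. This is an illustration of the described behavior, not the library's implementation: the function name resolve_model_path and its env argument are hypothetical, and the order of the extension and existence checks is an assumption.

```python
import os
from pathlib import Path

ENV_VAR = "KRISP_VIVA_TURN_MODEL_PATH"


def resolve_model_path(model_path=None, env=None) -> Path:
    """Mimic the documented fallback and validation for the model path.

    Hypothetical helper: the real constructor may check things in a
    different order or raise with different messages.
    """
    env = os.environ if env is None else env
    candidate = model_path or env.get(ENV_VAR)
    if not candidate:
        raise ValueError(f"model_path not provided and {ENV_VAR} is not set")
    path = Path(candidate)
    if path.suffix != ".kef":
        # The page documents a plain Exception for a wrong extension.
        raise Exception(f"model file must have a .kef extension: {path}")
    if not path.is_file():
        raise FileNotFoundError(f"model file not found: {path}")
    return path
```

Running a check like this before constructing the analyzer lets a misconfigured deployment be reported at startup; failures inside the SDK itself are documented separately as RuntimeError.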
- set_sample_rate(sample_rate: int)[source]
Set the sample rate and create/update the turn detection session.
- Parameters:
sample_rate – The sample rate to set.
- property frame_probabilities: list
Get all probabilities from the last append_audio call.
- Returns:
List of probability values for each frame processed in the last append_audio call.
- property last_probability: float | None
Get the last turn probability value computed.
- Returns:
Last probability value, or None if no frames have been processed yet.
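As a rough illustration of how per-frame probabilities relate to the threshold parameter, the sketch below finds the first frame whose probability reaches a threshold. The helper first_crossing is hypothetical, and the use of >= is an assumption; this page does not say how the analyzer compares probabilities with the threshold.

```python
from typing import Optional, Sequence


def first_crossing(
    probabilities: Sequence[float], threshold: float = 0.5
) -> Optional[int]:
    """Return the index of the first probability >= threshold, else None.

    Hypothetical helper for inspecting values such as those returned by
    frame_probabilities; the analyzer's own decision rule may differ.
    """
    for index, probability in enumerate(probabilities):
        if probability >= threshold:
            return index
    return None


print(first_crossing([0.1, 0.4, 0.7, 0.9]))        # 2
print(first_crossing([0.1, 0.4, 0.7, 0.9], 0.95))  # None
```

A helper like this can be useful when tuning threshold: a higher value moves the crossing later or removes it entirely.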
- property speech_triggered: bool
Check if speech has been detected and triggered analysis.
- Returns:
True if speech has been detected and turn analysis is active.
- property params: KrispTurnParams
Get the current turn analyzer parameters.
- Returns:
Current turn analyzer configuration parameters.
- append_audio(buffer: bytes, is_speech: bool) EndOfTurnState[source]
Append audio data for turn analysis.
- Parameters:
buffer – Raw audio data bytes to append for analysis.
is_speech – Whether the audio buffer contains detected speech.
- Returns:
Current end-of-turn state after processing the audio.
- async analyze_end_of_turn() tuple[EndOfTurnState, MetricsData | None][source]
Analyze the current audio state to determine if turn has ended.
- Returns:
Tuple containing the end-of-turn state and optional metrics data. Returns the last state determined by append_audio().
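The two methods above can be combined as in the following sketch, which feeds audio to any object exposing append_audio and analyze_end_of_turn with the documented signatures. The helper names split_frames and run_turn_detection are hypothetical, and splitting the input into fixed-size frames is an assumption; the documented signature only requires a bytes buffer.

```python
import asyncio
from typing import Callable, Iterator


def split_frames(pcm: bytes, frame_bytes: int) -> Iterator[bytes]:
    """Yield consecutive full frames; a trailing partial frame is dropped."""
    for start in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield pcm[start:start + frame_bytes]


async def run_turn_detection(
    analyzer,
    pcm: bytes,
    frame_bytes: int,
    is_speech_fn: Callable[[bytes], bool],
):
    """Feed each frame with its VAD flag, then ask for the final state.

    `analyzer` is duck-typed: anything with append_audio(buffer, is_speech)
    and an async analyze_end_of_turn() works. `is_speech_fn` stands in for
    an external VAD and is supplied by the caller.
    """
    for frame in split_frames(pcm, frame_bytes):
        analyzer.append_audio(frame, is_speech_fn(frame))
    state, metrics = await analyzer.analyze_end_of_turn()
    return state, metrics
```

With a configured KrispVivaTurn instance as analyzer, the coroutine can be awaited from an existing event loop or driven with asyncio.run in a standalone script.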