krisp_viva_turn

Krisp turn analyzer for end-of-turn detection using Krisp VIVA SDK.

This module provides a turn analyzer implementation using Krisp’s turn detection v3 (Tt) API to determine when a user has finished speaking in a conversation. The Tt API accepts an external VAD flag alongside audio frames, allowing the model to leverage voice activity information for more accurate turn detection.

Note: This analyzer uses a different model than KrispVivaFilter. The model path can be specified via the KRISP_VIVA_TURN_MODEL_PATH environment variable or passed directly to the constructor.

class pipecat.audio.turn.krisp_viva_turn.KrispTurnParams(*, threshold: float = 0.5, frame_duration_ms: int = 20)

Bases: BaseTurnParams

Configuration parameters for Krisp turn analysis.

Parameters:
  • threshold – Probability threshold for turn completion (0.0 to 1.0). Higher values require more confidence before marking turn as complete.

  • frame_duration_ms – Frame duration in milliseconds for turn detection. Supported values: 10, 15, 20, 30, 32.

threshold: float
frame_duration_ms: int
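
The two fields above can be illustrated with a small stand-in. This is a minimal sketch, not the library's implementation: `TurnParamsSketch` and `samples_per_frame` are hypothetical names, and whether the real `KrispTurnParams` validates its values at construction time is an assumption made here for illustration only.

```python
from dataclasses import dataclass

# Frame durations listed as supported in the parameter description above.
SUPPORTED_FRAME_DURATIONS_MS = (10, 15, 20, 30, 32)


@dataclass(frozen=True)
class TurnParamsSketch:
    """Hypothetical stand-in mirroring the two KrispTurnParams fields."""

    threshold: float = 0.5
    frame_duration_ms: int = 20

    def __post_init__(self) -> None:
        # Assumption: reject values outside the documented ranges.
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError(f"threshold must be in [0.0, 1.0], got {self.threshold}")
        if self.frame_duration_ms not in SUPPORTED_FRAME_DURATIONS_MS:
            raise ValueError(
                f"frame_duration_ms must be one of {SUPPORTED_FRAME_DURATIONS_MS}, "
                f"got {self.frame_duration_ms}"
            )


def samples_per_frame(sample_rate: int, frame_duration_ms: int) -> int:
    """Number of audio samples covered by one frame at the given sample rate."""
    return sample_rate * frame_duration_ms // 1000
```

For example, a 20 ms frame at 16 kHz covers `samples_per_frame(16000, 20)` samples, which is 320. A higher `threshold` trades responsiveness for fewer premature end-of-turn decisions.
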

class pipecat.audio.turn.krisp_viva_turn.KrispVivaTurn(*, model_path: str | None = None, sample_rate: int | None = None, params: KrispTurnParams | None = None, api_key: str = '')

Bases: BaseTurnAnalyzer

Turn analyzer using Krisp VIVA SDK for end-of-turn detection.

Uses Krisp’s turn detection v3 (Tt) API to determine when a user has finished speaking. The Tt API takes an external VAD flag with each audio frame; the is_speech parameter of append_audio supplies that flag. This analyzer requires a valid Krisp model file to operate.

__init__(*, model_path: str | None = None, sample_rate: int | None = None, params: KrispTurnParams | None = None, api_key: str = '') → None

Initialize the Krisp turn analyzer.

Parameters:
  • model_path – Path to the Krisp turn detection model file (.kef extension). If None, uses KRISP_VIVA_TURN_MODEL_PATH environment variable.

  • sample_rate – Optional initial sample rate for audio processing. If provided, this will be used as the fixed sample rate.

  • params – Configuration parameters for turn analysis behavior.

  • api_key – Krisp SDK API key. If empty, falls back to the KRISP_VIVA_API_KEY environment variable.

Raises:
  • ValueError – If model_path is not provided and KRISP_VIVA_TURN_MODEL_PATH is not set.

  • Exception – If model file doesn’t have .kef extension.

  • FileNotFoundError – If model file doesn’t exist.

  • RuntimeError – If Krisp SDK initialization fails.
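
The model path rules and the first three documented exceptions can be sketched as a standalone function. This is an illustration under stated assumptions, not the constructor's actual code: `resolve_model_path` is a hypothetical helper, and the order of the checks simply follows the order in which the exceptions are listed above.

```python
import os
from pathlib import Path
from typing import Optional

ENV_VAR = "KRISP_VIVA_TURN_MODEL_PATH"


def resolve_model_path(model_path: Optional[str] = None) -> Path:
    """Hypothetical helper: pick the model path and validate it."""
    # Fall back to the environment variable when no path is passed.
    # Assumption: an empty string is treated the same as None.
    candidate = model_path or os.environ.get(ENV_VAR)
    if not candidate:
        raise ValueError(f"model_path not provided and {ENV_VAR} is not set")

    path = Path(candidate)
    # The documentation lists a plain Exception for a wrong extension.
    if path.suffix != ".kef":
        raise Exception(f"model file must have a .kef extension: {path}")
    if not path.is_file():
        raise FileNotFoundError(f"model file not found: {path}")
    return path
```

Running a check like this before constructing the analyzer surfaces configuration mistakes early, separately from SDK initialization failures, which the constructor reports as RuntimeError.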

async cleanup()

Release SDK reference when analyzer is destroyed.

set_sample_rate(sample_rate: int)

Set the sample rate and create/update the turn detection session.

Parameters:
  • sample_rate – The sample rate to set.

property frame_probabilities: list

Get all probabilities from the last append_audio call.

Returns:

List of probability values for each frame processed in the last append_audio call.

property last_probability: float | None

Get the last turn probability value computed.

Returns:

Last probability value, or None if no frames have been processed yet.
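
One way to read these two properties together with the threshold parameter is sketched below. Both helpers are hypothetical, and two things are assumptions rather than documented behavior: that the last probability corresponds to the final entry of the per-frame list, and that a turn counts as complete when the probability is greater than or equal to the threshold. The analyzer's actual decision logic may differ.

```python
from typing import Optional, Sequence


def latest_probability(frame_probabilities: Sequence[float]) -> Optional[float]:
    """Return the most recent per-frame probability, or None if there is none."""
    return frame_probabilities[-1] if frame_probabilities else None


def is_turn_complete(probability: Optional[float], threshold: float = 0.5) -> bool:
    """Assumed rule: complete when probability >= threshold."""
    if probability is None:
        # No frames processed yet, so there is nothing to decide on.
        return False
    return probability >= threshold
```

These values are mainly useful for logging and for tuning the threshold against recorded conversations.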

property speech_triggered: bool

Check if speech has been detected and triggered analysis.

Returns:

True if speech has been detected and turn analysis is active.

property params: KrispTurnParams

Get the current turn analyzer parameters.

Returns:

Current turn analyzer configuration parameters.

append_audio(buffer: bytes, is_speech: bool) → EndOfTurnState

Append audio data for turn analysis.

Parameters:
  • buffer – Raw audio data bytes to append for analysis.

  • is_speech – Whether the audio buffer contains detected speech.

Returns:

Current end-of-turn state after processing the audio.

async analyze_end_of_turn() → tuple[EndOfTurnState, MetricsData | None]

Analyze the current audio state to determine if turn has ended.

Returns:

Tuple containing the end-of-turn state and optional metrics data. Returns the last state determined by append_audio().
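
The relationship between append_audio and analyze_end_of_turn can be shown with a toy stand-in that runs without the Krisp SDK. Everything in this sketch is hypothetical: `TurnState` stands in for EndOfTurnState, and `StubTurnAnalyzer` uses a trivial rule (first non-speech buffer after speech ends the turn) in place of the Krisp model. It only demonstrates the call pattern, in which analyze_end_of_turn reports the state last set by append_audio.

```python
import asyncio
from enum import Enum
from typing import List, Optional, Tuple


class TurnState(Enum):
    """Hypothetical stand-in for EndOfTurnState."""

    INCOMPLETE = 1
    COMPLETE = 2


class StubTurnAnalyzer:
    """Toy analyzer with the same method shapes as the documented interface."""

    def __init__(self) -> None:
        self._speech_triggered = False
        self._state = TurnState.INCOMPLETE

    def append_audio(self, buffer: bytes, is_speech: bool) -> TurnState:
        if is_speech:
            # Speech (re)starts the turn.
            self._speech_triggered = True
            self._state = TurnState.INCOMPLETE
        elif self._speech_triggered:
            # Toy rule only: silence right after speech ends the turn.
            self._state = TurnState.COMPLETE
            self._speech_triggered = False
        return self._state

    async def analyze_end_of_turn(self) -> Tuple[TurnState, Optional[dict]]:
        # Report the state last determined by append_audio; no metrics here.
        return self._state, None


async def run_demo() -> List[TurnState]:
    analyzer = StubTurnAnalyzer()
    frame = b"\x00\x00" * 320  # 20 ms of 16-bit mono audio at 16 kHz
    states = []
    for is_speech in (True, True, False):
        analyzer.append_audio(frame, is_speech)
        state, _metrics = await analyzer.analyze_end_of_turn()
        states.append(state)
    return states
```

Calling `asyncio.run(run_demo())` yields two INCOMPLETE states followed by COMPLETE. In a real pipeline the is_speech flag would come from a VAD rather than being hard-coded.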

clear()

Reset the turn analyzer to its initial state.