user_bot_latency_observer

Observer for tracking user-to-bot response latency.

This module provides an observer that measures the time between when a user stops speaking and when the bot starts speaking, and emits an event each time latency is measured. When enable_metrics=True, it also collects per-service latency breakdown metrics (TTFB, text aggregation).

class pipecat.observers.user_bot_latency_observer.TTFBBreakdownMetrics(*, processor: str, model: str | None = None, start_time: float, duration_secs: float)[source]

Bases: BaseModel

TTFB measurement with timestamp for timeline placement.

Parameters:
  • processor – Name of the processor that reported the TTFB.

  • model – Optional model name associated with the metric.

  • start_time – Unix timestamp when the TTFB measurement started.

  • duration_secs – TTFB duration in seconds.

processor: str
model: str | None
start_time: float
duration_secs: float
class pipecat.observers.user_bot_latency_observer.TextAggregationBreakdownMetrics(*, processor: str, start_time: float, duration_secs: float)[source]

Bases: BaseModel

Text aggregation measurement with timestamp for timeline placement.

Parameters:
  • processor – Name of the processor that reported the metric.

  • start_time – Unix timestamp when text aggregation started.

  • duration_secs – Aggregation duration in seconds.

processor: str
start_time: float
duration_secs: float
class pipecat.observers.user_bot_latency_observer.FunctionCallMetrics(*, function_name: str, start_time: float, duration_secs: float)[source]

Bases: BaseModel

Latency for a single function call execution.

Parameters:
  • function_name – Name of the function that was called.

  • start_time – Unix timestamp when execution started.

  • duration_secs – Time in seconds from execution start to result.

function_name: str
start_time: float
duration_secs: float
class pipecat.observers.user_bot_latency_observer.LatencyBreakdown(*, ttfb: list[TTFBBreakdownMetrics] = <factory>, text_aggregation: TextAggregationBreakdownMetrics | None = None, user_turn_start_time: float | None = None, user_turn_secs: float | None = None, function_calls: list[FunctionCallMetrics] = <factory>)[source]

Bases: BaseModel

Per-service latency breakdown for a single user-to-bot cycle.

Collected between VADUserStoppedSpeakingFrame and BotStartedSpeakingFrame when enable_metrics=True in PipelineParams.

Parameters:
  • ttfb – Time-to-first-byte metrics from each service in the pipeline.

  • text_aggregation – First text aggregation measurement, representing the latency cost of sentence aggregation in the TTS pipeline.

  • user_turn_start_time – Unix timestamp when the user actually went silent, which marks the start of the user-turn measurement (the VAD stop time adjusted for VAD stop_secs). None if no VADUserStoppedSpeakingFrame was observed.

  • user_turn_secs – Duration in seconds of the user’s turn, measured from when the user actually stopped speaking to when the turn was released (UserStoppedSpeakingFrame). This includes VAD silence detection, STT finalization, and any turn analyzer wait. None if no UserStoppedSpeakingFrame was observed (e.g. no turn analyzer configured).

  • function_calls – Latency for each function call executed during this cycle. Empty if no function calls occurred.
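
The relationship between the two user-turn fields can be illustrated with a little arithmetic. This is only a sketch: it assumes the adjustment for VAD stop_secs is a plain subtraction from the time the VAD stop event is observed, and the function name, argument names, and timestamps are hypothetical rather than part of pipecat.

```python
from typing import Optional, Tuple


def user_turn_timing(
    vad_stop_observed_at: float,
    vad_stop_secs: float,
    turn_released_at: Optional[float] = None,
) -> Tuple[float, Optional[float]]:
    """Return (user_turn_start_time, user_turn_secs) for one cycle.

    Assumption: the VAD reports "stopped" only after stop_secs of silence,
    so the actual silence began stop_secs before the event was observed.
    """
    user_turn_start_time = vad_stop_observed_at - vad_stop_secs
    if turn_released_at is None:
        # No turn-release event observed, so the duration is unknown.
        return user_turn_start_time, None
    return user_turn_start_time, turn_released_at - user_turn_start_time


# Hypothetical timestamps: VAD stop seen at t=1000.75 with stop_secs=0.75,
# turn released at t=1001.25.
start, secs = user_turn_timing(1000.75, 0.75, turn_released_at=1001.25)
print(start, secs)
```

Under these assumptions the turn duration includes the VAD silence window itself, which matches the description of user_turn_secs as covering VAD silence detection plus any later wait.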

ttfb: list[TTFBBreakdownMetrics]
text_aggregation: TextAggregationBreakdownMetrics | None
user_turn_start_time: float | None
user_turn_secs: float | None
function_calls: list[FunctionCallMetrics]
chronological_events() → list[str][source]

Return human-readable event labels sorted by start time.

Collects all sub-metrics into a flat list, sorts by start_time, and returns formatted strings suitable for logging.

Returns:

List of formatted strings, one per event, in chronological order.
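
The flatten-and-sort behavior described above might look roughly like the following. This is a minimal sketch using stand-in dataclasses instead of the real Pydantic models; the class names, the helper name chronological_labels, and the label format are all assumptions made for illustration, and the actual strings returned by chronological_events() may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Ttfb:  # stand-in for TTFBBreakdownMetrics
    processor: str
    start_time: float
    duration_secs: float


@dataclass
class TextAggregation:  # stand-in for TextAggregationBreakdownMetrics
    processor: str
    start_time: float
    duration_secs: float


@dataclass
class FunctionCall:  # stand-in for FunctionCallMetrics
    function_name: str
    start_time: float
    duration_secs: float


def chronological_labels(
    ttfb: List[Ttfb],
    text_aggregation: Optional[TextAggregation],
    function_calls: List[FunctionCall],
) -> List[str]:
    """Flatten all sub-metrics, sort by start_time, and return labels."""
    events: List[Tuple[float, str]] = []
    for m in ttfb:
        events.append((m.start_time, f"{m.processor}: TTFB {m.duration_secs:.3f}s"))
    if text_aggregation is not None:
        t = text_aggregation
        events.append(
            (t.start_time, f"{t.processor}: text aggregation {t.duration_secs:.3f}s")
        )
    for fc in function_calls:
        events.append(
            (fc.start_time, f"{fc.function_name}: function call {fc.duration_secs:.3f}s")
        )
    events.sort(key=lambda e: e[0])  # chronological order
    return [label for _, label in events]


labels = chronological_labels(
    ttfb=[Ttfb("tts", 12.0, 0.25), Ttfb("llm", 10.5, 0.5)],
    text_aggregation=TextAggregation("tts", 11.5, 0.125),
    function_calls=[FunctionCall("get_weather", 11.0, 0.375)],
)
for label in labels:
    print(label)
```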

class pipecat.observers.user_bot_latency_observer.UserBotLatencyObserver(*, max_frames=100, **kwargs)[source]

Bases: BaseObserver

Observer that tracks user-to-bot response latency.

Measures the time between when a user stops speaking (VADUserStoppedSpeakingFrame) and when the bot starts speaking (BotStartedSpeakingFrame). Emits events when latency is measured, allowing consumers to log, trace, or otherwise process the latency data.

When enable_metrics=True in pipeline params, also collects per-service latency breakdown (TTFB, text aggregation) and emits an on_latency_breakdown event alongside the existing latency measurement.

This observer follows the composition pattern used by TurnTrackingObserver, acting as a reusable component for latency measurement.

Events:
  • on_latency_measured(observer, latency_seconds) – Emitted when time-to-first-bot-speech is calculated. Measures the time from when the user stopped speaking to when the bot starts speaking.

  • on_latency_breakdown(observer, breakdown) – Emitted at each BotStartedSpeakingFrame with a LatencyBreakdown containing per-service metrics collected during the user→bot cycle.

  • on_first_bot_speech_latency(observer, latency_seconds) – Emitted once, the first time BotStartedSpeakingFrame arrives after ClientConnectedFrame. Measures the time from client connection to the first bot speech.
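
Handlers for these events could be written along the following lines. This is a sketch only: the handler signatures follow the event descriptions above, while the module-level lists and the registration calls shown in the trailing comments are assumptions about how a consumer might wire things up, so check the registration API against your pipecat version before relying on it.

```python
import asyncio
from typing import List

# Hypothetical sinks; a real application might log or export these instead.
latencies: List[float] = []
breakdown_lines: List[str] = []


async def on_latency_measured(observer, latency_seconds: float) -> None:
    """Record and print each user-to-bot latency measurement."""
    latencies.append(latency_seconds)
    print(f"user-to-bot latency: {latency_seconds:.3f}s")


async def on_latency_breakdown(observer, breakdown) -> None:
    """Print the per-service breakdown in chronological order."""
    for line in breakdown.chronological_events():
        breakdown_lines.append(line)
        print(f"  {line}")


# Assumed registration (not executed here, since it needs pipecat installed):
#   observer = UserBotLatencyObserver()
#   observer.add_event_handler("on_latency_measured", on_latency_measured)
#   observer.add_event_handler("on_latency_breakdown", on_latency_breakdown)
```

Keeping the handlers as plain coroutines makes them easy to exercise in isolation, for example with asyncio.run(on_latency_measured(None, 0.5)).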

__init__(*, max_frames=100, **kwargs)[source]

Initialize the user-bot latency observer.

Sets up tracking for processed frames and user speech timing to calculate response latencies.

Parameters:
  • max_frames – Maximum number of frame IDs to keep in history for duplicate detection. Defaults to 100.

  • **kwargs – Additional arguments passed to parent class.
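
A bounded history of frame IDs for duplicate detection, as max_frames suggests, could be sketched as below. The class and method names are hypothetical and this is not the observer's actual implementation; it only illustrates how a fixed-size window keeps memory constant while still filtering frames that are seen more than once in quick succession.

```python
from collections import deque


class FrameHistory:
    """Remember the most recent frame IDs so repeats can be ignored."""

    def __init__(self, max_frames: int = 100):
        if max_frames < 1:
            raise ValueError("max_frames must be at least 1")
        self._order = deque(maxlen=max_frames)  # oldest ID on the left
        self._seen = set()  # fast membership test

    def first_time(self, frame_id: int) -> bool:
        """Return True only the first time a frame ID is seen in the window."""
        if frame_id in self._seen:
            return False
        if len(self._order) == self._order.maxlen:
            # The deque is full: the oldest ID is about to be evicted,
            # so forget it in the set as well.
            self._seen.discard(self._order[0])
        self._order.append(frame_id)
        self._seen.add(frame_id)
        return True


history = FrameHistory(max_frames=2)
print(history.first_time(1), history.first_time(1))
```

A larger max_frames widens the window in which repeats are caught, at the cost of a little more memory.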

async on_push_frame(data: FramePushed)[source]

Process frames to track speech timing and calculate latency.

Tracks VAD events and bot speaking events to measure the time between user stopping speech and bot starting speech. Also accumulates metrics from MetricsFrame for the latency breakdown.

Parameters:
  • data – Frame push event containing the frame and direction information.
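
The core stop-to-start measurement can be pictured as a small state machine. The sketch below is a simplified stand-in, not the observer's code: the class and method names are hypothetical, and timestamps are passed in explicitly instead of being read from frames or a clock.

```python
from typing import Optional


class LatencyTracker:
    """Measure the gap between a user-stopped event and the next bot-started event."""

    def __init__(self) -> None:
        self._user_stopped_at: Optional[float] = None

    def on_user_stopped(self, now: float) -> None:
        # Corresponds to observing a VAD user-stopped-speaking event.
        self._user_stopped_at = now

    def on_bot_started(self, now: float) -> Optional[float]:
        # Corresponds to observing a bot-started-speaking event.
        if self._user_stopped_at is None:
            return None  # nothing to measure against
        latency = now - self._user_stopped_at
        self._user_stopped_at = None  # one measurement per cycle
        return latency


tracker = LatencyTracker()
tracker.on_user_stopped(10.0)
print(tracker.on_bot_started(10.75))
```

Resetting the stored timestamp after each measurement means a second bot-started event without an intervening user stop yields no new latency value in this sketch.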