frames

Core frame definitions for the Pipecat AI framework.

This module contains all frame types used throughout the Pipecat pipeline system, including data frames, system frames, and control frames for audio, video, text, and LLM processing.

pipecat.frames.frames.format_pts(pts: int | None)[source]

Format presentation timestamp (PTS) in nanoseconds to a human-readable string.

Converts a PTS value in nanoseconds to a string representation.

Parameters:: pts – Presentation timestamp in nanoseconds, or None if not set.

class pipecat.frames.frames.Frame[source]

Bases: object

Base frame class for all frames in the Pipecat pipeline.

All frames inherit from this base class and automatically receive unique identifiers, names, and metadata support.

Parameters:

id – Unique identifier for the frame instance.
name – Human-readable name combining class name and instance count.
pts – Presentation timestamp in nanoseconds.
broadcast_sibling_id – ID of the paired frame when this frame was broadcast in both directions. Set automatically by broadcast_frame() and broadcast_frame_instance().
metadata – Dictionary for arbitrary frame metadata.
transport_source – Name of the transport source that created this frame.
transport_destination – Name of the transport destination for this frame.

id: int

name: str

pts: int | None

broadcast_sibling_id: int | None

metadata: dict[str, Any]

transport_source: str | None

transport_destination: str | None

class pipecat.frames.frames.SystemFrame[source]

Bases: Frame

System frame class for immediate processing.

A frame that takes higher priority than other frames. System frames are handled in order and are not affected by user interruptions.

class pipecat.frames.frames.DataFrame[source]

Bases: Frame

Data frame class for processing data in order.

A frame that is processed in order and usually contains data such as LLM context, text, audio or images. Data frames are cancelled by user interruptions.

class pipecat.frames.frames.ControlFrame[source]

Bases: Frame

Control frame class for processing control information in order.

A frame that, similar to data frames, is processed in order and usually contains control information such as update settings or to end the pipeline after everything is flushed. Control frames are cancelled by user interruptions.

class pipecat.frames.frames.UninterruptibleFrame[source]

Bases: object

A marker for data or control frames that must not be interrupted.

Frames with this mixin are still ordered normally, but unlike other frames, they are preserved during interruptions: they remain in internal queues and any task processing them will not be cancelled. This ensures the frame is always delivered and processed to completion.

class pipecat.frames.frames.AudioRawFrame(audio: bytes, sample_rate: int, num_channels: int)[source]

Bases: object

A frame containing a chunk of raw audio.

Parameters:

audio – Raw audio bytes in PCM format.
sample_rate – Audio sample rate in Hz.
num_channels – Number of audio channels.
num_frames – Number of audio frames (calculated automatically).

audio: bytes

sample_rate: int

num_channels: int

num_frames: int = 0

class pipecat.frames.frames.ImageRawFrame(image: bytes, size: tuple[int, int], format: str | None)[source]

Bases: object

A frame containing a raw image.

Parameters:

image – Raw image bytes.
size – Image dimensions as (width, height) tuple.
format – Image format (e.g., ‘RGB’, ‘RGBA’).

image: bytes

size: tuple[int, int]

format: str | None

class pipecat.frames.frames.OutputAudioRawFrame(audio: bytes, sample_rate: int, num_channels: int)[source]

Bases: DataFrame, AudioRawFrame

Audio data frame for output to transport.

A chunk of raw audio that will be played by the output transport. If the transport supports multiple audio destinations (e.g. multiple audio tracks) the destination name can be specified in transport_destination.

class pipecat.frames.frames.OutputImageRawFrame(image: bytes, size: tuple[int, int], format: str | None)[source]

Bases: DataFrame, ImageRawFrame

Image data frame for output to transport.

An image that will be shown by the transport. If the transport supports multiple video destinations (e.g. multiple video tracks) the destination name can be specified in transport_destination.

Parameters:: sync_with_audio – If True, the image is queued with audio frames so it is only displayed after all preceding audio has been sent. Defaults to False (image is displayed immediately when the output transport receives it).

sync_with_audio: bool = False

class pipecat.frames.frames.TTSAudioRawFrame(audio: bytes, sample_rate: int, num_channels: int, context_id: str | None = None)[source]

Bases: OutputAudioRawFrame

Audio data frame generated by Text-to-Speech services.

A chunk of output audio generated by a TTS service, ready for playback.

Parameters:: context_id – Unique identifier for the TTS context that generated this audio.

context_id: str | None = None

class pipecat.frames.frames.SpeechOutputAudioRawFrame(audio: bytes, sample_rate: int, num_channels: int)[source]

Bases: OutputAudioRawFrame

An audio frame part of a speech audio stream.

This frame is part of a continuous stream of audio frames containing speech. The audio stream might also contain silence frames, so a process to distinguish between speech and silence might be needed.

class pipecat.frames.frames.URLImageRawFrame(image: bytes, size: tuple[int, int], format: str | None, url: str | None = None)[source]

Bases: OutputImageRawFrame

Image frame with an associated URL.

An output image with an associated URL. These images are usually generated by third-party services that provide a URL to download the image.

Parameters:: url – URL where the image can be downloaded from.

url: str | None = None

class pipecat.frames.frames.SpriteFrame(images: list[OutputImageRawFrame])[source]

Bases: DataFrame

Animated sprite frame containing multiple images.

An animated sprite that will be shown by the transport if the transport’s camera is enabled. Will play at the framerate specified in the transport’s camera_out_framerate constructor parameter.

Parameters:: images – List of image frames that make up the sprite animation.

images: list[OutputImageRawFrame]

class pipecat.frames.frames.TextFrame(text: str)[source]

Bases: DataFrame

Text data frame for passing text through the pipeline.

A chunk of text. Emitted by LLM services, consumed by context aggregators, TTS services and more. Can be used to send text through processors.

Parameters:

text – The text content.
skip_tts – Whether this text should be skipped by the TTS service.
includes_inter_frame_spaces – Whether any necessary inter-frame (leading/trailing) spaces are already included in the text.
append_to_context – Whether this text should be appended to the LLM context. Defaults to True.

text: str

skip_tts: bool | None

includes_inter_frame_spaces: bool

append_to_context: bool

class pipecat.frames.frames.LLMTextFrame(text: str)[source]

Bases: TextFrame

Text frame generated by LLM services.

class pipecat.frames.frames.AggregatedTextFrame(text: str, aggregated_by: AggregationType | str, context_id: str | None = None)[source]

Bases: TextFrame

Text frame representing an aggregation of TextFrames.

This frame contains multiple TextFrames aggregated together for processing or output along with a field to indicate how they are aggregated.

Parameters:

aggregated_by – Method used to aggregate the text frames.
context_id – Unique identifier for the TTS context that generated this text.

aggregated_by: AggregationType | str

context_id: str | None = None

class pipecat.frames.frames.VisionTextFrame(text: str)[source]

Bases: LLMTextFrame

Text frame generated by vision services.

class pipecat.frames.frames.TTSTextFrame(text: str, aggregated_by: AggregationType | str, context_id: str | None = None)[source]

Bases: AggregatedTextFrame

Text frame generated by Text-to-Speech services.

Parameters:: context_id – Unique identifier for the TTS context that generated this text.

context_id: str | None = None

class pipecat.frames.frames.TranscriptionFrame(text: str, user_id: str, timestamp: str, language: Language | None = None, result: Any | None = None, finalized: bool = False)[source]

Bases: TextFrame

Text frame containing speech transcription data.

A text frame with transcription-specific data. The result field contains the result from the STT service if available.

Parameters:

user_id – Identifier for the user who spoke.
timestamp – When the transcription occurred.
language – Detected or specified language of the speech.
result – Raw result from the STT service.
finalized – Whether this is the final transcription for an utterance. Set by STT services that support commit/finalize signals.

user_id: str

timestamp: str

language: Language | None = None

result: Any | None = None

finalized: bool = False

class pipecat.frames.frames.InterimTranscriptionFrame(text: str, user_id: str, timestamp: str, language: Language | None = None, result: Any | None = None)[source]

Bases: TextFrame

Text frame containing partial/interim transcription data.

A text frame with interim transcription-specific data that represents partial results before final transcription. The result field contains the result from the STT service if available.

Parameters:

user_id – Identifier for the user who spoke.
timestamp – When the interim transcription occurred.
language – Detected or specified language of the speech.
result – Raw result from the STT service.

text: str

user_id: str

timestamp: str

language: Language | None = None

result: Any | None = None

class pipecat.frames.frames.TranslationFrame(text: str, user_id: str, timestamp: str, language: Language | None = None)[source]

Bases: TextFrame

Text frame containing translated transcription data.

A text frame with translated transcription data that will be placed in the transport’s receive queue when a participant speaks.

Parameters:

user_id – Identifier for the user who spoke.
timestamp – When the translation occurred.
language – Target language of the translation.

user_id: str

timestamp: str

language: Language | None = None

class pipecat.frames.frames.LLMContextAssistantTimestampFrame(timestamp: str)[source]

Bases: DataFrame

Timestamp information for assistant messages in LLM context.

Parameters:: timestamp – Timestamp when the assistant message was created.

timestamp: str

class pipecat.frames.frames.LLMContextFrame(context: LLMContext)[source]

Bases: Frame

Frame containing a universal LLM context.

Used as a signal to LLM services to ingest the provided context and generate a response based on it.

Parameters:: context – The LLM context containing messages, tools, and configuration.

context: LLMContext

class pipecat.frames.frames.LLMThoughtStartFrame(append_to_context: bool = False, llm: str | None = None)[source]

Bases: ControlFrame

Frame indicating the start of an LLM thought.

Parameters:

append_to_context – Whether the thought should be appended to the LLM context. If it is appended, the llm field is required, since it will be appended as an LLMSpecificMessage.
llm – Optional identifier of the LLM provider for LLM-specific handling. Only required if append_to_context is True, as the thought is appended to context as an LLMSpecificMessage.

append_to_context: bool = False

llm: str | None = None

class pipecat.frames.frames.LLMThoughtTextFrame(text: str)[source]

Bases: DataFrame

Frame containing the text (or text chunk) of an LLM thought.

Note that despite this containing text, it is a DataFrame and not a TextFrame, to avoid most typical text processing, such as TTS.

Parameters:: text – The text (or text chunk) of the thought.

text: str

includes_inter_frame_spaces: bool

class pipecat.frames.frames.LLMThoughtEndFrame(signature: Any = None)[source]

Bases: ControlFrame

Frame indicating the end of an LLM thought.

Parameters:: signature – Optional signature associated with the thought. This is used by Anthropic, which includes a signature at the end of each thought.

signature: Any = None

class pipecat.frames.frames.LLMRunFrame[source]

Bases: DataFrame

Frame to trigger LLM processing with current context.

A frame that instructs the LLM service to process the current context and generate a response.

class pipecat.frames.frames.LLMMessagesAppendFrame(messages: list[LLMContextMessage], run_llm: bool | None = None)[source]

Bases: DataFrame

Frame containing LLM messages to append to current context.

A frame containing a list of LLM messages that need to be added to the current context.

Parameters:

messages – List of context messages to append.
run_llm – Whether the context update should be sent to the LLM.

messages: list[LLMContextMessage]

run_llm: bool | None = None

class pipecat.frames.frames.LLMMessagesUpdateFrame(messages: list[LLMContextMessage], run_llm: bool | None = None)[source]

Bases: DataFrame

Frame containing LLM messages to replace current context.

A frame containing a list of new LLM messages to replace the current context LLM messages.

Parameters:

messages – List of context messages to replace current context.
run_llm – Whether the context update should be sent to the LLM.

messages: list[LLMContextMessage]

run_llm: bool | None = None

class pipecat.frames.frames.LLMMessagesTransformFrame(transform: Callable[[list[LLMContextMessage]], list[LLMContextMessage]], run_llm: bool | None = None)[source]

Bases: DataFrame

Frame containing a transform function to modify the current context’s LLM messages.

A frame containing a transform function that takes the context’s current list of LLM messages and returns a modified list.

Parameters:

transform – A function that takes a list of messages and returns a modified list.
run_llm – Whether the context update should be sent to the LLM.

transform: Callable[[list[LLMContextMessage]], list[LLMContextMessage]]

run_llm: bool | None = None

class pipecat.frames.frames.LLMSetToolsFrame(tools: list[dict] | ToolsSchema | NotGiven)[source]

Bases: DataFrame

Frame containing tools for LLM function calling.

A frame containing a list of tools for an LLM to use for function calling. The specific format depends on the LLM being used, but it should typically contain JSON Schema objects.

Parameters:: tools – List of tool/function definitions for the LLM.

tools: list[dict] | ToolsSchema | NotGiven

class pipecat.frames.frames.LLMSetToolChoiceFrame(tool_choice: Literal['none', 'auto', 'required'] | dict)[source]

Bases: DataFrame

Frame containing tool choice configuration for LLM function calling.

Parameters:: tool_choice – Tool choice setting - ‘none’, ‘auto’, ‘required’, or specific tool dict.

tool_choice: Literal['none', 'auto', 'required'] | dict

class pipecat.frames.frames.LLMEnablePromptCachingFrame(enable: bool)[source]

Bases: DataFrame

Frame to enable/disable prompt caching in LLMs.

Parameters:: enable – Whether to enable prompt caching.

enable: bool

class pipecat.frames.frames.LLMConfigureOutputFrame(skip_tts: bool)[source]

Bases: DataFrame

Frame to configure LLM output.

This frame is used to configure how the LLM produces output. For example, it can tell the LLM to generate tokens that should be added to the context but not spoken by the TTS service (if one is present in the pipeline).

Parameters:: skip_tts – Whether LLM tokens should skip the TTS service (if any).

skip_tts: bool

class pipecat.frames.frames.FunctionCallResultProperties(run_llm: bool | None = None, on_context_updated: Callable[[], Awaitable[None]] | None = None, is_final: bool = True)[source]

Bases: object

Properties for configuring function call result behavior.

Parameters:

run_llm – Whether to run the LLM after receiving this result.
on_context_updated – Callback to execute when context is updated.
is_final – Whether this is the final result for the function call. When False the result is treated as an intermediate update. Defaults to True. Only meaningful for async function calls (cancel_on_interruption=False).

run_llm: bool | None = None

on_context_updated: Callable[[], Awaitable[None]] | None = None

is_final: bool = True

class pipecat.frames.frames.FunctionCallResultFrame(function_name: str, tool_call_id: str, arguments: Any, result: Any, run_llm: bool | None = None, properties: FunctionCallResultProperties | None = None)[source]

Bases: DataFrame, UninterruptibleFrame

Frame containing the result of an LLM function call.

This is an uninterruptible frame because once a result is generated we always want to update the context.

Parameters:

function_name – Name of the function that was executed.
tool_call_id – Unique identifier for the function call.
arguments – Arguments that were passed to the function.
result – The result returned by the function.
run_llm – Whether to run the LLM after this result.
properties – Additional properties for result handling.

function_name: str

tool_call_id: str

arguments: Any

result: Any

run_llm: bool | None = None

properties: FunctionCallResultProperties | None = None

class pipecat.frames.frames.TTSSpeakFrame(text: str, append_to_context: bool | None = None)[source]

Bases: DataFrame

Frame containing text that should be spoken by TTS.

A frame that contains text that should be spoken by the TTS service in the pipeline (if any).

Parameters:

text – The text to be spoken.
append_to_context – Whether to append the text to the context.

text: str

append_to_context: bool | None = None

class pipecat.frames.frames.OutputTransportMessageFrame(message: Any)[source]

Bases: DataFrame

Frame containing transport-specific message data.

Parameters:: message – The transport message payload.

message: Any

class pipecat.frames.frames.DTMFFrame[source]

Bases: object

Marker base class for DTMF (Dual-Tone Multi-Frequency) keypad frames.

Used only as a shared tag so that both input and output DTMF frames can be identified via isinstance(frame, DTMFFrame). The concrete frames define their own fields.

class pipecat.frames.frames.OutputDTMFFrame(button: KeypadEntry | None = None, buttons: list[KeypadEntry] | None = None)[source]

Bases: DTMFFrame, DataFrame

DTMF keypress output frame for transport queuing.

Parameters:

button – Convenience shortcut for sending a single DTMF keypad entry. Equivalent to buttons=[button]. If both buttons and button are provided, buttons takes precedence.
buttons – Sequence of one or more DTMF keypad buttons to send. Use from_string() to build this from a string like "123#".

button: KeypadEntry | None = None

buttons: list[KeypadEntry] | None = None

classmethod from_string(buttons: str, **kwargs) → OutputDTMFFrame[source]

Build an OutputDTMFFrame from a string of DTMF characters.

Parameters:

buttons – A string like "123#". Each character must be a valid KeypadEntry value.
**kwargs – Additional keyword arguments forwarded to the frame constructor.

Returns:

A frame of type cls with buttons populated as a list of KeypadEntry.

to_string() → str[source]

Return the frame’s buttons as a dial string.

Returns:: A string such as "123#" formed by concatenating the values of each KeypadEntry in buttons, or an empty string if buttons is not set.

class pipecat.frames.frames.StartFrame(audio_in_sample_rate: int = 16000, audio_out_sample_rate: int = 24000, enable_metrics: bool = False, enable_tracing: bool = False, enable_usage_metrics: bool = False, report_only_initial_ttfb: bool = False, tracing_context: TracingContext | None = None)[source]

Bases: SystemFrame

Initial frame to start pipeline processing.

This is the first frame that should be pushed down a pipeline to initialize all processors with their configuration parameters.

Parameters:

audio_in_sample_rate – Input audio sample rate in Hz.
audio_out_sample_rate – Output audio sample rate in Hz.
enable_metrics – Whether to enable performance metrics collection.
enable_tracing – Whether to enable OpenTelemetry tracing.
enable_usage_metrics – Whether to enable usage metrics collection.
report_only_initial_ttfb – Whether to report only initial time-to-first-byte.
tracing_context – Pipeline-scoped tracing context for span hierarchy.

audio_in_sample_rate: int = 16000

audio_out_sample_rate: int = 24000

enable_metrics: bool = False

enable_tracing: bool = False

enable_usage_metrics: bool = False

report_only_initial_ttfb: bool = False

tracing_context: TracingContext | None = None

class pipecat.frames.frames.CancelFrame(reason: Any | None = None)[source]

Bases: SystemFrame

Frame indicating pipeline should stop immediately.

Indicates that a pipeline needs to stop right away without processing remaining queued frames.

Parameters:: reason – Optional reason for pushing a cancel frame.

reason: Any | None = None

class pipecat.frames.frames.ErrorFrame(error: str, fatal: bool = False, processor: FrameProcessor | None = None, exception: Exception | None = None)[source]

Bases: SystemFrame

Frame notifying of errors in the pipeline.

This is used to notify upstream that an error has occurred downstream in the pipeline. A fatal error indicates the error is unrecoverable and that the bot should exit.

Parameters:

error – Description of the error that occurred.
fatal – Whether the error is fatal and requires bot shutdown.
processor – The frame processor that generated the error.
exception – The exception that occurred.

error: str

fatal: bool = False

processor: FrameProcessor | None = None

exception: Exception | None = None

class pipecat.frames.frames.FatalErrorFrame(error: str, processor: FrameProcessor | None = None, exception: Exception | None = None)[source]

Bases: ErrorFrame

Frame notifying of unrecoverable errors requiring bot shutdown.

This is used to notify upstream that an unrecoverable error has occurred and that the bot should exit immediately.

Parameters:: fatal – Always True for fatal errors.

fatal: bool = True

class pipecat.frames.frames.FrameProcessorPauseUrgentFrame(processor: FrameProcessor)[source]

Bases: SystemFrame

Frame to pause frame processing immediately.

This frame is used to pause frame processing for the given processor as fast as possible. Pausing frame processing will keep frames in the internal queue which will then be processed when frame processing is resumed with FrameProcessorResumeFrame.

Parameters:: processor – The frame processor to pause.

processor: FrameProcessor

class pipecat.frames.frames.FrameProcessorResumeUrgentFrame(processor: FrameProcessor)[source]

Bases: SystemFrame

Frame to resume frame processing immediately.

This frame is used to resume frame processing for the given processor if it was previously paused as fast as possible. After resuming frame processing all queued frames will be processed in the order received.

Parameters:: processor – The frame processor to resume.

processor: FrameProcessor

class pipecat.frames.frames.InterruptionFrame[source]

Bases: SystemFrame

Frame pushed to interrupt the pipeline.

This frame is used to interrupt the pipeline. For example, when a user starts speaking to cancel any in-progress bot output. It can also be pushed by any processor.

class pipecat.frames.frames.UserStartedSpeakingFrame[source]

Bases: SystemFrame

Frame indicating that the user turn has started.

Emitted when the user turn starts, which usually means that some transcriptions are already available.

class pipecat.frames.frames.UserStoppedSpeakingFrame[source]

Bases: SystemFrame

Frame indicating that the user turn has ended.

Emitted when the user turn ends. This usually coincides with the start of the bot turn.

class pipecat.frames.frames.UserMuteStartedFrame[source]

Bases: SystemFrame

Frame indicating that the user has been muted.

Emitted when a mute strategy activates, suppressing user frames (audio, transcription, interruption) from propagating through the pipeline.

class pipecat.frames.frames.UserMuteStoppedFrame[source]

Bases: SystemFrame

Frame indicating that the user has been unmuted.

Emitted when a mute strategy deactivates, allowing user frames to propagate through the pipeline again.

class pipecat.frames.frames.UserSpeakingFrame[source]

Bases: SystemFrame

Frame indicating the user is speaking.

Emitted by VAD to indicate the user is speaking.

class pipecat.frames.frames.VADUserStartedSpeakingFrame(start_secs: float = 0.0, timestamp: float = <factory>)[source]

Bases: SystemFrame

Frame emitted when VAD definitively detects user started speaking.

Parameters:

start_secs – The VAD start_secs duration that was used to confirm the user started speaking. This represents the speech duration that had to elapse before the VAD determined speech began.
timestamp – Wall-clock time when the VAD made its determination.

start_secs: float = 0.0

timestamp: float

class pipecat.frames.frames.VADUserStoppedSpeakingFrame(stop_secs: float = 0.0, timestamp: float = <factory>)[source]

Bases: SystemFrame

Frame emitted when VAD definitively detects user stopped speaking.

Parameters:

stop_secs – The VAD stop_secs duration that was used to confirm the user stopped speaking. This represents the silence duration that had to elapse before the VAD determined speech ended.
timestamp – Wall-clock time when the VAD made its determination.

stop_secs: float = 0.0

timestamp: float

class pipecat.frames.frames.BotStartedSpeakingFrame[source]

Bases: SystemFrame

Frame indicating the bot started speaking.

Emitted upstream and downstream by the BaseTransportOutput to indicate the bot started speaking.

class pipecat.frames.frames.BotStoppedSpeakingFrame[source]

Bases: SystemFrame

Frame indicating the bot stopped speaking.

Emitted upstream and downstream by the BaseTransportOutput to indicate the bot stopped speaking.

class pipecat.frames.frames.BotSpeakingFrame[source]

Bases: SystemFrame

Frame indicating the bot is currently speaking.

Emitted upstream and downstream by the BaseOutputTransport while the bot is still speaking. This can be used, for example, to detect when a user is idle. That is, while the bot is speaking we don’t want to trigger any user idle timeout since the user might be listening.

class pipecat.frames.frames.MetricsFrame(data: list[MetricsData])[source]

Bases: SystemFrame

Frame containing performance metrics data.

Emitted by processors that can compute metrics like latencies.

Parameters:: data – List of metrics data collected by the processor.

data: list[MetricsData]

class pipecat.frames.frames.FunctionCallFromLLM(function_name: str, tool_call_id: str, arguments: Mapping[str, Any], context: Any)[source]

Bases: object

Represents a function call returned by the LLM.

Represents a function call returned by the LLM to be registered for execution.

Parameters:

function_name – The name of the function to call.
tool_call_id – A unique identifier for the function call.
arguments – The arguments to pass to the function.
context – The LLM context when the function call was made.

function_name: str

tool_call_id: str

arguments: Mapping[str, Any]

context: Any

class pipecat.frames.frames.FunctionCallsStartedFrame(function_calls: Sequence[FunctionCallFromLLM])[source]

Bases: SystemFrame

Frame signaling that function call execution is starting.

A frame signaling that one or more function call execution is going to start.

Parameters:: function_calls – Sequence of function calls that will be executed.

function_calls: Sequence[FunctionCallFromLLM]

class pipecat.frames.frames.FunctionCallCancelFrame(function_name: str, tool_call_id: str)[source]

Bases: SystemFrame

Frame signaling that a function call has been cancelled.

Parameters:

function_name – Name of the function that was cancelled.
tool_call_id – Unique identifier for the cancelled function call.

function_name: str

tool_call_id: str

class pipecat.frames.frames.STTMuteFrame(mute: bool)[source]

Bases: SystemFrame

Frame to mute/unmute the Speech-to-Text service.

Parameters:: mute – Whether to mute (True) or unmute (False) the STT service.

mute: bool

class pipecat.frames.frames.InputTransportMessageFrame(message: Any)[source]

Bases: SystemFrame

Frame for transport messages received from external sources.

Parameters:: message – The urgent transport message payload.

message: Any

class pipecat.frames.frames.OutputTransportMessageUrgentFrame(message: Any)[source]

Bases: SystemFrame

Frame for urgent transport messages that need to be sent immediately.

Parameters:: message – The urgent transport message payload.

message: Any

Bases: SystemFrame

Frame requesting an image from a specific user.

A frame to request an image from the given user. The request might come with a text that can be later used to describe the requested image.

Parameters:

user_id – Identifier of the user to request image from.
text – An optional text associated to the image request.
append_to_context – Whether the requested image should be appended to the LLM context.
video_source – Specific video source to capture from.
function_name – Name of function that generated this request (if any).
tool_call_id – Tool call ID if generated by function call (if any).
result_callback – Optional callback to invoke when the image is retrieved.

user_id: str

text: str | None = None

append_to_context: bool | None = None

video_source: str | None = None

function_name: str | None = None

tool_call_id: str | None = None

result_callback: Any | None = None

class pipecat.frames.frames.InputAudioRawFrame(audio: bytes, sample_rate: int, num_channels: int)[source]

Bases: SystemFrame, AudioRawFrame

Raw audio input frame from transport.

A chunk of audio usually coming from an input transport. If the transport supports multiple audio sources (e.g. multiple audio tracks) the source name will be specified in transport_source.

class pipecat.frames.frames.InputImageRawFrame(image: bytes, size: tuple[int, int], format: str | None)[source]

Bases: SystemFrame, ImageRawFrame

Raw image input frame from transport.

An image usually coming from an input transport. If the transport supports multiple video sources (e.g. multiple video tracks) the source name will be specified in transport_source.

class pipecat.frames.frames.InputTextRawFrame(text: str)[source]

Bases: SystemFrame, TextFrame

Raw text input frame from transport.

Text input usually coming from user typing or programmatic text injection that should be sent to LLM services as input, similar to how InputAudioRawFrame and InputImageRawFrame represent user audio and video input.

class pipecat.frames.frames.UserAudioRawFrame(audio: bytes, sample_rate: int, num_channels: int, user_id: str = '')[source]

Bases: InputAudioRawFrame

Raw audio input frame associated with a specific user.

A chunk of audio, usually coming from an input transport, associated to a user.

Parameters:: user_id – Identifier of the user who provided this audio.

user_id: str = ''

class pipecat.frames.frames.UserImageRawFrame(image: bytes, size: tuple[int, int], format: str | None, user_id: str = '', text: str | None = None, append_to_context: bool | None = None, request: UserImageRequestFrame | None = None)[source]

Bases: InputImageRawFrame

Raw image input frame associated with a specific user.

An image associated to a user, potentially in response to an image request.

Parameters:

user_id – Identifier of the user who provided this image.
text – An optional text associated to this image.
append_to_context – Whether the requested image should be appended to the LLM context.
request – The original image request frame if this is a response.

user_id: str = ''

text: str | None = None

append_to_context: bool | None = None

request: UserImageRequestFrame | None = None

class pipecat.frames.frames.AssistantImageRawFrame(image: bytes, size: tuple[int, int], format: str | None, original_data: bytes | None = None, original_mime_type: str | None = None)[source]

Bases: OutputImageRawFrame

Frame containing an image generated by the assistant.

Contains both the raw frame for display (superclass functionality) as well as the original image, which can get used directly in LLM contexts.

Parameters:

original_data – The original image data, which can get used directly in an LLM context message without further encoding.
original_mime_type – The MIME type of the original image data.

original_data: bytes | None = None

original_mime_type: str | None = None

class pipecat.frames.frames.InputDTMFFrame(button: KeypadEntry)[source]

Bases: DTMFFrame, SystemFrame

DTMF keypress input frame from transport.

Parameters:: button – The DTMF keypad entry that was pressed.

button: KeypadEntry

class pipecat.frames.frames.OutputDTMFUrgentFrame(button: KeypadEntry | None = None, buttons: list[KeypadEntry] | None = None)[source]

Bases: DTMFFrame, SystemFrame

DTMF keypress output frame for immediate sending.

Parameters:

button – Convenience shortcut for sending a single DTMF keypad entry. Equivalent to buttons=[button]. If both buttons and button are provided, buttons takes precedence.
buttons – Sequence of one or more DTMF keypad buttons to send. Use from_string() to build this from a string like "123#".

button: KeypadEntry | None = None

buttons: list[KeypadEntry] | None = None

classmethod from_string(buttons: str, **kwargs) → OutputDTMFUrgentFrame[source]

Build an OutputDTMFUrgentFrame from a string of DTMF characters.

Parameters:

buttons – A string like "123#". Each character must be a valid KeypadEntry value.
**kwargs – Additional keyword arguments forwarded to the frame constructor.

Returns:

A frame of type cls with buttons populated as a list of KeypadEntry.

to_string() → str[source]

Return the frame’s buttons as a dial string.

Returns:: A string such as "123#" formed by concatenating the values of each KeypadEntry in buttons, or an empty string if buttons is not set.

class pipecat.frames.frames.SpeechControlParamsFrame(vad_params: VADParams | None = None, turn_params: BaseTurnParams | None = None)[source]

Bases: SystemFrame

Frame for notifying processors of speech control parameter changes.

This includes parameters for both VAD (Voice Activity Detection) and turn-taking analysis. It allows downstream processors to adjust their behavior based on updated interaction control settings.

Parameters:

vad_params – Current VAD parameters.
turn_params – Current turn-taking analysis parameters.

vad_params: VADParams | None = None

turn_params: BaseTurnParams | None = None

class pipecat.frames.frames.ServiceMetadataFrame(service_name: str)[source]

Bases: SystemFrame

Base metadata frame for services.

Broadcast by services at pipeline start to share service-specific configuration and performance characteristics with downstream processors.

Parameters:: service_name – The name of the service broadcasting this metadata.

service_name: str

class pipecat.frames.frames.STTMetadataFrame(service_name: str, ttfs_p99_latency: float)[source]

Bases: ServiceMetadataFrame

Metadata from STT service.

Broadcast by STT services to inform downstream processors (like turn strategies) about STT latency characteristics.

Parameters:: ttfs_p99_latency – Time to final segment P99 latency in seconds. This is the expected time from when speech ends to when the final transcript is received, at the 99th percentile.

ttfs_p99_latency: float

class pipecat.frames.frames.ServiceSwitcherRequestMetadataFrame(service: FrameProcessor)[source]

Bases: ControlFrame

Request a service to re-emit its metadata frames.

Used by ServiceSwitcher when switching active services to ensure downstream processors receive updated metadata from the newly active service. Services that receive this frame should re-push their metadata frame (e.g., STTMetadataFrame for STT services).

Parameters:: service – The target service that should re-emit its metadata.

service: FrameProcessor

class pipecat.frames.frames.TaskFrame[source]

Bases: ControlFrame

Base frame for task frames.

This is a base class for frames that are meant to be sent and handled upstream by the pipeline task. This might result in a corresponding frame sent downstream (e.g. InterruptionTaskFrame / InterruptionFrame or EndTaskFrame / EndFrame).

class pipecat.frames.frames.TaskSystemFrame[source]

Bases: SystemFrame

Base frame for task system frames.

This is a base class for frames that are meant to be sent and handled upstream by the pipeline task. This might result in a corresponding frame sent downstream (e.g. InterruptionTaskFrame / InterruptionFrame or EndTaskFrame / EndFrame).

class pipecat.frames.frames.EndTaskFrame(reason: Any | None = None)[source]

Bases: TaskFrame, UninterruptibleFrame

Frame to request graceful pipeline task closure.

This is used to notify the pipeline task that the pipeline should be closed nicely (flushing all the queued frames) by pushing an EndFrame downstream. This frame should be pushed upstream.

Parameters:: reason – Optional reason for pushing an end frame.

reason: Any | None = None

class pipecat.frames.frames.StopTaskFrame[source]

Bases: TaskFrame, UninterruptibleFrame

Frame to request pipeline task stop while keeping processors running.

This is used to notify the pipeline task that it should be stopped as soon as possible (flushing all the queued frames) but that the pipeline processors should be kept in a running state. This frame should be pushed upstream.

class pipecat.frames.frames.CancelTaskFrame(reason: Any | None = None)[source]

Bases: TaskSystemFrame

Frame to request immediate pipeline task cancellation.

This is used to notify the pipeline task that the pipeline should be stopped immediately by pushing a CancelFrame downstream. This frame should be pushed upstream.

Parameters:: reason – Optional reason for pushing a cancel frame.

reason: Any | None = None

class pipecat.frames.frames.InterruptionTaskFrame[source]

Bases: TaskSystemFrame

Frame indicating the pipeline should be interrupted.

This frame should be pushed upstream to indicate the pipeline should be interrupted. The pipeline task converts this into an InterruptionFrame and sends it downstream.

class pipecat.frames.frames.EndFrame(reason: Any | None = None)[source]

Bases: ControlFrame, UninterruptibleFrame

Frame indicating pipeline has ended and should shut down.

Indicates that a pipeline has ended and frame processors and pipelines should be shut down. If the transport receives this frame, it will stop sending frames to its output channel(s) and close all its threads. Note, that this is a control frame, which means it will be received in the order it was sent.

This frame is marked as UninterruptibleFrame to ensure it is not lost when an InterruptionFrame is processed. Terminal frames must survive interruption to guarantee proper pipeline shutdown.

Parameters:: reason – Optional reason for pushing an end frame.

reason: Any | None = None

class pipecat.frames.frames.StopFrame[source]

Bases: ControlFrame, UninterruptibleFrame

Frame indicating pipeline should stop but keep processors running.

Indicates that a pipeline should be stopped but that the pipeline processors should be kept in a running state. This is normally queued from the pipeline task.

This frame is marked as UninterruptibleFrame to ensure it is not lost when an InterruptionFrame is processed. Terminal frames must survive interruption to guarantee proper pipeline control.

class pipecat.frames.frames.BotConnectedFrame[source]

Bases: SystemFrame

Frame indicating the bot has connected to the transport service.

Pushed downstream by SFU transports (Daily, LiveKit, HeyGen, Tavus) when the bot successfully joins the room. Non-SFU transports do not emit this frame.

class pipecat.frames.frames.ClientConnectedFrame[source]

Bases: SystemFrame

Frame indicating that a client has connected to the transport.

Pushed downstream by the input transport when a client (participant) connects. Used by observers to measure transport readiness timing.

class pipecat.frames.frames.OutputTransportReadyFrame[source]

Bases: ControlFrame

Frame indicating that the output transport is ready.

Indicates that the output transport is ready and able to receive frames.

class pipecat.frames.frames.HeartbeatFrame(timestamp: int)[source]

Bases: ControlFrame

Frame used by pipeline task to monitor pipeline health.

This frame is used by the pipeline task as a mechanism to know if the pipeline is running properly.

Parameters:: timestamp – Timestamp when the heartbeat was generated.

timestamp: int

class pipecat.frames.frames.FrameProcessorPauseFrame(processor: FrameProcessor)[source]

Bases: ControlFrame

Frame to pause frame processing for a specific processor.

This frame is used to pause frame processing for the given processor. Pausing frame processing will keep frames in the internal queue which will then be processed when frame processing is resumed with FrameProcessorResumeFrame.

Parameters:: processor – The frame processor to pause.

processor: FrameProcessor

class pipecat.frames.frames.FrameProcessorResumeFrame(processor: FrameProcessor)[source]

Bases: ControlFrame

Frame to resume frame processing for a specific processor.

This frame is used to resume frame processing for the given processor if it was previously paused. After resuming frame processing all queued frames will be processed in the order received.

Parameters:: processor – The frame processor to resume.

processor: FrameProcessor

class pipecat.frames.frames.LLMFullResponseStartFrame[source]

Bases: ControlFrame

Frame indicating the beginning of an LLM response.

Used to indicate the beginning of an LLM response. Followed by one or more TextFrames and a final LLMFullResponseEndFrame.

skip_tts: bool | None

class pipecat.frames.frames.LLMFullResponseEndFrame[source]

Bases: ControlFrame

Frame indicating the end of an LLM response.

skip_tts: bool | None

class pipecat.frames.frames.LLMAssistantPushAggregationFrame[source]

Bases: ControlFrame

Frame that forces the LLM assistant aggregator to push its current aggregation to context.

When received by LLMAssistantAggregator, any text that has been accumulated in the aggregation buffer is immediately committed to the conversation context as an assistant message, without waiting for an LLMFullResponseEndFrame.

class pipecat.frames.frames.LLMSummarizeContextFrame(config: LLMContextSummaryConfig | None = None)[source]

Bases: ControlFrame

Frame requesting on-demand context summarization.

Push this frame into the pipeline to trigger a manual context summarization.

Parameters:: config – Optional per-request override for summary generation settings (prompt, token budget, messages to keep). If None, the summarizer’s default LLMContextSummaryConfig is used.

config: LLMContextSummaryConfig | None = None

class pipecat.frames.frames.LLMContextSummaryRequestFrame(request_id: str, context: LLMContext, min_messages_to_keep: int, target_context_tokens: int, summarization_prompt: str, summarization_timeout: float | None = None)[source]

Bases: ControlFrame

Frame requesting context summarization from an LLM service.

Sent by aggregators to LLM services when conversation context needs to be compressed. The LLM service generates a summary of older messages while preserving recent conversation history.

Parameters:

request_id – Unique identifier to match this request with its response. Used to handle async responses and avoid race conditions.
context – The full LLM context containing all messages to analyze and summarize.
min_messages_to_keep – Number of recent messages to preserve uncompressed. These messages will not be included in the summary.
target_context_tokens – Maximum token size for the generated summary. This value is passed directly to the LLM as the max_tokens parameter when generating the summary text.
summarization_prompt – System prompt instructing the LLM how to generate the summary.
summarization_timeout – Maximum time in seconds for the LLM to generate a summary. When None, a default timeout of 120s is applied.

request_id: str

context: LLMContext

min_messages_to_keep: int

target_context_tokens: int

summarization_prompt: str

summarization_timeout: float | None = None

class pipecat.frames.frames.LLMContextSummaryResultFrame(request_id: str, summary: str, last_summarized_index: int, error: str | None = None)[source]

Bases: ControlFrame, UninterruptibleFrame

Frame containing the result of context summarization.

Sent by LLM services back to aggregators after generating a summary. Contains the formatted summary message and metadata about what was summarized.

Parameters:

request_id – Identifier matching the original request. Used to correlate async responses.
summary – The formatted summary message ready to be inserted into context.
last_summarized_index – Index (0-based) of the last message that was included in the summary. Messages after this index are preserved.
error – Error message if summarization failed, None on success.

request_id: str

summary: str

last_summarized_index: int

error: str | None = None

class pipecat.frames.frames.FunctionCallInProgressFrame(function_name: str, tool_call_id: str, arguments: Any, cancel_on_interruption: bool = False, group_id: str | None = None)[source]

Bases: ControlFrame, UninterruptibleFrame

Frame signaling that a function call is currently executing.

This is an uninterruptible frame because we always want to update the context.

Parameters:

function_name – Name of the function being executed.
tool_call_id – Unique identifier for this function call.
arguments – Arguments passed to the function.
cancel_on_interruption – Whether to cancel this call if interrupted. When False the call is treated as asynchronous: the LLM continues the conversation immediately without waiting for the result, and the result is injected later via a developer message.
group_id – Identifier shared by all function calls originating from the same LLM response batch. Used to determine when the last call in a group completes so the LLM can be triggered exactly once.

function_name: str

tool_call_id: str

arguments: Any

cancel_on_interruption: bool = False

group_id: str | None = None

class pipecat.frames.frames.VisionFullResponseStartFrame[source]

Bases: LLMFullResponseStartFrame

Frame indicating the beginning of a vision model response.

Used to indicate the beginning of a vision model response. Followed by one or more VisionTextFrames and a final VisionFullResponseEndFrame.

class pipecat.frames.frames.VisionFullResponseEndFrame[source]

Bases: LLMFullResponseEndFrame

Frame indicating the end of a Vision model response.

class pipecat.frames.frames.TTSStartedFrame(context_id: str | None = None)[source]

Bases: ControlFrame

Frame indicating the beginning of a TTS response.

Used to indicate the beginning of a TTS response. Following TTSAudioRawFrames are part of the TTS response until a TTSStoppedFrame. These frames can be used for aggregating audio frames in a transport to optimize the size of frames sent to the session, without needing to control this in the TTS service.

Parameters:: context_id – Unique identifier for this TTS context.

context_id: str | None = None

class pipecat.frames.frames.TTSStoppedFrame(context_id: str | None = None)[source]

Bases: ControlFrame

Frame indicating the end of a TTS response.

Parameters:: context_id – Unique identifier for this TTS context.

context_id: str | None = None

class pipecat.frames.frames.ServiceUpdateSettingsFrame(settings: Mapping[str, Any]=<factory>, delta: ServiceSettings | None = None, service: FrameProcessor | None = None)[source]

Bases: ControlFrame, UninterruptibleFrame

Base frame for updating service settings.

Supports both a settings dict (for backward compatibility) and a delta object. When both are provided, delta takes precedence.

Parameters:

settings –
Dictionary of setting name to value mappings.

Deprecated since version 0.0.104: Use delta with a typed settings object instead.
delta – ServiceSettings delta-mode object describing the fields to change.
service – Optional target service instance. When provided, only that service will apply the settings; other services will forward the frame unchanged.

settings: Mapping[str, Any]

delta: ServiceSettings | None = None

service: FrameProcessor | None = None

class pipecat.frames.frames.LLMUpdateSettingsFrame(settings: Mapping[str, Any]=<factory>, delta: ServiceSettings | None = None, service: FrameProcessor | None = None)[source]

Bases: ServiceUpdateSettingsFrame

Frame for updating LLM service settings.

class pipecat.frames.frames.TTSUpdateSettingsFrame(settings: Mapping[str, Any]=<factory>, delta: ServiceSettings | None = None, service: FrameProcessor | None = None)[source]

Bases: ServiceUpdateSettingsFrame

Frame for updating TTS service settings.

class pipecat.frames.frames.STTUpdateSettingsFrame(settings: Mapping[str, Any]=<factory>, delta: ServiceSettings | None = None, service: FrameProcessor | None = None)[source]

Bases: ServiceUpdateSettingsFrame

Frame for updating STT service settings.

class pipecat.frames.frames.UserIdleTimeoutUpdateFrame(timeout: float)[source]

Bases: SystemFrame

Frame for updating the user idle timeout at runtime.

Setting timeout to 0 disables idle detection. Setting a positive value enables it.

Parameters:: timeout – The new idle timeout in seconds. 0 disables idle detection.

timeout: float

class pipecat.frames.frames.VADParamsUpdateFrame(params: VADParams)[source]

Bases: ControlFrame

Frame for updating VAD parameters.

A control frame containing a request to update VAD params. Intended to be pushed upstream from RTVI processor.

Parameters:: params – New VAD parameters to apply.

params: VADParams

class pipecat.frames.frames.FilterControlFrame[source]

Bases: ControlFrame

Base control frame for audio filter operations.

class pipecat.frames.frames.FilterUpdateSettingsFrame(settings: Mapping[str, Any])[source]

Bases: FilterControlFrame

Frame for updating audio filter settings.

Parameters:: settings – Dictionary of filter setting name to value mappings.

settings: Mapping[str, Any]

class pipecat.frames.frames.FilterEnableFrame(enable: bool)[source]

Bases: FilterControlFrame

Frame for enabling/disabling audio filters at runtime.

Parameters:: enable – Whether to enable (True) or disable (False) the filter.

enable: bool

class pipecat.frames.frames.MixerControlFrame[source]

Bases: ControlFrame

Base control frame for audio mixer operations.

class pipecat.frames.frames.MixerUpdateSettingsFrame(settings: Mapping[str, Any])[source]

Bases: MixerControlFrame

Frame for updating audio mixer settings.

Parameters:: settings – Dictionary of mixer setting name to value mappings.

settings: Mapping[str, Any]

class pipecat.frames.frames.MixerEnableFrame(enable: bool)[source]

Bases: MixerControlFrame

Frame for enabling/disabling audio mixer at runtime.

Parameters:: enable – Whether to enable (True) or disable (False) the mixer.

enable: bool

class pipecat.frames.frames.ServiceSwitcherFrame[source]

Bases: ControlFrame

A base class for frames that affect ServiceSwitcher behavior.

class pipecat.frames.frames.ManuallySwitchServiceFrame(service: FrameProcessor)[source]

Bases: ServiceSwitcherFrame

A frame to request a manual switch in the active service in a ServiceSwitcher.

Handled by ServiceSwitcherStrategyManual to switch the active service.

service: FrameProcessor