rtvi

RTVI (Real-Time Voice Interface) protocol implementation for Pipecat.

class pipecat.processors.frameworks.rtvi.RTVIClientMessageFrame(msg_id: str, type: str, data: Any | None = None)[source]

Bases: SystemFrame

A frame for sending messages from the client to the RTVI server.

This frame is meant for custom messaging from the client to the server and expects a server-response message.

msg_id: str
type: str
data: Any | None = None
class pipecat.processors.frameworks.rtvi.RTVIFunctionCallReportLevel(*values)[source]

Bases: StrEnum

Level of detail to include in function call RTVI events.

Controls what information is exposed in function call events for security.

Values:

  • DISABLED: No events emitted for this function call.

  • NONE: Events only with tool_call_id, no function name or metadata (most secure).

  • NAME: Events with function name, no arguments or results.

  • FULL: Events with function name, arguments, and results.

DISABLED = 'disabled'
NONE = 'none'
NAME = 'name'
FULL = 'full'
class pipecat.processors.frameworks.rtvi.RTVIObserver(rtvi: RTVIProcessor | None = None, *, params: RTVIObserverParams | None = None, **kwargs)[source]

Bases: BaseObserver

Pipeline frame observer for RTVI server message handling.

This observer monitors pipeline frames and converts them into appropriate RTVI messages for client communication. It handles various frame types including speech events, transcriptions, LLM responses, and TTS events.

Note

This observer only handles outgoing messages. Incoming RTVI client messages are handled by the RTVIProcessor.

__init__(rtvi: RTVIProcessor | None = None, *, params: RTVIObserverParams | None = None, **kwargs)[source]

Initialize the RTVI observer.

Parameters:
  • rtvi – The RTVI processor to push frames to.

  • params – Settings to enable/disable specific messages.

  • **kwargs – Additional arguments passed to parent class.

add_bot_output_transformer(transform_function: Callable[[str, AggregationType | str], Awaitable[str]], aggregation_type: AggregationType | str = '*')[source]

Transform text for a specific aggregation type before sending as Bot Output or TTS.

Parameters:
  • transform_function – The function to apply for transformation. It should take the text and aggregation type as input and return the transformed text, for example: async def my_transform(text: str, aggregation_type: str) -> str

  • aggregation_type – The type of aggregation to transform. Defaults to "*", which applies the transform to all text before it is sent to the client.
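As a minimal sketch of a function matching the transform_function signature above, the following hypothetical transformer masks digits in the text. The name mask_digits and the masking rule are illustrative assumptions, not part of Pipecat. The aggregation type is annotated as a plain str so the snippet runs without Pipecat installed.

```python
import asyncio
import re


async def mask_digits(text: str, aggregation_type: str) -> str:
    """Hypothetical transformer: replace every digit with '*'.

    The aggregation type is accepted to match the expected signature but is
    not used by this particular rule.
    """
    return re.sub(r"\d", "*", text)


if __name__ == "__main__":
    # Call the coroutine directly to show its effect on a sample string.
    print(asyncio.run(mask_digits("Call 555-0100", "sentence")))
```

Given an existing RTVIObserver instance (assumed here to be named observer), registration would follow the signature above: observer.add_bot_output_transformer(mask_digits), which uses the default aggregation type "*".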

remove_bot_output_transformer(transform_function: Callable[[str, AggregationType | str], Awaitable[str]], aggregation_type: AggregationType | str = '*')[source]

Remove a text transformer for a specific aggregation type.

Parameters:
  • transform_function – The function to remove.

  • aggregation_type – The type of aggregation to remove the transformer for.

add_ignored_source(source: FrameProcessor)[source]

Ignore all frames pushed by the given processor.

Any frame whose source matches source will be silently skipped, preventing RTVI messages from being emitted for activity in that processor. Useful for suppressing events from secondary pipeline branches (e.g. a silent evaluation LLM) that should not be visible to clients.

Parameters:

source – The frame processor to ignore.

remove_ignored_source(source: FrameProcessor)[source]

Stop ignoring frames pushed by the given processor.

Reverses a previous call to add_ignored_source(). If source was not previously ignored this is a no-op.

Parameters:

source – The frame processor to stop ignoring.

async cleanup()[source]

Cleanup RTVI observer resources.

async send_rtvi_message(model: BaseModel, exclude_none: bool = True)[source]

Send an RTVI message.

By default, this pushes a transport frame. Subclasses can override this method to send RTVI messages in a different way.

Parameters:
  • model – The message to send.

  • exclude_none – Whether to exclude None values from the model dump.

async on_push_frame(data: FramePushed)[source]

Process a frame being pushed through the pipeline.

Parameters:

data – Frame push event data containing source, frame, direction, and timestamp.

class pipecat.processors.frameworks.rtvi.RTVIObserverParams(bot_output_enabled: bool = True, bot_llm_enabled: bool = True, bot_tts_enabled: bool = True, bot_speaking_enabled: bool = True, bot_audio_level_enabled: bool = False, user_llm_enabled: bool = True, user_speaking_enabled: bool = True, user_mute_enabled: bool = True, user_transcription_enabled: bool = True, user_audio_level_enabled: bool = False, metrics_enabled: bool = True, system_logs_enabled: bool = False, ignored_sources: list[~pipecat.processors.frame_processor.FrameProcessor] = <factory>, skip_aggregator_types: list[~pipecat.utils.text.base_text_aggregator.AggregationType | str] | None = None, bot_output_transforms: list[tuple[~pipecat.utils.text.base_text_aggregator.AggregationType | str, ~collections.abc.Callable[[str, ~pipecat.utils.text.base_text_aggregator.AggregationType | str], ~collections.abc.Awaitable[str]]]] | None = None, audio_level_period_secs: float = 0.15, function_call_report_level: dict[str, ~pipecat.processors.frameworks.rtvi.observer.RTVIFunctionCallReportLevel] = <factory>)[source]

Bases: object

Parameters for configuring RTVI Observer behavior.

Parameters:
  • bot_output_enabled – Indicates if bot output messages should be sent.

  • bot_llm_enabled – Indicates if the bot’s LLM messages should be sent.

  • bot_tts_enabled – Indicates if the bot’s TTS messages should be sent.

  • bot_speaking_enabled – Indicates if the bot’s started/stopped speaking messages should be sent.

  • bot_audio_level_enabled – Indicates if the bot’s audio level messages should be sent.

  • user_llm_enabled – Indicates if the user’s LLM input messages should be sent.

  • user_speaking_enabled – Indicates if the user’s started/stopped speaking messages should be sent.

  • user_mute_enabled – Indicates if the user’s mute messages should be sent.

  • user_transcription_enabled – Indicates if the user’s transcription messages should be sent.

  • user_audio_level_enabled – Indicates if the user’s audio level messages should be sent.

  • metrics_enabled – Indicates if metrics messages should be sent.

  • system_logs_enabled – Indicates if system logs should be sent.

  • ignored_sources – List of frame processors whose frames should be silently ignored by this observer. Useful for suppressing RTVI messages from secondary pipeline branches (e.g. a silent evaluation LLM) that should not be visible to clients. Sources can also be added and removed dynamically via add_ignored_source() and remove_ignored_source().

  • skip_aggregator_types – List of aggregation types to skip sending as TTS/output messages. Note: if you use this to avoid sending sensitive information, also disable bot_llm_enabled so that the same text does not leak through LLM messages.

  • bot_output_transforms – A list of callables to transform text just before sending it to TTS. Each callable takes the aggregated text and its type, and returns the transformed text. To register, provide a list of tuples of (aggregation_type | "*", transform_function).

  • audio_level_period_secs – How often audio levels should be sent if enabled.

  • function_call_report_level

    Controls what information is exposed in function call events for security. A dict mapping function names to levels, where "*" sets the default level for unlisted functions:

    function_call_report_level={
        "*": RTVIFunctionCallReportLevel.NONE,  # Default: events with no metadata
        "get_weather": RTVIFunctionCallReportLevel.FULL,  # Expose everything
    }
    
    Levels:
    • DISABLED: No events emitted for this function.

    • NONE: Events with tool_call_id only (most secure when events are needed).

    • NAME: Adds function name to events.

    • FULL: Adds function name, arguments, and results.

    Defaults to {"*": RTVIFunctionCallReportLevel.NONE}.
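The "*" fallback described above can be pictured with a small stand-alone sketch. The ReportLevel enum is a local stand-in that mirrors the documented values, and resolve_level is a hypothetical helper. This reference does not show how RTVIObserver performs the lookup internally, so treat the lookup logic as an assumption.

```python
from enum import Enum


class ReportLevel(str, Enum):
    """Local stand-in mirroring the values of RTVIFunctionCallReportLevel."""

    DISABLED = "disabled"
    NONE = "none"
    NAME = "name"
    FULL = "full"


def resolve_level(levels: dict, function_name: str) -> ReportLevel:
    """Hypothetical lookup: exact function name first, then "*", then NONE."""
    if function_name in levels:
        return levels[function_name]
    return levels.get("*", ReportLevel.NONE)


levels = {
    "*": ReportLevel.NONE,  # default for unlisted functions
    "get_weather": ReportLevel.FULL,  # expose name, arguments, and results
}

print(resolve_level(levels, "get_weather").value)
print(resolve_level(levels, "book_flight").value)
```

Here "book_flight" is an invented function name. Because it is not listed, it falls back to the "*" entry.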

bot_output_enabled: bool = True
bot_llm_enabled: bool = True
bot_tts_enabled: bool = True
bot_speaking_enabled: bool = True
bot_audio_level_enabled: bool = False
user_llm_enabled: bool = True
user_speaking_enabled: bool = True
user_mute_enabled: bool = True
user_transcription_enabled: bool = True
user_audio_level_enabled: bool = False
metrics_enabled: bool = True
system_logs_enabled: bool = False
ignored_sources: list[FrameProcessor]
skip_aggregator_types: list[AggregationType | str] | None = None
bot_output_transforms: list[tuple[AggregationType | str, Callable[[str, AggregationType | str], Awaitable[str]]]] | None = None
audio_level_period_secs: float = 0.15
function_call_report_level: dict[str, RTVIFunctionCallReportLevel]
class pipecat.processors.frameworks.rtvi.RTVIProcessor(*, transport: BaseTransport | None = None, **kwargs)[source]

Bases: FrameProcessor

Main processor for handling RTVI protocol messages and actions.

This processor manages the RTVI protocol communication including client-server handshaking, configuration management, action execution, and message routing. It serves as the central hub for RTVI protocol operations.

__init__(*, transport: BaseTransport | None = None, **kwargs)[source]

Initialize the RTVI processor.

Parameters:
  • transport – Transport layer for communication.

  • **kwargs – Additional arguments passed to parent class.

create_rtvi_observer(*, params: RTVIObserverParams | None = None, **kwargs)[source]

Creates a new RTVI Observer.

Parameters:
  • params – Settings to enable/disable specific messages.

  • **kwargs – Additional arguments passed to the observer.

Returns:

A new RTVI observer.

async set_client_ready()[source]

Mark the client as ready and trigger the ready event.

async set_bot_ready(about: Mapping[str, Any] = None)[source]

Mark the bot as ready and send the bot-ready message.

Parameters:

about – Optional information about the bot to include in the ready message. If left as None, the Pipecat library and version will be used.

async interrupt_bot()[source]

Send a bot interruption frame upstream.

async send_server_message(data: Any)[source]

Send a server message to the client.

async send_server_response(client_msg: ClientMessage, data: Any)[source]

Send a server response for a given client message.

async send_error_response(client_msg: ClientMessage, error: str)[source]

Send an error response for a given client message.

async send_error(error: str)[source]

Send an error message to the client.

Parameters:

error – The error message to send.

async push_transport_message(model: BaseModel, exclude_none: bool = True)[source]

Push a transport message frame.

async handle_message(message: Message)[source]

Handle an incoming RTVI message.

Parameters:

message – The RTVI message to handle.

async handle_function_call(params: FunctionCallParams)[source]

Handle a function call from the LLM.

Parameters:

params – The function call parameters.

Deprecated since version 0.0.102: This method is deprecated. Function call events are now automatically sent by RTVIObserver using the llm-function-call-in-progress event. Configure reporting level via RTVIObserverParams.function_call_report_level.

async process_frame(frame: Frame, direction: FrameDirection)[source]

Process incoming frames through the RTVI processor.

Parameters:
  • frame – The frame to process.

  • direction – The direction of frame flow.

class pipecat.processors.frameworks.rtvi.RTVIServerMessageFrame(data: Any)[source]

Bases: SystemFrame

A frame for sending server messages to the client.

Parameters:

data – The message data to send to the client.

data: Any
class pipecat.processors.frameworks.rtvi.RTVIServerResponseFrame(client_msg: RTVIClientMessageFrame, data: Any | None = None, error: str | None = None)[source]

Bases: SystemFrame

A frame for responding to a client RTVI message.

This frame should be sent in response to an RTVIClientMessageFrame and include the original RTVIClientMessageFrame to ensure the response is properly attributed to the original request. To respond with an error, set the error field to a string describing the error. This will result in the client receiving an error-response message instead of a server-response message.

client_msg: RTVIClientMessageFrame
data: Any | None = None
error: str | None = None
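The request/response pairing described above can be sketched with plain dataclasses. ClientMessage and ServerResponse are local stand-ins that reuse the documented field names, and response_kind is a hypothetical helper. The exact check Pipecat applies to the error field is an assumption here.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ClientMessage:
    """Stand-in for RTVIClientMessageFrame (same field names)."""

    msg_id: str
    type: str
    data: Optional[Any] = None


@dataclass
class ServerResponse:
    """Stand-in for RTVIServerResponseFrame (same field names)."""

    client_msg: ClientMessage
    data: Optional[Any] = None
    error: Optional[str] = None


def response_kind(response: ServerResponse) -> str:
    """Return the message type the client would receive (assumed rule)."""
    if response.error is not None:
        return "error-response"
    return "server-response"


# The response carries the original request so it can be attributed to it.
request = ClientMessage(msg_id="1", type="get-status")
ok = ServerResponse(client_msg=request, data={"status": "ready"})
failed = ServerResponse(client_msg=request, error="unknown request type")

print(response_kind(ok), response_kind(failed))
```

The message type "get-status" and the payloads are invented for illustration. In a real pipeline the actual frame classes would be constructed with the same keyword arguments.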

Submodules