observer

RTVI observer for converting pipeline frames to outgoing RTVI messages.

class pipecat.processors.frameworks.rtvi.observer.RTVIFunctionCallReportLevel(*values)[source]

Bases: StrEnum

Level of detail to include in function call RTVI events.

Controls what information is exposed in function call events for security.

Values:

- DISABLED: No events emitted for this function call.
- NONE: Events only with tool_call_id, no function name or metadata (most secure).
- NAME: Events with function name, no arguments or results.
- FULL: Events with function name, arguments, and results.

DISABLED = 'disabled'

NONE = 'none'

NAME = 'name'

FULL = 'full'
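
As a rough illustration of what each level exposes, the sketch below filters a function call event payload by level. It uses a stand-in enum with the same values so that it runs without pipecat installed. The `filter_event` helper and every payload key other than `tool_call_id` are hypothetical and are not part of the library.

```python
from enum import Enum
from typing import Optional


class ReportLevel(str, Enum):
    """Stand-in mirroring the values of RTVIFunctionCallReportLevel."""

    DISABLED = "disabled"
    NONE = "none"
    NAME = "name"
    FULL = "full"


def filter_event(level: ReportLevel, event: dict) -> Optional[dict]:
    """Return the part of `event` that `level` allows, or None for no event.

    Hypothetical helper: key names besides tool_call_id are assumptions.
    """
    if level is ReportLevel.DISABLED:
        return None  # no event is emitted at all
    allowed = {"tool_call_id"}
    if level in (ReportLevel.NAME, ReportLevel.FULL):
        allowed.add("function_name")
    if level is ReportLevel.FULL:
        allowed.update({"arguments", "result"})
    return {key: value for key, value in event.items() if key in allowed}


event = {
    "tool_call_id": "call_1",
    "function_name": "get_weather",
    "arguments": {"city": "Paris"},
    "result": {"temp_c": 21},
}

for level in ReportLevel:
    print(level.value, filter_event(level, event))
```

Because the enum is string-based, a level can also be built from its string value, for example `ReportLevel("full")`.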

class pipecat.processors.frameworks.rtvi.observer.RTVIObserverParams(bot_output_enabled: bool = True, bot_llm_enabled: bool = True, bot_tts_enabled: bool = True, bot_speaking_enabled: bool = True, bot_audio_level_enabled: bool = False, user_llm_enabled: bool = True, user_speaking_enabled: bool = True, user_mute_enabled: bool = True, user_transcription_enabled: bool = True, user_audio_level_enabled: bool = False, metrics_enabled: bool = True, system_logs_enabled: bool = False, ignored_sources: list[~pipecat.processors.frame_processor.FrameProcessor] = <factory>, skip_aggregator_types: list[~pipecat.utils.text.base_text_aggregator.AggregationType | str] | None = None, bot_output_transforms: list[tuple[~pipecat.utils.text.base_text_aggregator.AggregationType | str, ~collections.abc.Callable[[str, ~pipecat.utils.text.base_text_aggregator.AggregationType | str], ~collections.abc.Awaitable[str]]]] | None = None, audio_level_period_secs: float = 0.15, function_call_report_level: dict[str, ~pipecat.processors.frameworks.rtvi.observer.RTVIFunctionCallReportLevel] = <factory>)[source]

Bases: object

Parameters for configuring RTVI Observer behavior.

Parameters:

bot_output_enabled – Indicates if bot output messages should be sent.
bot_llm_enabled – Indicates if the bot’s LLM messages should be sent.
bot_tts_enabled – Indicates if the bot’s TTS messages should be sent.
bot_speaking_enabled – Indicates if the bot’s started/stopped speaking messages should be sent.
bot_audio_level_enabled – Indicates if bot’s audio level messages should be sent.
user_llm_enabled – Indicates if the user’s LLM input messages should be sent.
user_speaking_enabled – Indicates if the user’s started/stopped speaking messages should be sent.
user_transcription_enabled – Indicates if user’s transcription messages should be sent.
user_audio_level_enabled – Indicates if user’s audio level messages should be sent.
metrics_enabled – Indicates if metrics messages should be sent.
system_logs_enabled – Indicates if system logs should be sent.
ignored_sources – List of frame processors whose frames should be silently ignored by this observer. Useful for suppressing RTVI messages from secondary pipeline branches (e.g. a silent evaluation LLM) that should not be visible to clients. Sources can also be added and removed dynamically via add_ignored_source() and remove_ignored_source().
skip_aggregator_types – List of aggregation types to skip sending as tts/output messages. Note: if you use this to keep sensitive information from being sent, also set bot_llm_enabled to False so that it does not leak through LLM messages.
bot_output_transforms – A list of callables to transform text just before sending it to TTS. Each callable takes the aggregated text and its type, and returns the transformed text. To register, provide a list of tuples of (aggregation_type | "*", transform_function).
audio_level_period_secs – How often audio levels should be sent if enabled.
function_call_report_level –
Controls what information is exposed in function call events for security. A dict mapping function names to levels, where "*" sets the default level for unlisted functions:
```
function_call_report_level={
    "*": RTVIFunctionCallReportLevel.NONE,  # Default: events with no metadata
    "get_weather": RTVIFunctionCallReportLevel.FULL,  # Expose everything
}
```
Levels:
- DISABLED: No events emitted for this function.
- NONE: Events with tool_call_id only (the most secure option when events are needed).
- NAME: Adds function name to events.
- FULL: Adds function name, arguments, and results.
Defaults to {"*": RTVIFunctionCallReportLevel.NONE}.
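
The lookup described above, a per-function entry with "*" as the fallback, might be modeled as follows. This is a minimal sketch and not pipecat's implementation: `ReportLevel` is a stand-in enum, `resolve_level` and the `charge_card` function name are hypothetical, and falling back to NONE when "*" is absent is an assumption based on the documented default.

```python
from enum import Enum


class ReportLevel(str, Enum):
    """Stand-in mirroring the values of RTVIFunctionCallReportLevel."""

    DISABLED = "disabled"
    NONE = "none"
    NAME = "name"
    FULL = "full"


def resolve_level(levels: dict, function_name: str) -> ReportLevel:
    """Return the level for `function_name`, falling back to the "*" entry.

    Assumption: if "*" is missing as well, NONE is used.
    """
    if function_name in levels:
        return levels[function_name]
    return levels.get("*", ReportLevel.NONE)


levels = {
    "*": ReportLevel.NONE,                   # default for unlisted functions
    "get_weather": ReportLevel.FULL,         # expose everything
    "charge_card": ReportLevel.DISABLED,     # hypothetical: emit no events
}

print(resolve_level(levels, "get_weather").value)
print(resolve_level(levels, "some_other_tool").value)
```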

bot_output_enabled: bool = True

bot_llm_enabled: bool = True

bot_tts_enabled: bool = True

bot_speaking_enabled: bool = True

bot_audio_level_enabled: bool = False

user_llm_enabled: bool = True

user_speaking_enabled: bool = True

user_mute_enabled: bool = True

user_transcription_enabled: bool = True

user_audio_level_enabled: bool = False

metrics_enabled: bool = True

system_logs_enabled: bool = False

ignored_sources: list[FrameProcessor]

skip_aggregator_types: list[AggregationType | str] | None = None

bot_output_transforms: list[tuple[AggregationType | str, Callable[[str, AggregationType | str], Awaitable[str]]]] | None = None

audio_level_period_secs: float = 0.15

function_call_report_level: dict[str, RTVIFunctionCallReportLevel]

class pipecat.processors.frameworks.rtvi.observer.RTVIObserver(rtvi: RTVIProcessor | None = None, *, params: RTVIObserverParams | None = None, **kwargs)[source]

Bases: BaseObserver

Pipeline frame observer for RTVI server message handling.

This observer monitors pipeline frames and converts them into appropriate RTVI messages for client communication. It handles various frame types including speech events, transcriptions, LLM responses, and TTS events.

Note

This observer only handles outgoing messages. Incoming RTVI client messages are handled by the RTVIProcessor.

__init__(rtvi: RTVIProcessor | None = None, *, params: RTVIObserverParams | None = None, **kwargs)[source]

Initialize the RTVI observer.

Parameters:

rtvi – The RTVI processor to push frames to.
params – Settings to enable/disable specific messages.
**kwargs – Additional arguments passed to parent class.

add_bot_output_transformer(transform_function: Callable[[str, AggregationType | str], Awaitable[str]], aggregation_type: AggregationType | str = '*')[source]

Transform text for a specific aggregation type before sending as Bot Output or TTS.

Parameters:

transform_function – The function to apply for transformation. It should take the text and aggregation type as input and return the transformed text, e.g. async def my_transform(text: str, aggregation_type: str) -> str
aggregation_type – The type of aggregation to transform. Defaults to "*", which applies the transform to all text before it is sent to the client.
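
A transform matching the documented signature could look like the sketch below. `mask_digits` and `shout` are example transforms. `apply_transforms` is a hypothetical model of how registered (aggregation_type, function) pairs might be applied in order; it is not the observer's actual code, and the aggregation type names "sentence" and "word" are used only for illustration.

```python
import asyncio
import re


async def mask_digits(text: str, aggregation_type: str) -> str:
    """Example transform: replace every digit with '#'."""
    return re.sub(r"\d", "#", text)


async def shout(text: str, aggregation_type: str) -> str:
    """Example transform: upper-case the text."""
    return text.upper()


async def apply_transforms(text: str, aggregation_type: str, transforms: list) -> str:
    """Hypothetical model: run each transform whose registered type matches.

    `transforms` holds (aggregation_type | "*", transform_function) tuples;
    "*" matches every aggregation type.
    """
    for registered_type, transform in transforms:
        if registered_type in ("*", aggregation_type):
            text = await transform(text, aggregation_type)
    return text


transforms = [("*", mask_digits), ("sentence", shout)]

print(asyncio.run(apply_transforms("pin 1234", "sentence", transforms)))
print(asyncio.run(apply_transforms("pin 1234", "word", transforms)))
```

With a real observer instance, a function such as `mask_digits` would be registered through `add_bot_output_transformer(mask_digits)`, which uses the default "*" aggregation type.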

remove_bot_output_transformer(transform_function: Callable[[str, AggregationType | str], Awaitable[str]], aggregation_type: AggregationType | str = '*')[source]

Remove a text transformer for a specific aggregation type.

Parameters:

transform_function – The function to remove.
aggregation_type – The type of aggregation to remove the transformer for.

add_ignored_source(source: FrameProcessor)[source]

Ignore all frames pushed by the given processor.

Any frame whose source matches source will be silently skipped, preventing RTVI messages from being emitted for activity in that processor. Useful for suppressing events from secondary pipeline branches (e.g. a silent evaluation LLM) that should not be visible to clients.

Parameters:

source – The frame processor to ignore.

remove_ignored_source(source: FrameProcessor)[source]

Stop ignoring frames pushed by the given processor.

Reverses a previous call to add_ignored_source(). If source was not previously ignored this is a no-op.

Parameters:

source – The frame processor to stop ignoring.
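
The add/remove semantics described for ignored sources can be modeled with a set, as in the sketch below. `IgnoredSources` is a hypothetical stand-in written for illustration, not pipecat code, and plain `object()` instances take the place of frame processors.

```python
class IgnoredSources:
    """Hypothetical model of an observer's ignore list."""

    def __init__(self):
        self._ignored = set()

    def add(self, source) -> None:
        """Start ignoring frames pushed by `source`."""
        self._ignored.add(source)

    def remove(self, source) -> None:
        """Stop ignoring `source`; a no-op if it was never ignored."""
        self._ignored.discard(source)

    def should_skip(self, source) -> bool:
        """Return True if frames from `source` should be silently skipped."""
        return source in self._ignored


main_llm = object()   # stand-in for the client-facing processor
eval_llm = object()   # stand-in for a silent evaluation branch

ignored = IgnoredSources()
ignored.remove(eval_llm)  # not yet ignored, so nothing happens
ignored.add(eval_llm)

skip_eval = ignored.should_skip(eval_llm)
skip_main = ignored.should_skip(main_llm)

ignored.remove(eval_llm)
skip_eval_after_remove = ignored.should_skip(eval_llm)
```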

async cleanup()[source]

Clean up RTVI observer resources.

async send_rtvi_message(model: BaseModel, exclude_none: bool = True)[source]

Send an RTVI message.

By default, this pushes a transport frame. Subclasses can override this method to send RTVI messages in a different way.

Parameters:

model – The message to send.
exclude_none – Whether to exclude None values from the model dump.
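
The override pattern might be sketched as follows, using stand-ins so that the example runs without pipecat or pydantic. `StubObserver`, `StubMessage`, and `CollectingObserver` are all hypothetical: a real subclass would derive from RTVIObserver and receive a pydantic BaseModel, and the `exclude_none` handling here only approximates what a model dump would do.

```python
import asyncio
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class StubMessage:
    """Stand-in for a pydantic message model (hypothetical fields)."""

    type: str
    data: Optional[str] = None


class StubObserver:
    """Stand-in exposing only the overridable hook."""

    async def send_rtvi_message(self, model, exclude_none: bool = True):
        raise NotImplementedError("the real observer pushes a transport frame")


class CollectingObserver(StubObserver):
    """Subclass that records messages in memory instead of pushing a frame."""

    def __init__(self):
        self.sent = []

    async def send_rtvi_message(self, model, exclude_none: bool = True):
        payload = asdict(model)
        if exclude_none:
            # Drop fields that were left unset (None).
            payload = {k: v for k, v in payload.items() if v is not None}
        self.sent.append(payload)


observer = CollectingObserver()
asyncio.run(observer.send_rtvi_message(StubMessage(type="example")))
asyncio.run(observer.send_rtvi_message(StubMessage(type="example"), exclude_none=False))
```

Collecting messages in a list like this is mainly useful in tests; other overrides could forward the payload over a different channel.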

async on_push_frame(data: FramePushed)[source]

Process a frame being pushed through the pipeline.

Parameters:

data – Frame push event data containing source, frame, direction, and timestamp.