events

Event models and data structures for OpenAI Realtime API communication.

class pipecat.services.openai.realtime.events.AudioFormat(*, type: str)[source]

Bases: BaseModel

Base class for audio format configuration.

type: str
class pipecat.services.openai.realtime.events.PCMAudioFormat(*, type: Literal['audio/pcm'] = 'audio/pcm', rate: Literal[24000] = 24000)[source]

Bases: AudioFormat

PCM audio format configuration.

Parameters:
  • type – Audio format type, always “audio/pcm”.

  • rate – Sample rate, always 24000 for PCM.

type: Literal['audio/pcm']
rate: Literal[24000]
class pipecat.services.openai.realtime.events.PCMUAudioFormat(*, type: Literal['audio/pcmu'] = 'audio/pcmu')[source]

Bases: AudioFormat

PCMU (G.711 μ-law) audio format configuration.

Parameters:

type – Audio format type, always “audio/pcmu”.

type: Literal['audio/pcmu']
class pipecat.services.openai.realtime.events.PCMAAudioFormat(*, type: Literal['audio/pcma'] = 'audio/pcma')[source]

Bases: AudioFormat

PCMA (G.711 A-law) audio format configuration.

Parameters:

type – Audio format type, always “audio/pcma”.

type: Literal['audio/pcma']
class pipecat.services.openai.realtime.events.InputAudioTranscription(model: str | None = 'gpt-4o-transcribe', language: str | None = None, prompt: str | None = None)[source]

Bases: BaseModel

Configuration for audio transcription settings.

model: str
language: str | None
prompt: str | None
__init__(model: str | None = 'gpt-4o-transcribe', language: str | None = None, prompt: str | None = None)[source]

Initialize InputAudioTranscription.

Parameters:
  • model – Transcription model to use (e.g., “gpt-4o-transcribe”, “whisper-1”).

  • language – Optional language code for transcription.

  • prompt – Optional transcription hint text.

class pipecat.services.openai.realtime.events.TurnDetection(*, type: Literal['server_vad'] | None = 'server_vad', threshold: float | None = 0.5, prefix_padding_ms: int | None = 300, silence_duration_ms: int | None = 500)[source]

Bases: BaseModel

Server-side voice activity detection configuration.

Parameters:
  • type – Detection type, must be “server_vad”.

  • threshold – Voice activity detection threshold (0.0-1.0). Defaults to 0.5.

  • prefix_padding_ms – Padding before speech starts in milliseconds. Defaults to 300.

  • silence_duration_ms – Silence duration to detect speech end in milliseconds. Defaults to 500.

type: Literal['server_vad'] | None
threshold: float | None
prefix_padding_ms: int | None
silence_duration_ms: int | None
class pipecat.services.openai.realtime.events.SemanticTurnDetection(*, type: Literal['semantic_vad'] | None = 'semantic_vad', eagerness: Literal['low', 'medium', 'high', 'auto'] | None = None, create_response: bool | None = None, interrupt_response: bool | None = None)[source]

Bases: BaseModel

Semantic-based turn detection configuration.

Parameters:
  • type – Detection type, must be “semantic_vad”.

  • eagerness – Turn detection eagerness level. Can be “low”, “medium”, “high”, or “auto”.

  • create_response – Whether to automatically create responses on turn detection.

  • interrupt_response – Whether to interrupt ongoing responses on turn detection.

type: Literal['semantic_vad'] | None
eagerness: Literal['low', 'medium', 'high', 'auto'] | None
create_response: bool | None
interrupt_response: bool | None
class pipecat.services.openai.realtime.events.InputAudioNoiseReduction(*, type: Literal['near_field', 'far_field'] | None)[source]

Bases: BaseModel

Input audio noise reduction configuration.

Parameters:

type – Noise reduction type for different microphone scenarios.

type: Literal['near_field', 'far_field'] | None
class pipecat.services.openai.realtime.events.AudioInput(*, format: PCMAudioFormat | PCMUAudioFormat | PCMAAudioFormat | None = None, transcription: InputAudioTranscription | None = None, noise_reduction: InputAudioNoiseReduction | None = None, turn_detection: TurnDetection | SemanticTurnDetection | bool | None = None)[source]

Bases: BaseModel

Audio input configuration.

Parameters:
  • format – The format of the input audio.

  • transcription – Configuration for input audio transcription.

  • noise_reduction – Configuration for input audio noise reduction.

  • turn_detection – Configuration for turn detection, or False to disable.

format: PCMAudioFormat | PCMUAudioFormat | PCMAAudioFormat | None
transcription: InputAudioTranscription | None
noise_reduction: InputAudioNoiseReduction | None
turn_detection: TurnDetection | SemanticTurnDetection | bool | None
class pipecat.services.openai.realtime.events.AudioOutput(*, format: PCMAudioFormat | PCMUAudioFormat | PCMAAudioFormat | None = None, voice: str | None = None, speed: float | None = None)[source]

Bases: BaseModel

Audio output configuration.

Parameters:
  • format – The format of the output audio.

  • voice – The voice the model uses to respond.

  • speed – The speed of the model’s spoken response.

format: PCMAudioFormat | PCMUAudioFormat | PCMAAudioFormat | None
voice: str | None
speed: float | None
class pipecat.services.openai.realtime.events.AudioConfiguration(*, input: AudioInput | None = None, output: AudioOutput | None = None)[source]

Bases: BaseModel

Audio configuration for input and output.

Parameters:
  • input – Configuration for input audio.

  • output – Configuration for output audio.

input: AudioInput | None
output: AudioOutput | None
class pipecat.services.openai.realtime.events.SessionProperties(*, type: Literal['realtime'] | None = 'realtime', object: Literal['realtime.session'] | None = None, id: str | None = None, model: str | None = None, output_modalities: list[Literal['text', 'audio']] | None = None, instructions: str | None = None, audio: AudioConfiguration | None = None, tools: ToolsSchema | list[dict] | None = None, tool_choice: Literal['auto', 'none', 'required'] | None = None, max_output_tokens: int | Literal['inf'] | None = None, tracing: Literal['auto'] | dict | None = None, prompt: dict | None = None, expires_at: int | None = None, include: list[str] | None = None)[source]

Bases: BaseModel

Configuration properties for an OpenAI Realtime session.

Parameters:
  • type – The type of session, always “realtime”.

  • object – Object type identifier, always “realtime.session”.

  • id – Unique identifier for the session.

  • model – The Realtime model used for this session. Note: The model is set at connection time via model arg in __init__ and cannot be changed during the session.

  • output_modalities – The set of modalities the model can respond with.

  • instructions – System instructions for the assistant.

  • audio – Configuration for input and output audio.

  • tools – Available function tools for the assistant.

  • tool_choice – Tool usage strategy (“auto”, “none”, or “required”).

  • max_output_tokens – Maximum tokens in response or “inf” for unlimited.

  • tracing – Configuration options for tracing.

  • prompt – Reference to a prompt template and its variables.

  • expires_at – Session expiration timestamp.

  • include – Additional fields to include in server outputs.

type: Literal['realtime'] | None
object: Literal['realtime.session'] | None
id: str | None
model: str | None
output_modalities: list[Literal['text', 'audio']] | None
instructions: str | None
audio: AudioConfiguration | None
tools: ToolsSchema | list[dict] | None
tool_choice: Literal['auto', 'none', 'required'] | None
max_output_tokens: int | Literal['inf'] | None
tracing: Literal['auto'] | dict | None
prompt: dict | None
expires_at: int | None
include: list[str] | None
class pipecat.services.openai.realtime.events.ItemContent(*, type: Literal['text', 'audio', 'input_text', 'input_audio', 'input_image', 'output_text', 'output_audio'], text: str | None = None, audio: str | None = None, transcript: str | None = None, image_url: str | None = None, detail: Literal['auto', 'low', 'high'] | None = None)[source]

Bases: BaseModel

Content within a conversation item.

Parameters:
  • type – Content type (text, audio, input_text, input_audio, input_image, output_text, or output_audio).

  • text – Text content for text-based items.

  • audio – Base64-encoded audio data for audio items.

  • transcript – Transcribed text for audio items.

  • image_url – Base64-encoded image data as a data URI for input_image items.

  • detail – Detail level for image processing (“auto”, “low”, or “high”).

type: Literal['text', 'audio', 'input_text', 'input_audio', 'input_image', 'output_text', 'output_audio']
text: str | None
audio: str | None
transcript: str | None
image_url: str | None
detail: Literal['auto', 'low', 'high'] | None
class pipecat.services.openai.realtime.events.ConversationItem(*, id: str = <factory>, object: ~typing.Literal['realtime.item'] | None = None, type: ~typing.Literal['message', 'function_call', 'function_call_output'], status: ~typing.Literal['completed', 'in_progress', 'incomplete'] | None = None, role: ~typing.Literal['user', 'assistant', 'system'] | None = None, content: list[~pipecat.services.openai.realtime.events.ItemContent] | None = None, call_id: str | None = None, name: str | None = None, arguments: str | None = None, output: str | None = None)[source]

Bases: BaseModel

A conversation item in the realtime session.

Parameters:
  • id – Unique identifier for the item, auto-generated if not provided.

  • object – Object type identifier for the realtime API.

  • type – Item type (message, function_call, or function_call_output).

  • status – Current status of the item.

  • role – Speaker role for message items (user, assistant, or system).

  • content – Content list for message items.

  • call_id – Function call identifier for function_call items.

  • name – Function name for function_call items.

  • arguments – Function arguments as JSON string for function_call items.

  • output – Function output as JSON string for function_call_output items.

id: str
object: Literal['realtime.item'] | None
type: Literal['message', 'function_call', 'function_call_output']
status: Literal['completed', 'in_progress', 'incomplete'] | None
role: Literal['user', 'assistant', 'system'] | None
content: list[ItemContent] | None
call_id: str | None
name: str | None
arguments: str | None
output: str | None
class pipecat.services.openai.realtime.events.RealtimeConversation(*, id: str, object: Literal['realtime.conversation'])[source]

Bases: BaseModel

A realtime conversation session.

Parameters:
  • id – Unique identifier for the conversation.

  • object – Object type identifier, always “realtime.conversation”.

id: str
object: Literal['realtime.conversation']
class pipecat.services.openai.realtime.events.ResponseProperties(*, output_modalities: list[Literal['text', 'audio']] | None = ['audio'], instructions: str | None = None, audio: AudioConfiguration | None = None, tools: list[dict] | None = None, tool_choice: Literal['auto', 'none', 'required'] | None = None, temperature: float | None = None, max_output_tokens: int | Literal['inf'] | None = None)[source]

Bases: BaseModel

Properties for configuring assistant responses.

Parameters:
  • output_modalities – Output modalities for the response. Must be either [“text”] or [“audio”]. Defaults to [“audio”].

  • instructions – Specific instructions for this response.

  • audio – Audio configuration for this response.

  • tools – Available tools for this response.

  • tool_choice – Tool usage strategy for this response.

  • temperature – Sampling temperature for this response.

  • max_output_tokens – Maximum tokens for this response.

output_modalities: list[Literal['text', 'audio']] | None
instructions: str | None
audio: AudioConfiguration | None
tools: list[dict] | None
tool_choice: Literal['auto', 'none', 'required'] | None
temperature: float | None
max_output_tokens: int | Literal['inf'] | None
class pipecat.services.openai.realtime.events.RealtimeError(*, type: str, code: str | None = '', message: str, param: str | None = None, event_id: str | None = None)[source]

Bases: BaseModel

Error information from the realtime API.

Parameters:
  • type – Error type identifier.

  • code – Specific error code.

  • message – Human-readable error message.

  • param – Parameter name that caused the error, if applicable.

  • event_id – Event ID associated with the error, if applicable.

type: str
code: str | None
message: str
param: str | None
event_id: str | None
class pipecat.services.openai.realtime.events.ClientEvent(*, event_id: str = <factory>)[source]

Bases: BaseModel

Base class for client events sent to the realtime API.

Parameters:

event_id – Unique identifier for the event, auto-generated if not provided.

event_id: str
class pipecat.services.openai.realtime.events.SessionUpdateEvent(*, event_id: str = <factory>, type: Literal['session.update'] = 'session.update', session: SessionProperties)[source]

Bases: ClientEvent

Event to update session properties.

Parameters:
  • type – Event type, always “session.update”.

  • session – Updated session properties.

type: Literal['session.update']
session: SessionProperties
model_dump(*args, **kwargs) dict[str, Any][source]

Serialize the event to a dictionary.

Handles special serialization for turn_detection where False becomes null.

Parameters:
  • *args – Positional arguments passed to parent model_dump.

  • **kwargs – Keyword arguments passed to parent model_dump.

Returns:

Dictionary representation of the event.

class pipecat.services.openai.realtime.events.InputAudioBufferAppendEvent(*, event_id: str = <factory>, type: Literal['input_audio_buffer.append'] = 'input_audio_buffer.append', audio: str)[source]

Bases: ClientEvent

Event to append audio data to the input buffer.

Parameters:
  • type – Event type, always “input_audio_buffer.append”.

  • audio – Base64-encoded audio data to append.

type: Literal['input_audio_buffer.append']
audio: str
class pipecat.services.openai.realtime.events.InputAudioBufferCommitEvent(*, event_id: str = <factory>, type: Literal['input_audio_buffer.commit'] = 'input_audio_buffer.commit')[source]

Bases: ClientEvent

Event to commit the current input audio buffer.

Parameters:

type – Event type, always “input_audio_buffer.commit”.

type: Literal['input_audio_buffer.commit']
class pipecat.services.openai.realtime.events.InputAudioBufferClearEvent(*, event_id: str = <factory>, type: Literal['input_audio_buffer.clear'] = 'input_audio_buffer.clear')[source]

Bases: ClientEvent

Event to clear the input audio buffer.

Parameters:

type – Event type, always “input_audio_buffer.clear”.

type: Literal['input_audio_buffer.clear']
class pipecat.services.openai.realtime.events.ConversationItemCreateEvent(*, event_id: str = <factory>, type: Literal['conversation.item.create'] = 'conversation.item.create', previous_item_id: str | None = None, item: ConversationItem)[source]

Bases: ClientEvent

Event to create a new conversation item.

Parameters:
  • type – Event type, always “conversation.item.create”.

  • previous_item_id – ID of the item to insert after, if any.

  • item – The conversation item to create.

type: Literal['conversation.item.create']
previous_item_id: str | None
item: ConversationItem
class pipecat.services.openai.realtime.events.ConversationItemTruncateEvent(*, event_id: str = <factory>, type: Literal['conversation.item.truncate'] = 'conversation.item.truncate', item_id: str, content_index: int, audio_end_ms: int)[source]

Bases: ClientEvent

Event to truncate a conversation item’s audio content.

Parameters:
  • type – Event type, always “conversation.item.truncate”.

  • item_id – ID of the item to truncate.

  • content_index – Index of the content to truncate within the item.

  • audio_end_ms – End time in milliseconds for the truncated audio.

type: Literal['conversation.item.truncate']
item_id: str
content_index: int
audio_end_ms: int
class pipecat.services.openai.realtime.events.ConversationItemDeleteEvent(*, event_id: str = <factory>, type: Literal['conversation.item.delete'] = 'conversation.item.delete', item_id: str)[source]

Bases: ClientEvent

Event to delete a conversation item.

Parameters:
  • type – Event type, always “conversation.item.delete”.

  • item_id – ID of the item to delete.

type: Literal['conversation.item.delete']
item_id: str
class pipecat.services.openai.realtime.events.ConversationItemRetrieveEvent(*, event_id: str = <factory>, type: Literal['conversation.item.retrieve'] = 'conversation.item.retrieve', item_id: str)[source]

Bases: ClientEvent

Event to retrieve a conversation item by ID.

Parameters:
  • type – Event type, always “conversation.item.retrieve”.

  • item_id – ID of the item to retrieve.

type: Literal['conversation.item.retrieve']
item_id: str
class pipecat.services.openai.realtime.events.ResponseCreateEvent(*, event_id: str = <factory>, type: Literal['response.create'] = 'response.create', response: ResponseProperties | None = None)[source]

Bases: ClientEvent

Event to create a new assistant response.

Parameters:
  • type – Event type, always “response.create”.

  • response – Optional response configuration properties.

type: Literal['response.create']
response: ResponseProperties | None
class pipecat.services.openai.realtime.events.ResponseCancelEvent(*, event_id: str = <factory>, type: Literal['response.cancel'] = 'response.cancel')[source]

Bases: ClientEvent

Event to cancel the current assistant response.

Parameters:

type – Event type, always “response.cancel”.

type: Literal['response.cancel']
class pipecat.services.openai.realtime.events.ServerEvent(*, event_id: str, type: str)[source]

Bases: BaseModel

Base class for server events received from the realtime API.

Parameters:
  • event_id – Unique identifier for the event.

  • type – Type of the server event.

event_id: str
type: str
class pipecat.services.openai.realtime.events.SessionCreatedEvent(*, event_id: str, type: Literal['session.created'], session: SessionProperties)[source]

Bases: ServerEvent

Event indicating a session has been created.

Parameters:
  • type – Event type, always “session.created”.

  • session – The created session properties.

type: Literal['session.created']
session: SessionProperties
class pipecat.services.openai.realtime.events.SessionUpdatedEvent(*, event_id: str, type: Literal['session.updated'], session: SessionProperties)[source]

Bases: ServerEvent

Event indicating a session has been updated.

Parameters:
  • type – Event type, always “session.updated”.

  • session – The updated session properties.

type: Literal['session.updated']
session: SessionProperties
class pipecat.services.openai.realtime.events.ConversationCreated(*, event_id: str, type: Literal['conversation.created'], conversation: RealtimeConversation)[source]

Bases: ServerEvent

Event indicating a conversation has been created.

Parameters:
  • type – Event type, always “conversation.created”.

  • conversation – The created conversation.

type: Literal['conversation.created']
conversation: RealtimeConversation
class pipecat.services.openai.realtime.events.ConversationItemAdded(*, event_id: str, type: Literal['conversation.item.added'], previous_item_id: str | None = None, item: ConversationItem)[source]

Bases: ServerEvent

Event indicating a conversation item has been added.

Parameters:
  • type – Event type, always “conversation.item.added”.

  • previous_item_id – ID of the previous item, if any.

  • item – The added conversation item.

type: Literal['conversation.item.added']
previous_item_id: str | None
item: ConversationItem
class pipecat.services.openai.realtime.events.ConversationItemDone(*, event_id: str, type: Literal['conversation.item.done'], previous_item_id: str | None = None, item: ConversationItem)[source]

Bases: ServerEvent

Event indicating a conversation item is done processing.

Parameters:
  • type – Event type, always “conversation.item.done”.

  • previous_item_id – ID of the previous item, if any.

  • item – The completed conversation item.

type: Literal['conversation.item.done']
previous_item_id: str | None
item: ConversationItem
class pipecat.services.openai.realtime.events.ConversationItemInputAudioTranscriptionDelta(*, event_id: str, type: Literal['conversation.item.input_audio_transcription.delta'], item_id: str, content_index: int, delta: str)[source]

Bases: ServerEvent

Event containing incremental input audio transcription.

Parameters:
  • type – Event type, always “conversation.item.input_audio_transcription.delta”.

  • item_id – ID of the conversation item being transcribed.

  • content_index – Index of the content within the item.

  • delta – Incremental transcription text.

type: Literal['conversation.item.input_audio_transcription.delta']
item_id: str
content_index: int
delta: str
class pipecat.services.openai.realtime.events.ConversationItemInputAudioTranscriptionCompleted(*, event_id: str, type: Literal['conversation.item.input_audio_transcription.completed'], item_id: str, content_index: int, transcript: str)[source]

Bases: ServerEvent

Event indicating input audio transcription is complete.

Parameters:
  • type – Event type, always “conversation.item.input_audio_transcription.completed”.

  • item_id – ID of the conversation item that was transcribed.

  • content_index – Index of the content within the item.

  • transcript – Complete transcription text.

type: Literal['conversation.item.input_audio_transcription.completed']
item_id: str
content_index: int
transcript: str
class pipecat.services.openai.realtime.events.ConversationItemInputAudioTranscriptionFailed(*, event_id: str, type: Literal['conversation.item.input_audio_transcription.failed'], item_id: str, content_index: int, error: RealtimeError)[source]

Bases: ServerEvent

Event indicating input audio transcription failed.

Parameters:
  • type – Event type, always “conversation.item.input_audio_transcription.failed”.

  • item_id – ID of the conversation item that failed transcription.

  • content_index – Index of the content within the item.

  • error – Error details for the transcription failure.

type: Literal['conversation.item.input_audio_transcription.failed']
item_id: str
content_index: int
error: RealtimeError
class pipecat.services.openai.realtime.events.ConversationItemTruncated(*, event_id: str, type: Literal['conversation.item.truncated'], item_id: str, content_index: int, audio_end_ms: int)[source]

Bases: ServerEvent

Event indicating a conversation item has been truncated.

Parameters:
  • type – Event type, always “conversation.item.truncated”.

  • item_id – ID of the truncated conversation item.

  • content_index – Index of the content within the item.

  • audio_end_ms – End time in milliseconds for the truncated audio.

type: Literal['conversation.item.truncated']
item_id: str
content_index: int
audio_end_ms: int
class pipecat.services.openai.realtime.events.ConversationItemDeleted(*, event_id: str, type: Literal['conversation.item.deleted'], item_id: str)[source]

Bases: ServerEvent

Event indicating a conversation item has been deleted.

Parameters:
  • type – Event type, always “conversation.item.deleted”.

  • item_id – ID of the deleted conversation item.

type: Literal['conversation.item.deleted']
item_id: str
class pipecat.services.openai.realtime.events.ConversationItemRetrieved(*, event_id: str, type: Literal['conversation.item.retrieved'], item: ConversationItem)[source]

Bases: ServerEvent

Event containing a retrieved conversation item.

Parameters:
  • type – Event type, always “conversation.item.retrieved”.

  • item – The retrieved conversation item.

type: Literal['conversation.item.retrieved']
item: ConversationItem
class pipecat.services.openai.realtime.events.ResponseCreated(*, event_id: str, type: str)[source]

Bases: ServerEvent

Event indicating an assistant response has been created.

Parameters:
  • type – Event type, always “response.created”.

  • response – The created response object.

type: Literal['response.created']
response: Response
class pipecat.services.openai.realtime.events.ResponseDone(*, event_id: str, type: str)[source]

Bases: ServerEvent

Event indicating an assistant response is complete.

Parameters:
  • type – Event type, always “response.done”.

  • response – The completed response object.

type: Literal['response.done']
response: Response
class pipecat.services.openai.realtime.events.ResponseOutputItemAdded(*, event_id: str, type: Literal['response.output_item.added'], response_id: str, output_index: int, item: ConversationItem)[source]

Bases: ServerEvent

Event indicating an output item has been added to a response.

Parameters:
  • type – Event type, always “response.output_item.added”.

  • response_id – ID of the response.

  • output_index – Index of the output item.

  • item – The added conversation item.

type: Literal['response.output_item.added']
response_id: str
output_index: int
item: ConversationItem
class pipecat.services.openai.realtime.events.ResponseOutputItemDone(*, event_id: str, type: Literal['response.output_item.done'], response_id: str, output_index: int, item: ConversationItem)[source]

Bases: ServerEvent

Event indicating an output item is complete.

Parameters:
  • type – Event type, always “response.output_item.done”.

  • response_id – ID of the response.

  • output_index – Index of the output item.

  • item – The completed conversation item.

type: Literal['response.output_item.done']
response_id: str
output_index: int
item: ConversationItem
class pipecat.services.openai.realtime.events.ResponseContentPartAdded(*, event_id: str, type: Literal['response.content_part.added'], response_id: str, item_id: str, output_index: int, content_index: int, part: ItemContent)[source]

Bases: ServerEvent

Event indicating a content part has been added to a response.

Parameters:
  • type – Event type, always “response.content_part.added”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • content_index – Index of the content part.

  • part – The added content part.

type: Literal['response.content_part.added']
response_id: str
item_id: str
output_index: int
content_index: int
part: ItemContent
class pipecat.services.openai.realtime.events.ResponseContentPartDone(*, event_id: str, type: Literal['response.content_part.done'], response_id: str, item_id: str, output_index: int, content_index: int, part: ItemContent)[source]

Bases: ServerEvent

Event indicating a content part is complete.

Parameters:
  • type – Event type, always “response.content_part.done”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • content_index – Index of the content part.

  • part – The completed content part.

type: Literal['response.content_part.done']
response_id: str
item_id: str
output_index: int
content_index: int
part: ItemContent
class pipecat.services.openai.realtime.events.ResponseTextDelta(*, event_id: str, type: Literal['response.output_text.delta'], response_id: str, item_id: str, output_index: int, content_index: int, delta: str)[source]

Bases: ServerEvent

Event containing incremental text from a response.

Parameters:
  • type – Event type, always “response.output_text.delta”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • content_index – Index of the content part.

  • delta – Incremental text content.

type: Literal['response.output_text.delta']
response_id: str
item_id: str
output_index: int
content_index: int
delta: str
class pipecat.services.openai.realtime.events.ResponseTextDone(*, event_id: str, type: Literal['response.output_text.done'], response_id: str, item_id: str, output_index: int, content_index: int, text: str)[source]

Bases: ServerEvent

Event indicating text content is complete.

Parameters:
  • type – Event type, always “response.output_text.done”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • content_index – Index of the content part.

  • text – Complete text content.

type: Literal['response.output_text.done']
response_id: str
item_id: str
output_index: int
content_index: int
text: str
class pipecat.services.openai.realtime.events.ResponseAudioTranscriptDelta(*, event_id: str, type: Literal['response.output_audio_transcript.delta'], response_id: str, item_id: str, output_index: int, content_index: int, delta: str)[source]

Bases: ServerEvent

Event containing incremental audio transcript from a response.

Parameters:
  • type – Event type, always “response.output_audio_transcript.delta”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • content_index – Index of the content part.

  • delta – Incremental transcript text.

type: Literal['response.output_audio_transcript.delta']
response_id: str
item_id: str
output_index: int
content_index: int
delta: str
class pipecat.services.openai.realtime.events.ResponseAudioTranscriptDone(*, event_id: str, type: Literal['response.output_audio_transcript.done'], response_id: str, item_id: str, output_index: int, content_index: int, transcript: str)[source]

Bases: ServerEvent

Event indicating audio transcript is complete.

Parameters:
  • type – Event type, always “response.output_audio_transcript.done”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • content_index – Index of the content part.

  • transcript – Complete transcript text.

type: Literal['response.output_audio_transcript.done']
response_id: str
item_id: str
output_index: int
content_index: int
transcript: str
class pipecat.services.openai.realtime.events.ResponseAudioDelta(*, event_id: str, type: Literal['response.output_audio.delta'], response_id: str, item_id: str, output_index: int, content_index: int, delta: str)[source]

Bases: ServerEvent

Event containing incremental audio data from a response.

Parameters:
  • type – Event type, always “response.output_audio.delta”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • content_index – Index of the content part.

  • delta – Base64-encoded incremental audio data.

type: Literal['response.output_audio.delta']
response_id: str
item_id: str
output_index: int
content_index: int
delta: str
class pipecat.services.openai.realtime.events.ResponseAudioDone(*, event_id: str, type: Literal['response.output_audio.done'], response_id: str, item_id: str, output_index: int, content_index: int)[source]

Bases: ServerEvent

Event indicating audio content is complete.

Parameters:
  • type – Event type, always “response.output_audio.done”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • content_index – Index of the content part.

type: Literal['response.output_audio.done']
response_id: str
item_id: str
output_index: int
content_index: int
class pipecat.services.openai.realtime.events.ResponseFunctionCallArgumentsDelta(*, event_id: str, type: Literal['response.function_call_arguments.delta'], response_id: str, item_id: str, output_index: int, call_id: str, delta: str)[source]

Bases: ServerEvent

Event containing incremental function call arguments.

Parameters:
  • type – Event type, always “response.function_call_arguments.delta”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • call_id – ID of the function call.

  • delta – Incremental function arguments as JSON.

type: Literal['response.function_call_arguments.delta']
response_id: str
item_id: str
output_index: int
call_id: str
delta: str
class pipecat.services.openai.realtime.events.ResponseFunctionCallArgumentsDone(*, event_id: str, type: Literal['response.function_call_arguments.done'], response_id: str, item_id: str, output_index: int, call_id: str, arguments: str)[source]

Bases: ServerEvent

Event indicating function call arguments are complete.

Parameters:
  • type – Event type, always “response.function_call_arguments.done”.

  • response_id – ID of the response.

  • item_id – ID of the conversation item.

  • output_index – Index of the output item.

  • call_id – ID of the function call.

  • arguments – Complete function arguments as JSON string.

type: Literal['response.function_call_arguments.done']
response_id: str
item_id: str
output_index: int
call_id: str
arguments: str
class pipecat.services.openai.realtime.events.InputAudioBufferSpeechStarted(*, event_id: str, type: Literal['input_audio_buffer.speech_started'], audio_start_ms: int, item_id: str)[source]

Bases: ServerEvent

Event indicating speech has started in the input audio buffer.

Parameters:
  • type – Event type, always “input_audio_buffer.speech_started”.

  • audio_start_ms – Start time of speech in milliseconds.

  • item_id – ID of the associated conversation item.

type: Literal['input_audio_buffer.speech_started']
audio_start_ms: int
item_id: str
class pipecat.services.openai.realtime.events.InputAudioBufferSpeechStopped(*, event_id: str, type: Literal['input_audio_buffer.speech_stopped'], audio_end_ms: int, item_id: str)[source]

Bases: ServerEvent

Event indicating speech has stopped in the input audio buffer.

Parameters:
  • type – Event type, always “input_audio_buffer.speech_stopped”.

  • audio_end_ms – End time of speech in milliseconds.

  • item_id – ID of the associated conversation item.

type: Literal['input_audio_buffer.speech_stopped']
audio_end_ms: int
item_id: str
class pipecat.services.openai.realtime.events.InputAudioBufferCommitted(*, event_id: str, type: Literal['input_audio_buffer.committed'], previous_item_id: str | None = None, item_id: str)[source]

Bases: ServerEvent

Event indicating the input audio buffer has been committed.

Parameters:
  • type – Event type, always “input_audio_buffer.committed”.

  • previous_item_id – ID of the previous item, if any.

  • item_id – ID of the committed conversation item.

type: Literal['input_audio_buffer.committed']
previous_item_id: str | None
item_id: str
class pipecat.services.openai.realtime.events.InputAudioBufferCleared(*, event_id: str, type: Literal['input_audio_buffer.cleared'])[source]

Bases: ServerEvent

Event indicating the input audio buffer has been cleared.

Parameters:

type – Event type, always “input_audio_buffer.cleared”.

type: Literal['input_audio_buffer.cleared']
class pipecat.services.openai.realtime.events.ErrorEvent(*, event_id: str, type: Literal['error'], error: RealtimeError)[source]

Bases: ServerEvent

Event indicating an error occurred.

Parameters:
  • type – Event type, always “error”.

  • error – Error details.

type: Literal['error']
error: RealtimeError
class pipecat.services.openai.realtime.events.RateLimitsUpdated(*, event_id: str, type: Literal['rate_limits.updated'], rate_limits: list[dict[str, Any]])[source]

Bases: ServerEvent

Event indicating rate limits have been updated.

Parameters:
  • type – Event type, always “rate_limits.updated”.

  • rate_limits – List of rate limit information.

type: Literal['rate_limits.updated']
rate_limits: list[dict[str, Any]]
class pipecat.services.openai.realtime.events.CachedTokensDetails(*, text_tokens: int | None = 0, audio_tokens: int | None = 0)[source]

Bases: BaseModel

Details about cached tokens.

Parameters:
  • text_tokens – Number of cached text tokens.

  • audio_tokens – Number of cached audio tokens.

text_tokens: int | None
audio_tokens: int | None
class pipecat.services.openai.realtime.events.TokenDetails(*, cached_tokens: int | None = 0, text_tokens: int | None = 0, audio_tokens: int | None = 0, cached_tokens_details: CachedTokensDetails | None = None, image_tokens: int | None = 0, **extra_data: Any)[source]

Bases: BaseModel

Detailed token usage information.

Parameters:
  • cached_tokens – Number of cached tokens used. Defaults to 0.

  • text_tokens – Number of text tokens used. Defaults to 0.

  • audio_tokens – Number of audio tokens used. Defaults to 0.

  • cached_tokens_details – Detailed breakdown of cached tokens.

  • image_tokens – Number of image tokens used (for input only).

cached_tokens: int | None
text_tokens: int | None
audio_tokens: int | None
cached_tokens_details: CachedTokensDetails | None
image_tokens: int | None
class pipecat.services.openai.realtime.events.Usage(*, total_tokens: int, input_tokens: int, output_tokens: int, input_token_details: TokenDetails, output_token_details: TokenDetails)[source]

Bases: BaseModel

Token usage statistics for a response.

Parameters:
  • total_tokens – Total number of tokens used.

  • input_tokens – Number of input tokens used.

  • output_tokens – Number of output tokens used.

  • input_token_details – Detailed breakdown of input token usage.

  • output_token_details – Detailed breakdown of output token usage.

total_tokens: int
input_tokens: int
output_tokens: int
input_token_details: TokenDetails
output_token_details: TokenDetails
class pipecat.services.openai.realtime.events.Response(*, id: str, object: Literal['realtime.response'], status: Literal['completed', 'in_progress', 'incomplete', 'cancelled', 'failed'], status_details: Any, output: list[ConversationItem], output_modalities: list[Literal['text', 'audio']] | None = None, max_output_tokens: int | Literal['inf'] | None = None, audio: AudioConfiguration | None = None, usage: Usage | None = None, voice: str | None = None, temperature: float | None = None, output_audio_format: str | None = None)[source]

Bases: BaseModel

A complete assistant response.

Parameters:
  • id – Unique identifier for the response.

  • object – Object type, always “realtime.response”.

  • status – Current status of the response.

  • status_details – Additional status information.

  • output – List of conversation items in the response.

  • conversation_id – Which conversation the response is added to.

  • output_modalities – The set of modalities the model used to respond.

  • max_output_tokens – Maximum number of output tokens used.

  • audio – Audio configuration for the response.

  • usage – Token usage statistics for the response.

  • voice – The voice the model used to respond.

  • temperature – Sampling temperature used for the response.

  • output_audio_format – The format of output audio.

id: str
object: Literal['realtime.response']
status: Literal['completed', 'in_progress', 'incomplete', 'cancelled', 'failed']
status_details: Any
output: list[ConversationItem]
output_modalities: list[Literal['text', 'audio']] | None
max_output_tokens: int | Literal['inf'] | None
audio: AudioConfiguration | None
usage: Usage | None
voice: str | None
temperature: float | None
output_audio_format: str | None
pipecat.services.openai.realtime.events.parse_server_event(str)[source]

Parse a server event from JSON string.

Parameters:

str – JSON string containing the server event.

Returns:

Parsed server event object of the appropriate type.

Raises:

Exception – If the event type is unimplemented or parsing fails.