events
Event models and data structures for OpenAI Realtime API communication.
- class pipecat.services.openai.realtime.events.AudioFormat(*, type: str)[source]
Bases:
BaseModelBase class for audio format configuration.
- type: str
- class pipecat.services.openai.realtime.events.PCMAudioFormat(*, type: Literal['audio/pcm'] = 'audio/pcm', rate: Literal[24000] = 24000)[source]
Bases:
AudioFormatPCM audio format configuration.
- Parameters:
type – Audio format type, always “audio/pcm”.
rate – Sample rate, always 24000 for PCM.
- type: Literal['audio/pcm']
- rate: Literal[24000]
- class pipecat.services.openai.realtime.events.PCMUAudioFormat(*, type: Literal['audio/pcmu'] = 'audio/pcmu')[source]
Bases:
AudioFormatPCMU (G.711 μ-law) audio format configuration.
- Parameters:
type – Audio format type, always “audio/pcmu”.
- type: Literal['audio/pcmu']
- class pipecat.services.openai.realtime.events.PCMAAudioFormat(*, type: Literal['audio/pcma'] = 'audio/pcma')[source]
Bases:
AudioFormatPCMA (G.711 A-law) audio format configuration.
- Parameters:
type – Audio format type, always “audio/pcma”.
- type: Literal['audio/pcma']
- class pipecat.services.openai.realtime.events.InputAudioTranscription(model: str | None = 'gpt-4o-transcribe', language: str | None = None, prompt: str | None = None)[source]
Bases:
BaseModelConfiguration for audio transcription settings.
- model: str
- language: str | None
- prompt: str | None
- __init__(model: str | None = 'gpt-4o-transcribe', language: str | None = None, prompt: str | None = None)[source]
Initialize InputAudioTranscription.
- Parameters:
model – Transcription model to use (e.g., “gpt-4o-transcribe”, “whisper-1”).
language – Optional language code for transcription.
prompt – Optional transcription hint text.
- class pipecat.services.openai.realtime.events.TurnDetection(*, type: Literal['server_vad'] | None = 'server_vad', threshold: float | None = 0.5, prefix_padding_ms: int | None = 300, silence_duration_ms: int | None = 500)[source]
Bases:
BaseModelServer-side voice activity detection configuration.
- Parameters:
type – Detection type, must be “server_vad”.
threshold – Voice activity detection threshold (0.0-1.0). Defaults to 0.5.
prefix_padding_ms – Padding before speech starts in milliseconds. Defaults to 300.
silence_duration_ms – Silence duration to detect speech end in milliseconds. Defaults to 500.
- type: Literal['server_vad'] | None
- threshold: float | None
- prefix_padding_ms: int | None
- silence_duration_ms: int | None
- class pipecat.services.openai.realtime.events.SemanticTurnDetection(*, type: Literal['semantic_vad'] | None = 'semantic_vad', eagerness: Literal['low', 'medium', 'high', 'auto'] | None = None, create_response: bool | None = None, interrupt_response: bool | None = None)[source]
Bases:
BaseModelSemantic-based turn detection configuration.
- Parameters:
type – Detection type, must be “semantic_vad”.
eagerness – Turn detection eagerness level. Can be “low”, “medium”, “high”, or “auto”.
create_response – Whether to automatically create responses on turn detection.
interrupt_response – Whether to interrupt ongoing responses on turn detection.
- type: Literal['semantic_vad'] | None
- eagerness: Literal['low', 'medium', 'high', 'auto'] | None
- create_response: bool | None
- interrupt_response: bool | None
- class pipecat.services.openai.realtime.events.InputAudioNoiseReduction(*, type: Literal['near_field', 'far_field'] | None)[source]
Bases:
BaseModelInput audio noise reduction configuration.
- Parameters:
type – Noise reduction type for different microphone scenarios.
- type: Literal['near_field', 'far_field'] | None
- class pipecat.services.openai.realtime.events.AudioInput(*, format: PCMAudioFormat | PCMUAudioFormat | PCMAAudioFormat | None = None, transcription: InputAudioTranscription | None = None, noise_reduction: InputAudioNoiseReduction | None = None, turn_detection: TurnDetection | SemanticTurnDetection | bool | None = None)[source]
Bases:
BaseModelAudio input configuration.
- Parameters:
format – The format of the input audio.
transcription – Configuration for input audio transcription.
noise_reduction – Configuration for input audio noise reduction.
turn_detection – Configuration for turn detection, or False to disable.
- format: PCMAudioFormat | PCMUAudioFormat | PCMAAudioFormat | None
- transcription: InputAudioTranscription | None
- noise_reduction: InputAudioNoiseReduction | None
- turn_detection: TurnDetection | SemanticTurnDetection | bool | None
- class pipecat.services.openai.realtime.events.AudioOutput(*, format: PCMAudioFormat | PCMUAudioFormat | PCMAAudioFormat | None = None, voice: str | None = None, speed: float | None = None)[source]
Bases:
BaseModelAudio output configuration.
- Parameters:
format – The format of the output audio.
voice – The voice the model uses to respond.
speed – The speed of the model’s spoken response.
- format: PCMAudioFormat | PCMUAudioFormat | PCMAAudioFormat | None
- voice: str | None
- speed: float | None
- class pipecat.services.openai.realtime.events.AudioConfiguration(*, input: AudioInput | None = None, output: AudioOutput | None = None)[source]
Bases:
BaseModelAudio configuration for input and output.
- Parameters:
input – Configuration for input audio.
output – Configuration for output audio.
- input: AudioInput | None
- output: AudioOutput | None
- class pipecat.services.openai.realtime.events.SessionProperties(*, type: Literal['realtime'] | None = 'realtime', object: Literal['realtime.session'] | None = None, id: str | None = None, model: str | None = None, output_modalities: list[Literal['text', 'audio']] | None = None, instructions: str | None = None, audio: AudioConfiguration | None = None, tools: ToolsSchema | list[dict] | None = None, tool_choice: Literal['auto', 'none', 'required'] | None = None, max_output_tokens: int | Literal['inf'] | None = None, tracing: Literal['auto'] | dict | None = None, prompt: dict | None = None, expires_at: int | None = None, include: list[str] | None = None)[source]
Bases:
BaseModelConfiguration properties for an OpenAI Realtime session.
- Parameters:
type – The type of session, always “realtime”.
object – Object type identifier, always “realtime.session”.
id – Unique identifier for the session.
model – The Realtime model used for this session. Note: The model is set at connection time via model arg in __init__ and cannot be changed during the session.
output_modalities – The set of modalities the model can respond with.
instructions – System instructions for the assistant.
audio – Configuration for input and output audio.
tools – Available function tools for the assistant.
tool_choice – Tool usage strategy (“auto”, “none”, or “required”).
max_output_tokens – Maximum tokens in response or “inf” for unlimited.
tracing – Configuration options for tracing.
prompt – Reference to a prompt template and its variables.
expires_at – Session expiration timestamp.
include – Additional fields to include in server outputs.
- type: Literal['realtime'] | None
- object: Literal['realtime.session'] | None
- id: str | None
- model: str | None
- output_modalities: list[Literal['text', 'audio']] | None
- instructions: str | None
- audio: AudioConfiguration | None
- tools: ToolsSchema | list[dict] | None
- tool_choice: Literal['auto', 'none', 'required'] | None
- max_output_tokens: int | Literal['inf'] | None
- tracing: Literal['auto'] | dict | None
- prompt: dict | None
- expires_at: int | None
- include: list[str] | None
- class pipecat.services.openai.realtime.events.ItemContent(*, type: Literal['text', 'audio', 'input_text', 'input_audio', 'input_image', 'output_text', 'output_audio'], text: str | None = None, audio: str | None = None, transcript: str | None = None, image_url: str | None = None, detail: Literal['auto', 'low', 'high'] | None = None)[source]
Bases:
BaseModelContent within a conversation item.
- Parameters:
type – Content type (text, audio, input_text, input_audio, input_image, output_text, or output_audio).
text – Text content for text-based items.
audio – Base64-encoded audio data for audio items.
transcript – Transcribed text for audio items.
image_url – Base64-encoded image data as a data URI for input_image items.
detail – Detail level for image processing (“auto”, “low”, or “high”).
- type: Literal['text', 'audio', 'input_text', 'input_audio', 'input_image', 'output_text', 'output_audio']
- text: str | None
- audio: str | None
- transcript: str | None
- image_url: str | None
- detail: Literal['auto', 'low', 'high'] | None
- class pipecat.services.openai.realtime.events.ConversationItem(*, id: str = <factory>, object: ~typing.Literal['realtime.item'] | None = None, type: ~typing.Literal['message', 'function_call', 'function_call_output'], status: ~typing.Literal['completed', 'in_progress', 'incomplete'] | None = None, role: ~typing.Literal['user', 'assistant', 'system'] | None = None, content: list[~pipecat.services.openai.realtime.events.ItemContent] | None = None, call_id: str | None = None, name: str | None = None, arguments: str | None = None, output: str | None = None)[source]
Bases:
BaseModelA conversation item in the realtime session.
- Parameters:
id – Unique identifier for the item, auto-generated if not provided.
object – Object type identifier for the realtime API.
type – Item type (message, function_call, or function_call_output).
status – Current status of the item.
role – Speaker role for message items (user, assistant, or system).
content – Content list for message items.
call_id – Function call identifier for function_call items.
name – Function name for function_call items.
arguments – Function arguments as JSON string for function_call items.
output – Function output as JSON string for function_call_output items.
- id: str
- object: Literal['realtime.item'] | None
- type: Literal['message', 'function_call', 'function_call_output']
- status: Literal['completed', 'in_progress', 'incomplete'] | None
- role: Literal['user', 'assistant', 'system'] | None
- content: list[ItemContent] | None
- call_id: str | None
- name: str | None
- arguments: str | None
- output: str | None
- class pipecat.services.openai.realtime.events.RealtimeConversation(*, id: str, object: Literal['realtime.conversation'])[source]
Bases:
BaseModelA realtime conversation session.
- Parameters:
id – Unique identifier for the conversation.
object – Object type identifier, always “realtime.conversation”.
- id: str
- object: Literal['realtime.conversation']
- class pipecat.services.openai.realtime.events.ResponseProperties(*, output_modalities: list[Literal['text', 'audio']] | None = ['audio'], instructions: str | None = None, audio: AudioConfiguration | None = None, tools: list[dict] | None = None, tool_choice: Literal['auto', 'none', 'required'] | None = None, temperature: float | None = None, max_output_tokens: int | Literal['inf'] | None = None)[source]
Bases:
BaseModelProperties for configuring assistant responses.
- Parameters:
output_modalities – Output modalities for the response. Must be either [“text”] or [“audio”]. Defaults to [“audio”].
instructions – Specific instructions for this response.
audio – Audio configuration for this response.
tools – Available tools for this response.
tool_choice – Tool usage strategy for this response.
temperature – Sampling temperature for this response.
max_output_tokens – Maximum tokens for this response.
- output_modalities: list[Literal['text', 'audio']] | None
- instructions: str | None
- audio: AudioConfiguration | None
- tools: list[dict] | None
- tool_choice: Literal['auto', 'none', 'required'] | None
- temperature: float | None
- max_output_tokens: int | Literal['inf'] | None
- class pipecat.services.openai.realtime.events.RealtimeError(*, type: str, code: str | None = '', message: str, param: str | None = None, event_id: str | None = None)[source]
Bases:
BaseModelError information from the realtime API.
- Parameters:
type – Error type identifier.
code – Specific error code.
message – Human-readable error message.
param – Parameter name that caused the error, if applicable.
event_id – Event ID associated with the error, if applicable.
- type: str
- code: str | None
- message: str
- param: str | None
- event_id: str | None
- class pipecat.services.openai.realtime.events.ClientEvent(*, event_id: str = <factory>)[source]
Bases:
BaseModelBase class for client events sent to the realtime API.
- Parameters:
event_id – Unique identifier for the event, auto-generated if not provided.
- event_id: str
- class pipecat.services.openai.realtime.events.SessionUpdateEvent(*, event_id: str = <factory>, type: Literal['session.update'] = 'session.update', session: SessionProperties)[source]
Bases:
ClientEventEvent to update session properties.
- Parameters:
type – Event type, always “session.update”.
session – Updated session properties.
- type: Literal['session.update']
- session: SessionProperties
- model_dump(*args, **kwargs) dict[str, Any][source]
Serialize the event to a dictionary.
Handles special serialization for turn_detection where False becomes null.
- Parameters:
*args – Positional arguments passed to parent model_dump.
**kwargs – Keyword arguments passed to parent model_dump.
- Returns:
Dictionary representation of the event.
- class pipecat.services.openai.realtime.events.InputAudioBufferAppendEvent(*, event_id: str = <factory>, type: Literal['input_audio_buffer.append'] = 'input_audio_buffer.append', audio: str)[source]
Bases:
ClientEventEvent to append audio data to the input buffer.
- Parameters:
type – Event type, always “input_audio_buffer.append”.
audio – Base64-encoded audio data to append.
- type: Literal['input_audio_buffer.append']
- audio: str
- class pipecat.services.openai.realtime.events.InputAudioBufferCommitEvent(*, event_id: str = <factory>, type: Literal['input_audio_buffer.commit'] = 'input_audio_buffer.commit')[source]
Bases:
ClientEventEvent to commit the current input audio buffer.
- Parameters:
type – Event type, always “input_audio_buffer.commit”.
- type: Literal['input_audio_buffer.commit']
- class pipecat.services.openai.realtime.events.InputAudioBufferClearEvent(*, event_id: str = <factory>, type: Literal['input_audio_buffer.clear'] = 'input_audio_buffer.clear')[source]
Bases:
ClientEventEvent to clear the input audio buffer.
- Parameters:
type – Event type, always “input_audio_buffer.clear”.
- type: Literal['input_audio_buffer.clear']
- class pipecat.services.openai.realtime.events.ConversationItemCreateEvent(*, event_id: str = <factory>, type: Literal['conversation.item.create'] = 'conversation.item.create', previous_item_id: str | None = None, item: ConversationItem)[source]
Bases:
ClientEventEvent to create a new conversation item.
- Parameters:
type – Event type, always “conversation.item.create”.
previous_item_id – ID of the item to insert after, if any.
item – The conversation item to create.
- type: Literal['conversation.item.create']
- previous_item_id: str | None
- item: ConversationItem
- class pipecat.services.openai.realtime.events.ConversationItemTruncateEvent(*, event_id: str = <factory>, type: Literal['conversation.item.truncate'] = 'conversation.item.truncate', item_id: str, content_index: int, audio_end_ms: int)[source]
Bases:
ClientEventEvent to truncate a conversation item’s audio content.
- Parameters:
type – Event type, always “conversation.item.truncate”.
item_id – ID of the item to truncate.
content_index – Index of the content to truncate within the item.
audio_end_ms – End time in milliseconds for the truncated audio.
- type: Literal['conversation.item.truncate']
- item_id: str
- content_index: int
- audio_end_ms: int
- class pipecat.services.openai.realtime.events.ConversationItemDeleteEvent(*, event_id: str = <factory>, type: Literal['conversation.item.delete'] = 'conversation.item.delete', item_id: str)[source]
Bases:
ClientEventEvent to delete a conversation item.
- Parameters:
type – Event type, always “conversation.item.delete”.
item_id – ID of the item to delete.
- type: Literal['conversation.item.delete']
- item_id: str
- class pipecat.services.openai.realtime.events.ConversationItemRetrieveEvent(*, event_id: str = <factory>, type: Literal['conversation.item.retrieve'] = 'conversation.item.retrieve', item_id: str)[source]
Bases:
ClientEventEvent to retrieve a conversation item by ID.
- Parameters:
type – Event type, always “conversation.item.retrieve”.
item_id – ID of the item to retrieve.
- type: Literal['conversation.item.retrieve']
- item_id: str
- class pipecat.services.openai.realtime.events.ResponseCreateEvent(*, event_id: str = <factory>, type: Literal['response.create'] = 'response.create', response: ResponseProperties | None = None)[source]
Bases:
ClientEventEvent to create a new assistant response.
- Parameters:
type – Event type, always “response.create”.
response – Optional response configuration properties.
- type: Literal['response.create']
- response: ResponseProperties | None
- class pipecat.services.openai.realtime.events.ResponseCancelEvent(*, event_id: str = <factory>, type: Literal['response.cancel'] = 'response.cancel')[source]
Bases:
ClientEventEvent to cancel the current assistant response.
- Parameters:
type – Event type, always “response.cancel”.
- type: Literal['response.cancel']
- class pipecat.services.openai.realtime.events.ServerEvent(*, event_id: str, type: str)[source]
Bases:
BaseModelBase class for server events received from the realtime API.
- Parameters:
event_id – Unique identifier for the event.
type – Type of the server event.
- event_id: str
- type: str
- class pipecat.services.openai.realtime.events.SessionCreatedEvent(*, event_id: str, type: Literal['session.created'], session: SessionProperties)[source]
Bases:
ServerEventEvent indicating a session has been created.
- Parameters:
type – Event type, always “session.created”.
session – The created session properties.
- type: Literal['session.created']
- session: SessionProperties
- class pipecat.services.openai.realtime.events.SessionUpdatedEvent(*, event_id: str, type: Literal['session.updated'], session: SessionProperties)[source]
Bases:
ServerEventEvent indicating a session has been updated.
- Parameters:
type – Event type, always “session.updated”.
session – The updated session properties.
- type: Literal['session.updated']
- session: SessionProperties
- class pipecat.services.openai.realtime.events.ConversationCreated(*, event_id: str, type: Literal['conversation.created'], conversation: RealtimeConversation)[source]
Bases:
ServerEventEvent indicating a conversation has been created.
- Parameters:
type – Event type, always “conversation.created”.
conversation – The created conversation.
- type: Literal['conversation.created']
- conversation: RealtimeConversation
- class pipecat.services.openai.realtime.events.ConversationItemAdded(*, event_id: str, type: Literal['conversation.item.added'], previous_item_id: str | None = None, item: ConversationItem)[source]
Bases:
ServerEventEvent indicating a conversation item has been added.
- Parameters:
type – Event type, always “conversation.item.added”.
previous_item_id – ID of the previous item, if any.
item – The added conversation item.
- type: Literal['conversation.item.added']
- previous_item_id: str | None
- item: ConversationItem
- class pipecat.services.openai.realtime.events.ConversationItemDone(*, event_id: str, type: Literal['conversation.item.done'], previous_item_id: str | None = None, item: ConversationItem)[source]
Bases:
ServerEventEvent indicating a conversation item is done processing.
- Parameters:
type – Event type, always “conversation.item.done”.
previous_item_id – ID of the previous item, if any.
item – The completed conversation item.
- type: Literal['conversation.item.done']
- previous_item_id: str | None
- item: ConversationItem
- class pipecat.services.openai.realtime.events.ConversationItemInputAudioTranscriptionDelta(*, event_id: str, type: Literal['conversation.item.input_audio_transcription.delta'], item_id: str, content_index: int, delta: str)[source]
Bases:
ServerEventEvent containing incremental input audio transcription.
- Parameters:
type – Event type, always “conversation.item.input_audio_transcription.delta”.
item_id – ID of the conversation item being transcribed.
content_index – Index of the content within the item.
delta – Incremental transcription text.
- type: Literal['conversation.item.input_audio_transcription.delta']
- item_id: str
- content_index: int
- delta: str
- class pipecat.services.openai.realtime.events.ConversationItemInputAudioTranscriptionCompleted(*, event_id: str, type: Literal['conversation.item.input_audio_transcription.completed'], item_id: str, content_index: int, transcript: str)[source]
Bases:
ServerEventEvent indicating input audio transcription is complete.
- Parameters:
type – Event type, always “conversation.item.input_audio_transcription.completed”.
item_id – ID of the conversation item that was transcribed.
content_index – Index of the content within the item.
transcript – Complete transcription text.
- type: Literal['conversation.item.input_audio_transcription.completed']
- item_id: str
- content_index: int
- transcript: str
- class pipecat.services.openai.realtime.events.ConversationItemInputAudioTranscriptionFailed(*, event_id: str, type: Literal['conversation.item.input_audio_transcription.failed'], item_id: str, content_index: int, error: RealtimeError)[source]
Bases:
ServerEventEvent indicating input audio transcription failed.
- Parameters:
type – Event type, always “conversation.item.input_audio_transcription.failed”.
item_id – ID of the conversation item that failed transcription.
content_index – Index of the content within the item.
error – Error details for the transcription failure.
- type: Literal['conversation.item.input_audio_transcription.failed']
- item_id: str
- content_index: int
- error: RealtimeError
- class pipecat.services.openai.realtime.events.ConversationItemTruncated(*, event_id: str, type: Literal['conversation.item.truncated'], item_id: str, content_index: int, audio_end_ms: int)[source]
Bases:
ServerEventEvent indicating a conversation item has been truncated.
- Parameters:
type – Event type, always “conversation.item.truncated”.
item_id – ID of the truncated conversation item.
content_index – Index of the content within the item.
audio_end_ms – End time in milliseconds for the truncated audio.
- type: Literal['conversation.item.truncated']
- item_id: str
- content_index: int
- audio_end_ms: int
- class pipecat.services.openai.realtime.events.ConversationItemDeleted(*, event_id: str, type: Literal['conversation.item.deleted'], item_id: str)[source]
Bases:
ServerEventEvent indicating a conversation item has been deleted.
- Parameters:
type – Event type, always “conversation.item.deleted”.
item_id – ID of the deleted conversation item.
- type: Literal['conversation.item.deleted']
- item_id: str
- class pipecat.services.openai.realtime.events.ConversationItemRetrieved(*, event_id: str, type: Literal['conversation.item.retrieved'], item: ConversationItem)[source]
Bases:
ServerEventEvent containing a retrieved conversation item.
- Parameters:
type – Event type, always “conversation.item.retrieved”.
item – The retrieved conversation item.
- type: Literal['conversation.item.retrieved']
- item: ConversationItem
- class pipecat.services.openai.realtime.events.ResponseCreated(*, event_id: str, type: str)[source]
Bases:
ServerEventEvent indicating an assistant response has been created.
- Parameters:
type – Event type, always “response.created”.
response – The created response object.
- type: Literal['response.created']
- class pipecat.services.openai.realtime.events.ResponseDone(*, event_id: str, type: str)[source]
Bases:
ServerEventEvent indicating an assistant response is complete.
- Parameters:
type – Event type, always “response.done”.
response – The completed response object.
- type: Literal['response.done']
- class pipecat.services.openai.realtime.events.ResponseOutputItemAdded(*, event_id: str, type: Literal['response.output_item.added'], response_id: str, output_index: int, item: ConversationItem)[source]
Bases:
ServerEventEvent indicating an output item has been added to a response.
- Parameters:
type – Event type, always “response.output_item.added”.
response_id – ID of the response.
output_index – Index of the output item.
item – The added conversation item.
- type: Literal['response.output_item.added']
- response_id: str
- output_index: int
- item: ConversationItem
- class pipecat.services.openai.realtime.events.ResponseOutputItemDone(*, event_id: str, type: Literal['response.output_item.done'], response_id: str, output_index: int, item: ConversationItem)[source]
Bases:
ServerEventEvent indicating an output item is complete.
- Parameters:
type – Event type, always “response.output_item.done”.
response_id – ID of the response.
output_index – Index of the output item.
item – The completed conversation item.
- type: Literal['response.output_item.done']
- response_id: str
- output_index: int
- item: ConversationItem
- class pipecat.services.openai.realtime.events.ResponseContentPartAdded(*, event_id: str, type: Literal['response.content_part.added'], response_id: str, item_id: str, output_index: int, content_index: int, part: ItemContent)[source]
Bases:
ServerEventEvent indicating a content part has been added to a response.
- Parameters:
type – Event type, always “response.content_part.added”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
content_index – Index of the content part.
part – The added content part.
- type: Literal['response.content_part.added']
- response_id: str
- item_id: str
- output_index: int
- content_index: int
- part: ItemContent
- class pipecat.services.openai.realtime.events.ResponseContentPartDone(*, event_id: str, type: Literal['response.content_part.done'], response_id: str, item_id: str, output_index: int, content_index: int, part: ItemContent)[source]
Bases:
ServerEventEvent indicating a content part is complete.
- Parameters:
type – Event type, always “response.content_part.done”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
content_index – Index of the content part.
part – The completed content part.
- type: Literal['response.content_part.done']
- response_id: str
- item_id: str
- output_index: int
- content_index: int
- part: ItemContent
- class pipecat.services.openai.realtime.events.ResponseTextDelta(*, event_id: str, type: Literal['response.output_text.delta'], response_id: str, item_id: str, output_index: int, content_index: int, delta: str)[source]
Bases:
ServerEventEvent containing incremental text from a response.
- Parameters:
type – Event type, always “response.output_text.delta”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
content_index – Index of the content part.
delta – Incremental text content.
- type: Literal['response.output_text.delta']
- response_id: str
- item_id: str
- output_index: int
- content_index: int
- delta: str
- class pipecat.services.openai.realtime.events.ResponseTextDone(*, event_id: str, type: Literal['response.output_text.done'], response_id: str, item_id: str, output_index: int, content_index: int, text: str)[source]
Bases:
ServerEventEvent indicating text content is complete.
- Parameters:
type – Event type, always “response.output_text.done”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
content_index – Index of the content part.
text – Complete text content.
- type: Literal['response.output_text.done']
- response_id: str
- item_id: str
- output_index: int
- content_index: int
- text: str
- class pipecat.services.openai.realtime.events.ResponseAudioTranscriptDelta(*, event_id: str, type: Literal['response.output_audio_transcript.delta'], response_id: str, item_id: str, output_index: int, content_index: int, delta: str)[source]
Bases:
ServerEventEvent containing incremental audio transcript from a response.
- Parameters:
type – Event type, always “response.output_audio_transcript.delta”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
content_index – Index of the content part.
delta – Incremental transcript text.
- type: Literal['response.output_audio_transcript.delta']
- response_id: str
- item_id: str
- output_index: int
- content_index: int
- delta: str
- class pipecat.services.openai.realtime.events.ResponseAudioTranscriptDone(*, event_id: str, type: Literal['response.output_audio_transcript.done'], response_id: str, item_id: str, output_index: int, content_index: int, transcript: str)[source]
Bases:
ServerEventEvent indicating audio transcript is complete.
- Parameters:
type – Event type, always “response.output_audio_transcript.done”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
content_index – Index of the content part.
transcript – Complete transcript text.
- type: Literal['response.output_audio_transcript.done']
- response_id: str
- item_id: str
- output_index: int
- content_index: int
- transcript: str
- class pipecat.services.openai.realtime.events.ResponseAudioDelta(*, event_id: str, type: Literal['response.output_audio.delta'], response_id: str, item_id: str, output_index: int, content_index: int, delta: str)[source]
Bases:
ServerEventEvent containing incremental audio data from a response.
- Parameters:
type – Event type, always “response.output_audio.delta”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
content_index – Index of the content part.
delta – Base64-encoded incremental audio data.
- type: Literal['response.output_audio.delta']
- response_id: str
- item_id: str
- output_index: int
- content_index: int
- delta: str
- class pipecat.services.openai.realtime.events.ResponseAudioDone(*, event_id: str, type: Literal['response.output_audio.done'], response_id: str, item_id: str, output_index: int, content_index: int)[source]
Bases:
ServerEventEvent indicating audio content is complete.
- Parameters:
type – Event type, always “response.output_audio.done”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
content_index – Index of the content part.
- type: Literal['response.output_audio.done']
- response_id: str
- item_id: str
- output_index: int
- content_index: int
- class pipecat.services.openai.realtime.events.ResponseFunctionCallArgumentsDelta(*, event_id: str, type: Literal['response.function_call_arguments.delta'], response_id: str, item_id: str, output_index: int, call_id: str, delta: str)[source]
Bases:
ServerEventEvent containing incremental function call arguments.
- Parameters:
type – Event type, always “response.function_call_arguments.delta”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
call_id – ID of the function call.
delta – Incremental function arguments as JSON.
- type: Literal['response.function_call_arguments.delta']
- response_id: str
- item_id: str
- output_index: int
- call_id: str
- delta: str
- class pipecat.services.openai.realtime.events.ResponseFunctionCallArgumentsDone(*, event_id: str, type: Literal['response.function_call_arguments.done'], response_id: str, item_id: str, output_index: int, call_id: str, arguments: str)[source]
Bases:
ServerEventEvent indicating function call arguments are complete.
- Parameters:
type – Event type, always “response.function_call_arguments.done”.
response_id – ID of the response.
item_id – ID of the conversation item.
output_index – Index of the output item.
call_id – ID of the function call.
arguments – Complete function arguments as JSON string.
- type: Literal['response.function_call_arguments.done']
- response_id: str
- item_id: str
- output_index: int
- call_id: str
- arguments: str
- class pipecat.services.openai.realtime.events.InputAudioBufferSpeechStarted(*, event_id: str, type: Literal['input_audio_buffer.speech_started'], audio_start_ms: int, item_id: str)[source]
Bases:
ServerEventEvent indicating speech has started in the input audio buffer.
- Parameters:
type – Event type, always “input_audio_buffer.speech_started”.
audio_start_ms – Start time of speech in milliseconds.
item_id – ID of the associated conversation item.
- type: Literal['input_audio_buffer.speech_started']
- audio_start_ms: int
- item_id: str
- class pipecat.services.openai.realtime.events.InputAudioBufferSpeechStopped(*, event_id: str, type: Literal['input_audio_buffer.speech_stopped'], audio_end_ms: int, item_id: str)[source]
Bases:
ServerEventEvent indicating speech has stopped in the input audio buffer.
- Parameters:
type – Event type, always “input_audio_buffer.speech_stopped”.
audio_end_ms – End time of speech in milliseconds.
item_id – ID of the associated conversation item.
- type: Literal['input_audio_buffer.speech_stopped']
- audio_end_ms: int
- item_id: str
- class pipecat.services.openai.realtime.events.InputAudioBufferCommitted(*, event_id: str, type: Literal['input_audio_buffer.committed'], previous_item_id: str | None = None, item_id: str)[source]
Bases:
ServerEventEvent indicating the input audio buffer has been committed.
- Parameters:
type – Event type, always “input_audio_buffer.committed”.
previous_item_id – ID of the previous item, if any.
item_id – ID of the committed conversation item.
- type: Literal['input_audio_buffer.committed']
- previous_item_id: str | None
- item_id: str
- class pipecat.services.openai.realtime.events.InputAudioBufferCleared(*, event_id: str, type: Literal['input_audio_buffer.cleared'])[source]
Bases:
ServerEventEvent indicating the input audio buffer has been cleared.
- Parameters:
type – Event type, always “input_audio_buffer.cleared”.
- type: Literal['input_audio_buffer.cleared']
- class pipecat.services.openai.realtime.events.ErrorEvent(*, event_id: str, type: Literal['error'], error: RealtimeError)[source]
Bases:
ServerEventEvent indicating an error occurred.
- Parameters:
type – Event type, always “error”.
error – Error details.
- type: Literal['error']
- error: RealtimeError
- class pipecat.services.openai.realtime.events.RateLimitsUpdated(*, event_id: str, type: Literal['rate_limits.updated'], rate_limits: list[dict[str, Any]])[source]
Bases:
ServerEventEvent indicating rate limits have been updated.
- Parameters:
type – Event type, always “rate_limits.updated”.
rate_limits – List of rate limit information.
- type: Literal['rate_limits.updated']
- rate_limits: list[dict[str, Any]]
- class pipecat.services.openai.realtime.events.CachedTokensDetails(*, text_tokens: int | None = 0, audio_tokens: int | None = 0)[source]
Bases:
BaseModelDetails about cached tokens.
- Parameters:
text_tokens – Number of cached text tokens.
audio_tokens – Number of cached audio tokens.
- text_tokens: int | None
- audio_tokens: int | None
- class pipecat.services.openai.realtime.events.TokenDetails(*, cached_tokens: int | None = 0, text_tokens: int | None = 0, audio_tokens: int | None = 0, cached_tokens_details: CachedTokensDetails | None = None, image_tokens: int | None = 0, **extra_data: Any)[source]
Bases:
BaseModelDetailed token usage information.
- Parameters:
cached_tokens – Number of cached tokens used. Defaults to 0.
text_tokens – Number of text tokens used. Defaults to 0.
audio_tokens – Number of audio tokens used. Defaults to 0.
cached_tokens_details – Detailed breakdown of cached tokens.
image_tokens – Number of image tokens used (for input only).
- cached_tokens: int | None
- text_tokens: int | None
- audio_tokens: int | None
- cached_tokens_details: CachedTokensDetails | None
- image_tokens: int | None
- class pipecat.services.openai.realtime.events.Usage(*, total_tokens: int, input_tokens: int, output_tokens: int, input_token_details: TokenDetails, output_token_details: TokenDetails)[source]
Bases:
BaseModelToken usage statistics for a response.
- Parameters:
total_tokens – Total number of tokens used.
input_tokens – Number of input tokens used.
output_tokens – Number of output tokens used.
input_token_details – Detailed breakdown of input token usage.
output_token_details – Detailed breakdown of output token usage.
- total_tokens: int
- input_tokens: int
- output_tokens: int
- input_token_details: TokenDetails
- output_token_details: TokenDetails
- class pipecat.services.openai.realtime.events.Response(*, id: str, object: Literal['realtime.response'], status: Literal['completed', 'in_progress', 'incomplete', 'cancelled', 'failed'], status_details: Any, output: list[ConversationItem], output_modalities: list[Literal['text', 'audio']] | None = None, max_output_tokens: int | Literal['inf'] | None = None, audio: AudioConfiguration | None = None, usage: Usage | None = None, voice: str | None = None, temperature: float | None = None, output_audio_format: str | None = None)[source]
Bases:
BaseModelA complete assistant response.
- Parameters:
id – Unique identifier for the response.
object – Object type, always “realtime.response”.
status – Current status of the response.
status_details – Additional status information.
output – List of conversation items in the response.
conversation_id – Which conversation the response is added to.
output_modalities – The set of modalities the model used to respond.
max_output_tokens – Maximum number of output tokens used.
audio – Audio configuration for the response.
usage – Token usage statistics for the response.
voice – The voice the model used to respond.
temperature – Sampling temperature used for the response.
output_audio_format – The format of output audio.
- id: str
- object: Literal['realtime.response']
- status: Literal['completed', 'in_progress', 'incomplete', 'cancelled', 'failed']
- status_details: Any
- output: list[ConversationItem]
- output_modalities: list[Literal['text', 'audio']] | None
- max_output_tokens: int | Literal['inf'] | None
- audio: AudioConfiguration | None
- voice: str | None
- temperature: float | None
- output_audio_format: str | None
- pipecat.services.openai.realtime.events.parse_server_event(str)[source]
Parse a server event from JSON string.
- Parameters:
str – JSON string containing the server event.
- Returns:
Parsed server event object of the appropriate type.
- Raises:
Exception – If the event type is unimplemented or parsing fails.