llm
Inworld Realtime LLM service implementation with WebSocket support.
Based on Inworld’s Realtime API documentation: https://docs.inworld.ai/api-reference/realtimeAPI/realtime/realtime-websocket
- class pipecat.services.inworld.realtime.llm.CurrentAudioResponse(item_id: str, content_index: int, start_time_ms: int, total_size: int = 0)[source]
Bases: object
Tracks the current audio response from the assistant.
- Parameters:
item_id – Unique identifier for the audio response item.
content_index – Index of the audio content within the item.
start_time_ms – Timestamp when the audio response started in milliseconds.
total_size – Total size of audio data received in bytes. Defaults to 0.
- item_id: str
- content_index: int
- start_time_ms: int
- total_size: int = 0
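The fields above suggest a simple bookkeeping pattern: add each received chunk's length to total_size, then convert bytes to a duration. The following is a minimal sketch using a local stand-in dataclass rather than the pipecat class. The helper names record_chunk and received_duration_ms are hypothetical, and the audio format (24 kHz, 16-bit, mono PCM) is an assumption made for the arithmetic.

```python
from dataclasses import dataclass

# Assumed audio format for this sketch only: 24 kHz, 16-bit, mono PCM.
SAMPLE_RATE_HZ = 24000
BYTES_PER_SAMPLE = 2


@dataclass
class CurrentAudioResponse:
    """Local stand-in that mirrors the documented fields."""

    item_id: str
    content_index: int
    start_time_ms: int
    total_size: int = 0


def record_chunk(response: CurrentAudioResponse, chunk: bytes) -> None:
    """Hypothetical helper: accumulate the number of audio bytes received."""
    response.total_size += len(chunk)


def received_duration_ms(response: CurrentAudioResponse) -> int:
    """Hypothetical helper: bytes received, expressed as milliseconds of audio."""
    samples = response.total_size // BYTES_PER_SAMPLE
    return samples * 1000 // SAMPLE_RATE_HZ


response = CurrentAudioResponse(item_id="item_1", content_index=0, start_time_ms=1000)
record_chunk(response, b"\x00" * 4800)  # 2400 samples, i.e. 100 ms at 24 kHz
record_chunk(response, b"\x00" * 4800)
print(response.total_size, received_duration_ms(response))
```

A duration derived this way could be compared against elapsed playback time when a response is interrupted, although how the service itself uses these fields is not stated here.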
- class pipecat.services.inworld.realtime.llm.InworldRealtimeLLMSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any]=<factory>, system_instruction: str | None | _NotGiven = <factory>, temperature: float | None | _NotGiven = <factory>, max_tokens: int | None | _NotGiven = <factory>, top_p: float | None | _NotGiven = <factory>, top_k: int | None | _NotGiven = <factory>, frequency_penalty: float | None | _NotGiven = <factory>, presence_penalty: float | None | _NotGiven = <factory>, seed: int | None | _NotGiven = <factory>, filter_incomplete_user_turns: bool | None | _NotGiven = <factory>, user_turn_completion_config: UserTurnCompletionConfig | None | _NotGiven = <factory>, session_properties: SessionProperties | _NotGiven = <factory>)[source]
Bases: LLMSettings
Settings for InworldRealtimeLLMService.
- Parameters:
session_properties – Inworld Realtime session properties (audio config, tools, etc.).
model and instructions are synced bidirectionally with the top-level model and system_instruction fields.
- session_properties: SessionProperties | _NotGiven
- apply_update(delta: InworldRealtimeLLMSettings) dict[str, Any][source]
Merge a delta, keeping model/system_instruction in sync with session_properties (SP).
When the delta contains session_properties, it replaces the stored SP wholesale (matching legacy behaviour). Top-level field values always take precedence over conflicting SP values.
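The merge rules can be illustrated with plain dictionaries. This is a sketch of the documented semantics, not the library's implementation: the function below is a stand-in that works on dicts, and returning the changed fields is an assumption about what the dict[str, Any] return value holds.

```python
from typing import Any

# Top-level settings field -> matching session_properties key.
SYNCED = {"model": "model", "system_instruction": "instructions"}


def apply_update(stored: dict[str, Any], delta: dict[str, Any]) -> dict[str, Any]:
    """Merge delta into stored in place; return the fields whose values changed."""
    before = {k: (dict(v) if isinstance(v, dict) else v) for k, v in stored.items()}

    # Rule 1: session_properties in the delta replaces the stored SP wholesale.
    if "session_properties" in delta:
        stored["session_properties"] = dict(delta["session_properties"])
        # Sync SP -> top level for the mirrored fields.
        for top, sp_key in SYNCED.items():
            if sp_key in stored["session_properties"]:
                stored[top] = stored["session_properties"][sp_key]

    # Rule 2: top-level values in the delta win over conflicting SP values.
    sp = stored.setdefault("session_properties", {})
    for key, value in delta.items():
        if key == "session_properties":
            continue
        stored[key] = value
        if key in SYNCED:
            sp[SYNCED[key]] = value  # Sync top level -> SP.

    return {k: v for k, v in stored.items() if before.get(k) != v}


stored = {
    "model": "a",
    "system_instruction": "hi",
    "session_properties": {"model": "a", "instructions": "hi", "temperature": 0.5},
}
delta = {"model": "b", "session_properties": {"model": "c", "instructions": "new"}}
changed = apply_update(stored, delta)
print(stored)
```

In this run the old temperature entry disappears because the SP was replaced as a whole, and the top-level model "b" overrides the "c" carried inside the delta's session_properties.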
- classmethod from_mapping(settings: Mapping[str, Any]) InworldRealtimeLLMSettings[source]
Build a delta from a plain dict, routing SP keys into session_properties.
Keys that correspond to SessionProperties fields are collected into a nested session_properties value. model is always routed to the top-level field. Unknown keys go to extra.
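The routing described above can be sketched with plain dictionaries. The two field sets below are small hypothetical subsets chosen for illustration, and the function is a stand-in that returns a dict rather than an InworldRealtimeLLMSettings instance.

```python
from typing import Any, Mapping

# Hypothetical subsets of the real field names, for illustration only.
TOP_LEVEL_FIELDS = {"model", "system_instruction", "max_tokens"}
SESSION_PROPERTY_FIELDS = {"model", "instructions", "audio", "tools"}


def from_mapping(settings: Mapping[str, Any]) -> dict[str, Any]:
    """Route each key to the top level, session_properties, or extra."""
    delta: dict[str, Any] = {}
    session_properties: dict[str, Any] = {}
    extra: dict[str, Any] = {}

    for key, value in settings.items():
        if key == "model":
            delta[key] = value  # model always goes to the top-level field.
        elif key in SESSION_PROPERTY_FIELDS:
            session_properties[key] = value
        elif key in TOP_LEVEL_FIELDS:
            delta[key] = value
        else:
            extra[key] = value  # Unknown keys are kept, not dropped.

    if session_properties:
        delta["session_properties"] = session_properties
    if extra:
        delta["extra"] = extra
    return delta


routed = from_mapping(
    {
        "model": "openai/gpt-4.1-nano",
        "audio": {"output": {"voice": "Sarah"}},
        "max_tokens": 100,
        "foo": "bar",
    }
)
print(routed)
```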
- class pipecat.services.inworld.realtime.llm.InworldRealtimeLLMService(*, api_key: str, llm_model: str | None = None, voice: str | None = None, tts_model: str | None = None, stt_model: str | None = None, base_url: str = 'wss://api.inworld.ai/api/v1/realtime/session', auth_type: Literal['basic', 'bearer'] = 'basic', settings: InworldRealtimeLLMSettings | None = None, start_audio_paused: bool = False, **kwargs)[source]
Bases: LLMService
Inworld Realtime LLM service for real-time audio and text communication.
Implements the Inworld Realtime API with WebSocket communication for low-latency bidirectional audio and text interactions. The API operates as a cascade STT/LLM/TTS pipeline under the hood, with built-in semantic voice activity detection (VAD) for turn management.
Supports function calling, conversation management, and real-time transcription.
Example:
    llm = InworldRealtimeLLMService(
        api_key=os.getenv("INWORLD_API_KEY"),
        llm_model="openai/gpt-4.1-nano",
        voice="Sarah",
        tts_model="inworld-tts-1.5-max",
    )
For full control over session properties (note: session_properties replaces all defaults, so provide a complete config):
    from pipecat.services.inworld.realtime.events import *

    llm = InworldRealtimeLLMService(
        api_key=os.getenv("INWORLD_API_KEY"),
        settings=InworldRealtimeLLMService.Settings(
            session_properties=SessionProperties(
                model="openai/gpt-4.1-nano",
                temperature=0.7,
                audio=AudioConfiguration(
                    input=AudioInput(
                        format=PCMAudioFormat(rate=24000),
                        turn_detection=TurnDetection(
                            type="semantic_vad",
                            eagerness="low",
                        ),
                    ),
                    output=AudioOutput(
                        format=PCMAudioFormat(rate=24000),
                        voice="Sarah",
                        model="inworld-tts-1.5-max",
                    ),
                ),
            ),
        ),
    )
- Settings
alias of InworldRealtimeLLMSettings
- adapter_class
alias of InworldRealtimeLLMAdapter
- __init__(*, api_key: str, llm_model: str | None = None, voice: str | None = None, tts_model: str | None = None, stt_model: str | None = None, base_url: str = 'wss://api.inworld.ai/api/v1/realtime/session', auth_type: Literal['basic', 'bearer'] = 'basic', settings: InworldRealtimeLLMSettings | None = None, start_audio_paused: bool = False, **kwargs)[source]
Initialize the Inworld Realtime LLM service.
- Parameters:
api_key – Inworld API key for authentication.
llm_model – LLM model to use (e.g. “openai/gpt-4.1-nano”). Shorthand for session_properties.model.
voice – Voice ID for TTS output (e.g. “Sarah”, “Clive”). Shorthand for session_properties.audio.output.voice.
tts_model – TTS model to use (e.g. “inworld-tts-1.5-max”). Shorthand for session_properties.audio.output.model.
stt_model – STT model for input transcription (e.g. “assemblyai/universal-streaming-multilingual”). Shorthand for session_properties.audio.input.transcription.model.
base_url – WebSocket base URL for the realtime API.
auth_type – Authentication type. "basic" for server-side API key auth, "bearer" for client-side JWT auth.
settings – Full settings for fine-grained control. When session_properties is provided in settings, it replaces all defaults wholesale; provide a complete SessionProperties in that case.
start_audio_paused – Whether to start with audio input paused.
**kwargs – Additional arguments passed to parent LLMService.
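The shorthand arguments each name a dotted path inside session_properties. The sketch below shows how those paths could expand into a nested structure. The helper expand_shorthands and its plain-dict output are hypothetical; the real service builds SessionProperties objects, and its internal logic is not shown in this reference.

```python
from typing import Any, Optional

# Shorthand argument -> path inside session_properties, as documented above.
SHORTHAND_PATHS = {
    "llm_model": ("model",),
    "voice": ("audio", "output", "voice"),
    "tts_model": ("audio", "output", "model"),
    "stt_model": ("audio", "input", "transcription", "model"),
}


def expand_shorthands(**shorthands: Optional[str]) -> dict[str, Any]:
    """Hypothetical helper: place each non-None shorthand at its nested path."""
    session_properties: dict[str, Any] = {}
    for name, value in shorthands.items():
        if value is None:
            continue  # Unset shorthands leave the structure untouched.
        path = SHORTHAND_PATHS[name]
        node = session_properties
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return session_properties


expanded = expand_shorthands(
    llm_model="openai/gpt-4.1-nano",
    voice="Sarah",
    tts_model="inworld-tts-1.5-max",
    stt_model=None,
)
print(expanded)
```

Note that voice and tts_model land under the same audio.output node, which is why the two shorthands can be combined without one overwriting the other.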
- set_audio_input_paused(paused: bool)[source]
Set whether audio input is paused.
- Parameters:
paused – True to pause audio input, False to resume.
- async start(frame: StartFrame)[source]
Start the service and establish WebSocket connection.
- async cancel(frame: CancelFrame)[source]
Cancel the service and close WebSocket connection.
- async process_frame(frame: Frame, direction: FrameDirection)[source]
Process incoming frames from the pipeline.
- async send_client_event(event: ClientEvent)[source]
Send a client event to the Inworld Realtime API.
- Parameters:
event – The client event to send.