client
HeyGen implementation for Pipecat.
This module provides integration with the HeyGen platform for creating conversational AI applications with avatars. It manages conversation sessions and provides real-time audio/video streaming capabilities through the HeyGen API.
- class pipecat.services.heygen.client.ServiceType(*values)[source]
Bases: Enum
Enum for HeyGen service types.
- INTERACTIVE_AVATAR = 'INTERACTIVE_AVATAR'
- LIVE_AVATAR = 'LIVE_AVATAR'
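Because each member's value equals its name, a service type can be resolved from a configuration string. The sketch below uses a local stand-in enum that mirrors the two documented members rather than importing the Pipecat class, and `parse_service_type` is a hypothetical helper, not part of the module.

```python
from enum import Enum


class ServiceType(Enum):
    """Local stand-in mirroring the documented members (not the Pipecat class)."""

    INTERACTIVE_AVATAR = "INTERACTIVE_AVATAR"
    LIVE_AVATAR = "LIVE_AVATAR"


def parse_service_type(raw: str) -> ServiceType:
    """Hypothetical helper: map a config string to a member, ignoring case."""
    try:
        # Enum lookup by value raises ValueError for unknown strings.
        return ServiceType(raw.strip().upper())
    except ValueError:
        valid = ", ".join(member.value for member in ServiceType)
        raise ValueError(
            f"unknown service type {raw!r}; expected one of: {valid}"
        ) from None


print(parse_service_type("live_avatar"))
```

Failing early with the list of valid values keeps a typo in configuration from surfacing later as a session error.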
- class pipecat.services.heygen.client.HeyGenCallbacks(*, on_connected: Callable[[], Awaitable[None]], on_participant_connected: Callable[[str], Awaitable[None]], on_participant_disconnected: Callable[[str], Awaitable[None]])[source]
Bases: BaseModel
Callback handlers for HeyGen events.
- Parameters:
on_connected – Called when the bot connects to the LiveKit room.
on_participant_connected – Called when a participant connects.
on_participant_disconnected – Called when a participant disconnects.
- on_connected: Callable[[], Awaitable[None]]
- on_participant_connected: Callable[[str], Awaitable[None]]
- on_participant_disconnected: Callable[[str], Awaitable[None]]
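All three handlers are coroutines: one takes no arguments and two take a single string. The sketch below defines handlers with those shapes and drives them with a hypothetical `simulate_session` function so it runs without Pipecat installed. It assumes the string argument identifies the participant, which this page does not state.

```python
import asyncio
from typing import Awaitable, Callable

# Recorded events, so the effect of each handler is easy to inspect.
events: list[str] = []


async def on_connected() -> None:
    events.append("connected")


async def on_participant_connected(participant_id: str) -> None:
    # Assumption: the string argument identifies the participant.
    events.append(f"joined:{participant_id}")


async def on_participant_disconnected(participant_id: str) -> None:
    events.append(f"left:{participant_id}")


# Keyword arguments in the shape the HeyGenCallbacks signature documents.
callback_kwargs: dict[str, Callable[..., Awaitable[None]]] = {
    "on_connected": on_connected,
    "on_participant_connected": on_participant_connected,
    "on_participant_disconnected": on_participant_disconnected,
}


async def simulate_session() -> None:
    """Hypothetical driver that invokes the handlers in a plausible order."""
    await callback_kwargs["on_connected"]()
    await callback_kwargs["on_participant_connected"]("user-1")
    await callback_kwargs["on_participant_disconnected"]("user-1")


asyncio.run(simulate_session())
print(events)
```

With Pipecat available, the same mapping matches the keyword-only signature shown above, so it could be passed as `HeyGenCallbacks(**callback_kwargs)`.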
- class pipecat.services.heygen.client.HeyGenClient(*, api_key: str, session: ClientSession, params: TransportParams, session_request: LiveAvatarNewSessionRequest | NewSessionRequest | None = None, service_type: ServiceType | None = None, callbacks: HeyGenCallbacks, connect_as_user: bool = False)[source]
Bases: object
A client for interacting with HeyGen’s Interactive Avatar Realtime API.
This client manages both WebSocket and LiveKit connections for real-time avatar streaming, handling bi-directional audio/video communication and avatar control. It implements the API defined in https://docs.heygen.com/docs/interactive-avatar-realtime-api
The client manages the following connections:
1. WebSocket connection for avatar control and audio streaming
2. LiveKit connection for receiving avatar video and audio
- Parameters:
HEY_GEN_SAMPLE_RATE (int) – The required sample rate for HeyGen’s audio processing (24000 Hz)
- __init__(*, api_key: str, session: ClientSession, params: TransportParams, session_request: LiveAvatarNewSessionRequest | NewSessionRequest | None = None, service_type: ServiceType | None = None, callbacks: HeyGenCallbacks, connect_as_user: bool = False) None[source]
Initialize the HeyGen client.
- Parameters:
api_key – HeyGen API key for authentication
session – HTTP client session for API requests
params – Transport configuration parameters
session_request – Configuration for the HeyGen session (optional)
service_type – Type of service to use
callbacks – Callback handlers for HeyGen events
connect_as_user – Whether to connect using the user token (default: False)
- async setup(setup: FrameProcessorSetup) None[source]
Set up the client and initialize the conversation.
Establishes a new session with HeyGen’s API if one doesn’t exist.
- Parameters:
setup – The frame processor setup configuration.
- async cleanup() None[source]
Clean up client resources.
Closes the active HeyGen session and resets internal state.
- async start(frame: StartFrame) None[source]
Start the client and establish all necessary connections.
Initializes WebSocket and LiveKit connections using the provided configuration. Sets up audio processing with the specified sample rates.
- Parameters:
frame – Initial configuration frame containing audio parameters
- async stop() None[source]
Stop the client and terminate all connections.
Disconnects from WebSocket and LiveKit endpoints, and performs cleanup.
- async interrupt(event_id: str) None[source]
Interrupt the avatar’s current action.
Stops the current animation/speech and returns the avatar to idle state. Useful for handling user interruptions during avatar speech.
- async start_agent_listening() None[source]
Start the avatar’s listening animation.
Triggers visual cues indicating the avatar is listening to user input.
- async stop_agent_listening() None[source]
Stop the avatar’s listening animation.
Returns the avatar to idle state from listening state.
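Since the listening animation is started and stopped by a pair of calls, one way to keep them balanced is an async context manager. The sketch below is an assumption about usage, not something the module provides: `listening` is a hypothetical helper, and `FakeClient` is an in-memory stand-in that only mimics the two documented method names.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeClient:
    """Hypothetical stand-in that records calls instead of contacting HeyGen."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def start_agent_listening(self) -> None:
        self.calls.append("start_agent_listening")

    async def stop_agent_listening(self) -> None:
        self.calls.append("stop_agent_listening")


@asynccontextmanager
async def listening(client):
    """Hypothetical helper: show the listening animation for the block's duration."""
    await client.start_agent_listening()
    try:
        yield
    finally:
        # Runs even if the body raises, so the avatar is not left "listening".
        await client.stop_agent_listening()


async def main() -> list[str]:
    client = FakeClient()
    try:
        async with listening(client):
            raise RuntimeError("simulated failure while the user is talking")
    except RuntimeError:
        pass
    return client.calls


calls = asyncio.run(main())
print(calls)
```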
- transport_ready() None[source]
Indicate that the output transport is ready to receive frames.
- property out_sample_rate: int
Get the output sample rate.
- Returns:
The output sample rate in Hz.
- property in_sample_rate: int
Get the input sample rate.
- Returns:
The input sample rate in Hz.
- async agent_speak(audio: bytes, event_id: str) None[source]
Send audio data for the agent to speak.
- Parameters:
audio – Audio data as raw bytes (will be base64 encoded)
event_id – Unique identifier for the event
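To make the raw-bytes and base64 note concrete, the sketch below works out buffer sizes at the documented 24000 Hz rate and encodes one chunk. It assumes 16-bit mono PCM and a 20 ms chunk length, neither of which is stated on this page, and `bytes_for_duration`, `chunk_pcm`, and `encode_chunk` are hypothetical helpers rather than client methods.

```python
import base64

HEY_GEN_SAMPLE_RATE = 24000  # Hz, as documented for HeyGenClient
BYTES_PER_SAMPLE = 2         # assumption: 16-bit PCM
CHANNELS = 1                 # assumption: mono


def bytes_for_duration(ms: int) -> int:
    """Number of PCM bytes covering `ms` milliseconds of audio."""
    return HEY_GEN_SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS * ms // 1000


def chunk_pcm(audio: bytes, chunk_ms: int = 20) -> list[bytes]:
    """Split raw PCM into fixed-duration chunks (the last one may be shorter)."""
    size = bytes_for_duration(chunk_ms)
    return [audio[i:i + size] for i in range(0, len(audio), size)]


def encode_chunk(chunk: bytes) -> str:
    """Base64-encode raw bytes into an ASCII string."""
    return base64.b64encode(chunk).decode("ascii")


one_second = bytes(bytes_for_duration(1000))  # one second of silence
chunks = chunk_pcm(one_second)
encoded = encode_chunk(chunks[0])
print(len(one_second), len(chunks), len(chunks[0]), len(encoded))
```

Under these assumptions one second of audio is 48000 bytes, which splits into 50 chunks of 960 bytes, and each chunk grows to 1280 characters once base64-encoded.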
- async agent_speak_end(event_id: str) None[source]
Signal that the agent has finished speaking.
- Parameters:
event_id – Unique identifier for the event
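Both `agent_speak` and `agent_speak_end` take an `event_id`, which suggests sending the chunks of one utterance under a shared identifier and then closing it. The sketch below exercises that pattern against a hypothetical `FakeClient` that mimics the documented signatures and only records calls. Generating the identifier with `uuid4` is an assumption, as is the `speak_utterance` helper.

```python
import asyncio
import uuid


class FakeClient:
    """Hypothetical stand-in that records calls instead of contacting HeyGen."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    async def agent_speak(self, audio: bytes, event_id: str) -> None:
        self.calls.append(("agent_speak", event_id))

    async def agent_speak_end(self, event_id: str) -> None:
        self.calls.append(("agent_speak_end", event_id))


async def speak_utterance(client, chunks: list[bytes]) -> str:
    """Send every chunk under one event_id, then mark the utterance finished."""
    event_id = str(uuid.uuid4())  # assumption: any unique string is acceptable
    for chunk in chunks:
        await client.agent_speak(chunk, event_id)
    await client.agent_speak_end(event_id)
    return event_id


client = FakeClient()
event_id = asyncio.run(speak_utterance(client, [b"\x00" * 960, b"\x00" * 960]))
print([name for name, _ in client.calls])
```

Keeping the identifier inside one helper makes it hard to end an utterance with a different `event_id` than the one its audio was sent under.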