video
HeyGen implementation for Pipecat.
This module provides integration with the HeyGen platform for creating conversational AI applications with avatars. It manages conversation sessions and provides real-time audio/video streaming capabilities through the HeyGen API.
- class pipecat.services.heygen.video.HeyGenVideoSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any]=<factory>)[source]
Bases: ServiceSettings
Settings for the HeyGen video service.
- class pipecat.services.heygen.video.HeyGenVideoService(*, api_key: str, session: ClientSession, session_request: LiveAvatarNewSessionRequest | NewSessionRequest | None = None, service_type: ServiceType | None = None, settings: HeyGenVideoSettings | None = None, **kwargs)[source]
Bases: AIService
A service that integrates HeyGen’s interactive avatar capabilities into the pipeline.
This service manages the lifecycle of a HeyGen avatar session by handling bidirectional audio/video streaming, avatar animations, and user interactions. It processes various frame types to coordinate the avatar’s behavior and maintains synchronization between audio and video streams.
The service supports:
Real-time avatar animation based on audio input
Voice activity detection for natural interactions
Interrupt handling for more natural conversations
Audio resampling for optimal quality
Automatic session management
- Parameters:
api_key (str) – HeyGen API key for authentication
session (aiohttp.ClientSession) – HTTP client session for API requests
session_request (NewSessionRequest, optional) – Configuration for the HeyGen session. Defaults to using the “Shawn_Therapist_public” avatar with “v2” version.
- Settings
alias of
HeyGenVideoSettings
- __init__(*, api_key: str, session: ClientSession, session_request: LiveAvatarNewSessionRequest | NewSessionRequest | None = None, service_type: ServiceType | None = None, settings: HeyGenVideoSettings | None = None, **kwargs) None[source]
Initialize the HeyGen video service.
- Parameters:
api_key – HeyGen API key for authentication
session – HTTP client session for API requests
session_request – Configuration for the HeyGen session
service_type – Service type for the avatar session
settings – Runtime-updatable settings. HeyGen has no model concept, so this is primarily used for the extra dict.
**kwargs – Additional arguments passed to parent AIService
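As an illustration of the constructor parameters above, a minimal sketch of creating the service inside an aiohttp session might look like the following. The helper names `heygen_service_kwargs` and `run_with_avatar`, the `run_pipeline` callback, and the `HEYGEN_API_KEY` environment variable are assumptions made for this example and are not part of the documented API. The third-party imports are deferred into the function so the snippet loads even when the packages are not installed.

```python
import os


def heygen_service_kwargs(session, api_key=None):
    """Collect the required keyword-only arguments documented above.

    HEYGEN_API_KEY is an assumed environment variable name for this sketch;
    the library is not known to read it on its own.
    """
    key = api_key or os.environ.get("HEYGEN_API_KEY")
    if not key:
        raise ValueError("a HeyGen API key is required")
    return {"api_key": key, "session": session}


async def run_with_avatar(run_pipeline):
    """Create the service and hand it to a caller-supplied coroutine.

    `run_pipeline` is a hypothetical callback that builds and runs the
    rest of the pipeline while the HTTP session is still open.
    """
    # Deferred imports: only needed when the function is actually awaited.
    import aiohttp
    from pipecat.services.heygen.video import HeyGenVideoService

    async with aiohttp.ClientSession() as session:
        # session_request, service_type and settings keep their None defaults.
        heygen = HeyGenVideoService(**heygen_service_kwargs(session))
        await run_pipeline(heygen)
```

Keeping the service inside the `async with` block ties its lifetime to the HTTP session it was given, which is one simple way to avoid using a closed session.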
- async setup(setup: FrameProcessorSetup)[source]
Set up the HeyGen video service with necessary configuration.
Initializes the HeyGen client, establishes connections, and prepares the service for audio/video processing. This includes setting up audio/video streams, configuring callbacks, and initializing the resampler.
- Parameters:
setup – Configuration parameters for the frame processor.
- async cleanup()[source]
Clean up the service and release resources.
Terminates the HeyGen client session and cleans up associated resources.
- async start(frame: StartFrame)[source]
Start the HeyGen video service and initialize the avatar session.
Creates necessary tasks for audio/video processing and establishes the connection with the HeyGen service.
- Parameters:
frame – The start frame containing initialization parameters.
- async stop(frame: EndFrame)[source]
Stop the HeyGen video service gracefully.
Performs cleanup by ending the conversation and cancelling ongoing tasks in a controlled manner.
- Parameters:
frame – The end frame.
- async cancel(frame: CancelFrame)[source]
Cancel the HeyGen video service.
Performs an immediate termination of the service, cleaning up resources without waiting for ongoing operations to complete.
- Parameters:
frame – The cancel frame.
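The difference between the lifecycle methods above (setup, start, then either stop or cancel, then cleanup) can be sketched with a stand-in class. This example does not use Pipecat: `FakeAvatarService`, its queue, and its worker task are hypothetical, and the call order in `run` is an assumption about how a pipeline runner might drive such a service. It only contrasts a graceful stop, which here lets queued work finish, with an immediate cancel, which does not.

```python
import asyncio


class FakeAvatarService:
    """Hypothetical stand-in for illustration; not the Pipecat class."""

    def __init__(self):
        self.events = []
        self._queue = asyncio.Queue()
        self._task = None

    async def setup(self):
        self.events.append("setup")

    async def start(self):
        # Create the background task that would stream audio to the avatar.
        self._task = asyncio.create_task(self._worker())
        self.events.append("start")

    async def _worker(self):
        while True:
            chunk = await self._queue.get()
            if chunk is None:  # sentinel: no more work
                break
            await asyncio.sleep(0)  # stands in for network I/O
            self.events.append(f"sent:{chunk}")

    async def send(self, chunk):
        await self._queue.put(chunk)

    async def stop(self):
        # Graceful: let the worker drain the queue, then wait for it.
        await self._queue.put(None)
        await self._task
        self.events.append("stop")

    async def cancel(self):
        # Immediate: cancel the worker without waiting for queued work.
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass
        self.events.append("cancel")

    async def cleanup(self):
        self.events.append("cleanup")


async def run(graceful):
    service = FakeAvatarService()
    await service.setup()
    await service.start()
    await service.send("a")
    await service.send("b")
    if graceful:
        await service.stop()
    else:
        await service.cancel()
    await service.cleanup()
    return service.events
```

Calling `asyncio.run(run(True))` and `asyncio.run(run(False))` and comparing the returned event lists shows which queued chunks were still delivered in each case.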
- async process_frame(frame: Frame, direction: FrameDirection)[source]
Process incoming frames and coordinate avatar behavior.
Handles different types of frames to manage avatar interactions:
UserStartedSpeakingFrame: Activates avatar’s listening animation
UserStoppedSpeakingFrame: Deactivates avatar’s listening state
TTSAudioRawFrame: Processes audio for avatar speech
Other frames: Forwards them through the pipeline
- Parameters:
frame – The frame to be processed.
direction – The direction of frame processing (input/output).
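The frame handling described for process_frame can be sketched as a small type-based dispatcher. The frame classes below are local stand-ins that only borrow the names listed above (real code would use Pipecat's own frame classes), and `AvatarDispatcher` and `TextFrame` are hypothetical. Whether the service also forwards the frames it handles is not stated here, so this sketch forwards only the unhandled ones.

```python
from dataclasses import dataclass


# Local stand-ins named after the documented frame types (assumption:
# simplified fields, no relation to the real class definitions).
@dataclass
class Frame:
    pass


@dataclass
class UserStartedSpeakingFrame(Frame):
    pass


@dataclass
class UserStoppedSpeakingFrame(Frame):
    pass


@dataclass
class TTSAudioRawFrame(Frame):
    audio: bytes = b""


@dataclass
class TextFrame(Frame):
    text: str = ""


class AvatarDispatcher:
    """Hypothetical dispatcher mirroring the four documented cases."""

    def __init__(self):
        self.listening = False
        self.audio_sent = bytearray()
        self.forwarded = []

    def process_frame(self, frame):
        if isinstance(frame, UserStartedSpeakingFrame):
            self.listening = True  # avatar shows its listening animation
        elif isinstance(frame, UserStoppedSpeakingFrame):
            self.listening = False  # avatar leaves the listening state
        elif isinstance(frame, TTSAudioRawFrame):
            self.audio_sent.extend(frame.audio)  # audio for avatar speech
        else:
            self.forwarded.append(frame)  # pass everything else along
```

A short usage run: create `AvatarDispatcher()`, feed it a `UserStartedSpeakingFrame()`, a `TTSAudioRawFrame(audio=...)`, and a `TextFrame(text=...)`, then inspect `listening`, `audio_sent`, and `forwarded`.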