transport

HeyGen implementation for Pipecat.

This module provides integration with the HeyGen platform for creating conversational AI applications with avatars. It manages conversation sessions and provides real-time audio/video streaming capabilities through the HeyGen API.

The module consists of three main components:

  • HeyGenInputTransport: Handles incoming audio and events from HeyGen conversations

  • HeyGenOutputTransport: Manages outgoing audio and events to HeyGen conversations

  • HeyGenTransport: Main transport implementation that coordinates input/output transports

class pipecat.transports.heygen.transport.HeyGenInputTransport(client: HeyGenClient, params: TransportParams, **kwargs)[source]

Bases: BaseInputTransport

Input transport for receiving audio and events from HeyGen conversations.

Handles incoming audio streams from participants and manages audio capture from the Daily room connected to the HeyGen conversation.

__init__(client: HeyGenClient, params: TransportParams, **kwargs)[source]

Initialize the HeyGen input transport.

Parameters:
  • client – The HeyGen transport client instance.

  • params – Transport configuration parameters.

  • **kwargs – Additional arguments passed to parent class.

async setup(setup: FrameProcessorSetup)[source]

Set up the input transport.

Parameters:

setup – The frame processor setup configuration.

async cleanup()[source]

Clean up input transport resources.

async start(frame: StartFrame)[source]

Start the input transport.

Parameters:

frame – The start frame containing initialization parameters.

async stop(frame: EndFrame)[source]

Stop the input transport.

Parameters:

frame – The end frame signaling transport shutdown.

async cancel(frame: CancelFrame)[source]

Cancel the input transport.

Parameters:

frame – The cancel frame signaling immediate cancellation.

async start_capturing_audio(participant_id: str)[source]

Start capturing audio from a participant.

Parameters:

participant_id – The participant to capture audio from.

class pipecat.transports.heygen.transport.HeyGenOutputTransport(client: HeyGenClient, params: TransportParams, **kwargs)[source]

Bases: BaseOutputTransport

Output transport for sending audio and events to HeyGen conversations.

Handles outgoing audio streams to participants and manages the custom audio track expected by the HeyGen platform.

__init__(client: HeyGenClient, params: TransportParams, **kwargs)[source]

Initialize the HeyGen output transport.

Parameters:
  • client – The HeyGen transport client instance.

  • params – Transport configuration parameters.

  • **kwargs – Additional arguments passed to parent class.

async setup(setup: FrameProcessorSetup)[source]

Set up the output transport.

Parameters:

setup – The frame processor setup configuration.

async cleanup()[source]

Clean up output transport resources.

async start(frame: StartFrame)[source]

Start the output transport.

Parameters:

frame – The start frame containing initialization parameters.

async stop(frame: EndFrame)[source]

Stop the output transport.

Parameters:

frame – The end frame signaling transport shutdown.

async cancel(frame: CancelFrame)[source]

Cancel the output transport.

Parameters:

frame – The cancel frame signaling immediate cancellation.

async push_frame(frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM)[source]

Push a frame to the next processor in the pipeline.

Parameters:
  • frame – The frame to push.

  • direction – The direction to push the frame.

async process_frame(frame: Frame, direction: FrameDirection)[source]

Process frames and handle interruptions.

Handles various types of frames including interruption events and user speaking states. Updates the HeyGen client state based on the received frames.

Parameters:
  • frame – The frame to process

  • direction – The direction of frame flow in the pipeline

Note

Special handling is implemented for:

  • InterruptionFrame: Triggers interruption of current speech

  • UserStartedSpeakingFrame: Initiates agent listening mode

  • UserStoppedSpeakingFrame: Stops agent listening mode
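The special handling described in the note can be pictured as a type-based dispatch. The sketch below is illustrative only: the frame classes are stand-ins for the Pipecat types of the same name, and `FakeClient` with its `interrupt`, `start_agent_listening`, and `stop_agent_listening` methods is hypothetical, not the HeyGenClient API.

```python
import asyncio
from dataclasses import dataclass, field


# Stand-in frame types (hypothetical); the real ones are defined by Pipecat.
class Frame: ...
class InterruptionFrame(Frame): ...
class UserStartedSpeakingFrame(Frame): ...
class UserStoppedSpeakingFrame(Frame): ...


@dataclass
class FakeClient:
    """Hypothetical client that only records the calls it receives."""
    calls: list = field(default_factory=list)

    async def interrupt(self):
        self.calls.append("interrupt")

    async def start_agent_listening(self):
        self.calls.append("start_listening")

    async def stop_agent_listening(self):
        self.calls.append("stop_listening")


async def handle_frame(client: FakeClient, frame: Frame) -> None:
    """Route the three specially handled frame types to client actions."""
    if isinstance(frame, InterruptionFrame):
        await client.interrupt()
    elif isinstance(frame, UserStartedSpeakingFrame):
        await client.start_agent_listening()
    elif isinstance(frame, UserStoppedSpeakingFrame):
        await client.stop_agent_listening()
    # Any other frame type needs no special handling in this sketch.


async def demo() -> list:
    client = FakeClient()
    frames = (UserStartedSpeakingFrame(), UserStoppedSpeakingFrame(),
              InterruptionFrame(), Frame())
    for frame in frames:
        await handle_frame(client, frame)
    return client.calls


calls = asyncio.run(demo())
```

In this sketch a plain `Frame` triggers no client call, which mirrors the idea that only the three listed frame types receive special treatment.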

async write_audio_frame(frame: OutputAudioRawFrame) → bool[source]

Write an audio frame to the HeyGen transport.

Resamples the audio to 24 kHz, if needed, before sending.

Parameters:

frame – The audio frame to write.
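To make the resampling step concrete, here is a minimal sketch that converts mono 16-bit PCM to 24 kHz by linear interpolation, using only the standard library. The function name `resample_pcm16` and the interpolation method are assumptions chosen for illustration; Pipecat's own resampler is not shown here and may work differently (a production resampler would normally also apply low-pass filtering).

```python
import array

TARGET_RATE = 24_000  # Hz, the rate named in the description above


def resample_pcm16(pcm: bytes, src_rate: int, dst_rate: int = TARGET_RATE) -> bytes:
    """Resample mono 16-bit PCM (native byte order) by linear interpolation."""
    if src_rate == dst_rate or not pcm:
        return pcm  # already at the target rate, or nothing to convert

    src = array.array("h")
    src.frombytes(pcm)

    dst_len = len(src) * dst_rate // src_rate
    dst = array.array("h", bytes(2 * dst_len))  # zero-filled output buffer

    for i in range(dst_len):
        pos = i * src_rate / dst_rate          # position in the source signal
        left = int(pos)
        right = min(left + 1, len(src) - 1)    # clamp at the last sample
        frac = pos - left
        dst[i] = round(src[left] * (1 - frac) + src[right] * frac)

    return dst.tobytes()
```

For example, a 10 ms chunk at 16 kHz (160 samples) becomes 240 samples at 24 kHz, while input that is already at 24 kHz is returned unchanged.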

class pipecat.transports.heygen.transport.HeyGenParams(*, audio_out_enabled: bool = True, audio_out_sample_rate: int | None = None, audio_out_channels: int = 1, audio_out_bitrate: int = 96000, audio_out_10ms_chunks: int = 4, audio_out_mixer: Mapping[str | None, ~pipecat.audio.mixers.base_audio_mixer.BaseAudioMixer] | None=None, audio_out_destinations: list[str] = <factory>, audio_out_end_silence_secs: int = 2, audio_out_auto_silence: bool = True, audio_in_enabled: bool = True, audio_in_sample_rate: int | None = None, audio_in_channels: int = 1, audio_in_filter: BaseAudioFilter | None = None, audio_in_stream_on_start: bool = True, audio_in_passthrough: bool = True, video_in_enabled: bool = False, video_out_enabled: bool = False, video_out_is_live: bool = False, video_out_width: int = 1024, video_out_height: int = 768, video_out_bitrate: int | None = None, video_out_framerate: int = 30, video_out_color_format: str = 'RGB', video_out_codec: str | None = None, video_out_destinations: list[str] = <factory>)[source]

Bases: TransportParams

Configuration parameters for the HeyGen transport.

Parameters:
  • audio_in_enabled – Whether to enable audio input from participants.

  • audio_out_enabled – Whether to enable audio output to participants.

audio_in_enabled: bool

audio_out_enabled: bool

class pipecat.transports.heygen.transport.HeyGenTransport(session: ClientSession, api_key: str, params: HeyGenParams = HeyGenParams(audio_out_enabled=True, audio_out_sample_rate=None, audio_out_channels=1, audio_out_bitrate=96000, audio_out_10ms_chunks=4, audio_out_mixer=None, audio_out_destinations=[], audio_out_end_silence_secs=2, audio_out_auto_silence=True, audio_in_enabled=True, audio_in_sample_rate=None, audio_in_channels=1, audio_in_filter=None, audio_in_stream_on_start=True, audio_in_passthrough=True, video_in_enabled=False, video_out_enabled=False, video_out_is_live=False, video_out_width=1024, video_out_height=768, video_out_bitrate=None, video_out_framerate=30, video_out_color_format='RGB', video_out_codec=None, video_out_destinations=[]), input_name: str | None = None, output_name: str | None = None, session_request: LiveAvatarNewSessionRequest | NewSessionRequest | None = None, service_type: ServiceType | None = None)[source]

Bases: BaseTransport

Transport implementation for HeyGen video calls.

When this transport is used, the Pipecat bot joins the same virtual room as the HeyGen Avatar and the user. HeyGenTransport initiates the conversation via HeyGenApi and obtains a room URL that all participants connect to.

Event handlers available:

  • on_client_connected(transport, participant): Participant connected to the session

  • on_client_disconnected(transport, participant): Participant disconnected from the session

Example:

@transport.event_handler("on_client_connected")
async def on_client_connected(transport, participant):
    ...
__init__(session: ClientSession, api_key: str, params: HeyGenParams = HeyGenParams(audio_out_enabled=True, audio_out_sample_rate=None, audio_out_channels=1, audio_out_bitrate=96000, audio_out_10ms_chunks=4, audio_out_mixer=None, audio_out_destinations=[], audio_out_end_silence_secs=2, audio_out_auto_silence=True, audio_in_enabled=True, audio_in_sample_rate=None, audio_in_channels=1, audio_in_filter=None, audio_in_stream_on_start=True, audio_in_passthrough=True, video_in_enabled=False, video_out_enabled=False, video_out_is_live=False, video_out_width=1024, video_out_height=768, video_out_bitrate=None, video_out_framerate=30, video_out_color_format='RGB', video_out_codec=None, video_out_destinations=[]), input_name: str | None = None, output_name: str | None = None, session_request: LiveAvatarNewSessionRequest | NewSessionRequest | None = None, service_type: ServiceType | None = None)[source]

Initialize the HeyGen transport.

Sets up a new HeyGen transport instance with the specified configuration for handling video calls between the Pipecat bot and HeyGen Avatar.

Parameters:
  • session – aiohttp session for making async HTTP requests

  • api_key – HeyGen API key for authentication

  • params – HeyGen-specific configuration parameters (default: HeyGenParams())

  • input_name – Optional custom name for the input transport

  • output_name – Optional custom name for the output transport

  • session_request – Configuration for the HeyGen session

  • service_type – Service type for the avatar session

Note

The transport will automatically join the same virtual room as the HeyGen Avatar and user through the HeyGenClient, which handles session initialization via HeyGenApi.
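A minimal construction sketch, based only on the signature documented above, might look as follows. The environment variable name `HEYGEN_API_KEY` and the helper names `build_heygen_transport` and `main` are assumptions for illustration, and pipeline construction is deliberately left out. Imports of `aiohttp` and Pipecat are deferred into the functions so the sketch can be loaded without those packages installed.

```python
import os


def build_heygen_transport(session, api_key: str):
    """Create a HeyGenTransport and attach a connection handler (sketch)."""
    # Import path taken from the class names documented on this page.
    from pipecat.transports.heygen.transport import HeyGenParams, HeyGenTransport

    transport = HeyGenTransport(
        session=session,
        api_key=api_key,
        params=HeyGenParams(audio_in_enabled=True, audio_out_enabled=True),
    )

    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, participant):
        print(f"Participant connected: {participant}")

    return transport


async def main():
    import aiohttp

    # The aiohttp session must stay open for as long as the transport is in use.
    async with aiohttp.ClientSession() as session:
        # HEYGEN_API_KEY is an assumed variable name, not mandated by the API.
        transport = build_heygen_transport(session, os.environ["HEYGEN_API_KEY"])
        # A real bot would build and run its pipeline here, typically placing
        # transport.input() at the start and transport.output() at the end.
```

The entry point would be run with `asyncio.run(main())`. Keeping the session as an argument to `build_heygen_transport` leaves its lifetime under the caller's control.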

input() → FrameProcessor[source]

Get the input transport for receiving media and events.

Returns:

The HeyGen input transport instance.

output() → FrameProcessor[source]

Get the output transport for sending media and events.

Returns:

The HeyGen output transport instance.