tts

Deepgram text-to-speech service for AWS SageMaker.

This module provides a Pipecat TTS service that connects to Deepgram models deployed on AWS SageMaker endpoints. It uses HTTP/2 bidirectional streaming for low-latency, real-time speech synthesis and supports interruptions and streaming audio output.

class pipecat.services.deepgram.sagemaker.tts.DeepgramSageMakerTTSSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any] = <factory>, voice: str | None | _NotGiven = <factory>, language: Language | str | None | _NotGiven = <factory>)[source]

Bases: TTSSettings

Settings for DeepgramSageMakerTTSService.

class pipecat.services.deepgram.sagemaker.tts.DeepgramSageMakerTTSService(*, endpoint_name: str, region: str, voice: str | None = None, sample_rate: int | None = None, encoding: str = 'linear16', settings: DeepgramSageMakerTTSSettings | None = None, **kwargs)[source]

Bases: TTSService

Deepgram text-to-speech service for AWS SageMaker.

Provides real-time speech synthesis using Deepgram models deployed on AWS SageMaker endpoints. It uses HTTP/2 bidirectional streaming for low-latency audio generation and supports interruptions via the Clear message.

Requirements:

Example:

tts = DeepgramSageMakerTTSService(
    endpoint_name="my-deepgram-tts-endpoint",
    region="us-east-2",
    settings=DeepgramSageMakerTTSService.Settings(
        voice="aura-2-helena-en",
    )
)

Settings

alias of DeepgramSageMakerTTSSettings

__init__(*, endpoint_name: str, region: str, voice: str | None = None, sample_rate: int | None = None, encoding: str = 'linear16', settings: DeepgramSageMakerTTSSettings | None = None, **kwargs)[source]

Initialize the Deepgram SageMaker TTS service.

Parameters:
  • endpoint_name – Name of the SageMaker endpoint on which the Deepgram TTS model is deployed (e.g., “my-deepgram-tts-endpoint”).

  • region – AWS region where the endpoint is deployed (e.g., “us-east-2”).

  • voice

    Voice model to use for synthesis. Defaults to “aura-2-helena-en”.

    Deprecated since version 0.0.105: Use settings=DeepgramSageMakerTTSService.Settings(voice=...) instead.

  • sample_rate – Audio sample rate in Hz. If None, uses the value from StartFrame.

  • encoding – Audio encoding format. Defaults to “linear16”.

  • settings – Runtime-updatable settings. When provided alongside deprecated parameters, settings values take precedence.

  • **kwargs – Additional arguments passed to the parent TTSService.
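The precedence rule between settings and the deprecated voice argument can be pictured with a small stand-alone sketch. It does not import Pipecat: `resolve_option` and `NOT_GIVEN` are hypothetical names made up for this illustration, and the service may resolve values differently internally. Only the default voice name is taken from the parameter description above; "voice-a" and "voice-b" are placeholders.

```python
from typing import Any

# Hypothetical sentinel meaning "the caller did not set this field".
NOT_GIVEN = object()


def resolve_option(deprecated_value: Any, settings_value: Any, default: Any) -> Any:
    """Pick a value: settings first, then the deprecated argument, then the default."""
    if settings_value is not NOT_GIVEN:
        return settings_value  # settings take precedence when provided
    if deprecated_value is not None:
        return deprecated_value  # fall back to the deprecated keyword argument
    return default  # neither was supplied


# Both supplied: the settings value is used.
both = resolve_option("voice-a", "voice-b", "aura-2-helena-en")
# Only the deprecated argument supplied: it is still honored.
deprecated_only = resolve_option("voice-a", NOT_GIVEN, "aura-2-helena-en")
# Nothing supplied: the documented default applies.
neither = resolve_option(None, NOT_GIVEN, "aura-2-helena-en")
```

In new code, passing the voice through settings alone avoids relying on this fallback.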

can_generate_metrics() → bool[source]

Check if this service can generate processing metrics.

Returns:

True, as the Deepgram SageMaker TTS service supports metrics generation.

async start(frame: StartFrame)[source]

Start the Deepgram SageMaker TTS service.

Parameters:

frame – The start frame containing initialization parameters.

async stop(frame: EndFrame)[source]

Stop the Deepgram SageMaker TTS service.

Parameters:

frame – The end frame.

async cancel(frame: CancelFrame)[source]

Cancel the Deepgram SageMaker TTS service.

Parameters:

frame – The cancel frame.

async on_audio_context_interrupted(context_id: str)[source]

Called when an audio context is cancelled due to an interruption.

Parameters:

context_id – The ID of the audio context that was interrupted, or None if no context was active at the time.

async flush_audio(context_id: str | None = None)[source]

Flush any pending audio synthesis by sending a Flush command.

This should be called when the LLM finishes a complete response to force generation of audio from Deepgram’s internal text buffer.
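Why a flush matters can be shown with a toy model of a text buffer that holds text until it is told to flush. This is only an illustration under that assumption: `TextBufferModel` is a hypothetical class, not part of Pipecat or Deepgram, and the real server may decide on its own when to start generating audio.

```python
class TextBufferModel:
    """Toy model of a text buffer that releases its content only on flush."""

    def __init__(self) -> None:
        self._pending: list[str] = []

    def speak(self, text: str) -> None:
        # Text is accumulated; nothing is synthesized yet in this model.
        self._pending.append(text)

    def flush(self) -> str:
        # Release everything buffered so far and reset the buffer.
        text = "".join(self._pending)
        self._pending.clear()
        return text


buf = TextBufferModel()
buf.speak("Hello, ")
buf.speak("world.")
first_flush = buf.flush()   # all buffered text is released at once
second_flush = buf.flush()  # nothing is left after the first flush
```

In this model, text from the end of a response would stay in the buffer until a flush is sent.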

async run_tts(text: str, context_id: str) → AsyncGenerator[Frame | None, None][source]

Generate speech from text using Deepgram TTS on SageMaker.

Parameters:
  • text – The text to synthesize into speech.

  • context_id – The context ID for tracking audio frames.

Yields:

Frame – TTSStartedFrame, then None (audio comes asynchronously via the response processor).
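The "start marker, then None" pattern, with audio delivered on a separate path, can be sketched using only asyncio. Everything here is hypothetical and stands in for the real pieces: the string "started" plays the role of TTSStartedFrame, and `response_processor` is an invented background task, not the service's actual implementation.

```python
import asyncio
from typing import AsyncGenerator, List, Optional, Tuple


async def run_tts_sketch(text: str, requests: asyncio.Queue) -> AsyncGenerator[Optional[str], None]:
    """Yield a start marker, hand the text off, then yield None."""
    yield "started"           # stands in for TTSStartedFrame
    await requests.put(text)  # hand-off; a real service would send to the endpoint
    yield None                # no audio is yielded from the generator itself


async def response_processor(requests: asyncio.Queue, audio_out: List[bytes]) -> None:
    """Hypothetical background task that turns requests into 'audio'."""
    while True:
        text = await requests.get()
        if text is None:  # sentinel used by this sketch to stop the task
            break
        audio_out.append(f"audio:{text}".encode())


async def demo() -> Tuple[list, List[bytes]]:
    requests: asyncio.Queue = asyncio.Queue()
    audio_out: List[bytes] = []
    task = asyncio.create_task(response_processor(requests, audio_out))
    yielded = [item async for item in run_tts_sketch("hi", requests)]
    await requests.put(None)  # stop the background task
    await task
    return yielded, audio_out


yielded, audio_out = asyncio.run(demo())
```

The generator's consumer sees only the start marker and None; the audio shows up in `audio_out`, produced by the separate task.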