tts
Deepgram text-to-speech service for AWS SageMaker.
This module provides a Pipecat TTS service that connects to Deepgram models deployed on AWS SageMaker endpoints. It uses HTTP/2 bidirectional streaming for low-latency, real-time speech synthesis, with support for interruptions and streaming audio output.
- class pipecat.services.deepgram.sagemaker.tts.DeepgramSageMakerTTSSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any]=<factory>, voice: str | None | _NotGiven = <factory>, language: Language | str | None | _NotGiven = <factory>)[source]
Bases: TTSSettings
Settings for DeepgramSageMakerTTSService.
- class pipecat.services.deepgram.sagemaker.tts.DeepgramSageMakerTTSService(*, endpoint_name: str, region: str, voice: str | None = None, sample_rate: int | None = None, encoding: str = 'linear16', settings: DeepgramSageMakerTTSSettings | None = None, **kwargs)[source]
Bases: TTSService
Deepgram text-to-speech service for AWS SageMaker.
Provides real-time speech synthesis using Deepgram models deployed on AWS SageMaker endpoints. Uses HTTP/2 bidirectional streaming for low-latency audio generation with support for interruptions via the Clear message.
Requirements:
AWS credentials configured (via environment variables, AWS CLI, or instance metadata)
A deployed SageMaker endpoint with a Deepgram TTS model: https://developers.deepgram.com/docs/deploy-amazon-sagemaker
pipecat-ai[sagemaker] installed
Example:
    tts = DeepgramSageMakerTTSService(
        endpoint_name="my-deepgram-tts-endpoint",
        region="us-east-2",
        settings=DeepgramSageMakerTTSService.Settings(
            voice="aura-2-helena-en",
        ),
    )
- Settings
alias of DeepgramSageMakerTTSSettings
- __init__(*, endpoint_name: str, region: str, voice: str | None = None, sample_rate: int | None = None, encoding: str = 'linear16', settings: DeepgramSageMakerTTSSettings | None = None, **kwargs)[source]
Initialize the Deepgram SageMaker TTS service.
- Parameters:
endpoint_name – Name of the SageMaker endpoint with a Deepgram TTS model deployed (e.g., “my-deepgram-tts-endpoint”).
region – AWS region where the endpoint is deployed (e.g., “us-east-2”).
voice – Voice model to use for synthesis. Defaults to “aura-2-helena-en”.
Deprecated since version 0.0.105: Use settings=DeepgramSageMakerTTSService.Settings(voice=...) instead.
sample_rate – Audio sample rate in Hz. If None, uses the value from StartFrame.
encoding – Audio encoding format. Defaults to “linear16”.
settings – Runtime-updatable settings. When provided alongside deprecated parameters, settings values take precedence.
**kwargs – Additional arguments passed to the parent TTSService.
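The precedence rules above can be sketched in plain Python. This is not Pipecat's implementation: `SketchSettings`, `resolve_voice`, and `resolve_sample_rate` are hypothetical names, and `None` is used here as a simplified "not set" marker. The sketch only illustrates that a `settings` value overrides the deprecated `voice` argument, that the documented default applies when neither is set, and that a `None` sample rate falls back to the value carried by the start frame.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_VOICE = "aura-2-helena-en"  # default documented for the voice parameter


@dataclass
class SketchSettings:
    """Hypothetical stand-in for DeepgramSageMakerTTSSettings (voice only)."""

    voice: Optional[str] = None


def resolve_voice(
    voice: Optional[str] = None,
    settings: Optional[SketchSettings] = None,
) -> str:
    """Pick the effective voice: settings first, then deprecated arg, then default."""
    if settings is not None and settings.voice is not None:
        return settings.voice
    if voice is not None:
        return voice
    return DEFAULT_VOICE


def resolve_sample_rate(sample_rate: Optional[int], start_frame_rate: int) -> int:
    """Use the explicit sample rate, or fall back to the start frame's value."""
    return sample_rate if sample_rate is not None else start_frame_rate
```

Under these assumptions, `resolve_voice(voice="aura-2-thalia-en", settings=SketchSettings(voice="aura-2-helena-en"))` selects the value from `settings`, which mirrors the documented behavior when both are supplied.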
- can_generate_metrics() → bool[source]
Check if this service can generate processing metrics.
- Returns:
True, as Deepgram SageMaker TTS service supports metrics generation.
- async start(frame: StartFrame)[source]
Start the Deepgram SageMaker TTS service.
- Parameters:
frame – The start frame containing initialization parameters.
- async stop(frame: EndFrame)[source]
Stop the Deepgram SageMaker TTS service.
- Parameters:
frame – The end frame.
- async cancel(frame: CancelFrame)[source]
Cancel the Deepgram SageMaker TTS service.
- Parameters:
frame – The cancel frame.
- async on_audio_context_interrupted(context_id: str)[source]
Called when an audio context is cancelled due to an interruption.
- Parameters:
context_id – The ID of the audio context that was interrupted, or None if no context was active at the time.
- async flush_audio(context_id: str | None = None)[source]
Flush any pending audio synthesis by sending a Flush command.
This should be called when the LLM finishes a complete response to force generation of audio from Deepgram’s internal text buffer.
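As a rough illustration of the buffering behavior described above, the sketch below builds the JSON control messages used by Deepgram's streaming TTS protocol (`Speak`, `Flush`, `Clear`). It assumes the SageMaker-hosted model accepts the same message shapes as Deepgram's hosted streaming API. How this service actually serializes and sends them over the HTTP/2 stream is not covered in this reference, and the helper names are hypothetical.

```python
import json


def speak_message(text: str) -> str:
    """Queue text in the model's internal buffer (assumed message shape)."""
    return json.dumps({"type": "Speak", "text": text})


def flush_message() -> str:
    """Ask the model to synthesize whatever text is currently buffered."""
    return json.dumps({"type": "Flush"})


def clear_message() -> str:
    """Discard buffered text, for example when the user interrupts the bot."""
    return json.dumps({"type": "Clear"})


# A complete LLM response: send each sentence, then flush once at the end.
outgoing = [speak_message(s) for s in ("Hello there.", "How can I help?")]
outgoing.append(flush_message())
```

Flushing once per complete response, rather than after every sentence, lets the model decide how to chunk the buffered text while still ensuring that the tail of the response is synthesized.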
- async run_tts(text: str, context_id: str) → AsyncGenerator[Frame | None, None][source]
Generate speech from text using Deepgram TTS on SageMaker.
- Parameters:
text – The text to synthesize into speech.
context_id – The context ID for tracking audio frames.
- Yields:
Frame – TTSStartedFrame, then None (audio comes asynchronously via the response processor).
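The yield pattern above, where the generator announces the start and the audio arrives through a separate path, can be sketched with plain asyncio. Everything here is a hypothetical stand-in (the `"started"` marker, the two queues, the fake response processor that turns each word into one chunk). It does not use Pipecat's frame classes or the service's real response processor.

```python
import asyncio
from typing import AsyncGenerator, Optional


async def sketch_run_tts(
    text: str, pending_text: asyncio.Queue
) -> AsyncGenerator[Optional[str], None]:
    """Yield a start marker, hand the text off, then yield None (no audio here)."""
    yield "started"  # stands in for TTSStartedFrame
    await pending_text.put(text)  # stands in for sending text on the stream
    yield None  # audio is delivered elsewhere, not by this generator


async def fake_response_processor(
    pending_text: asyncio.Queue, audio_out: asyncio.Queue
) -> None:
    """Stand-in for the background reader that receives synthesized audio."""
    text = await pending_text.get()
    for word in text.split():  # pretend each word becomes one audio chunk
        await audio_out.put(word.encode())
    await audio_out.put(None)  # end-of-audio sentinel used only in this sketch


async def main() -> tuple:
    pending_text: asyncio.Queue = asyncio.Queue()
    audio_out: asyncio.Queue = asyncio.Queue()
    reader = asyncio.create_task(fake_response_processor(pending_text, audio_out))

    # The generator itself produces only the start marker and None.
    yielded = [item async for item in sketch_run_tts("hello world", pending_text)]

    # The audio shows up on the separate queue fed by the background task.
    chunks = []
    while (chunk := await audio_out.get()) is not None:
        chunks.append(chunk)
    await reader
    return yielded, chunks


yielded, chunks = asyncio.run(main())
```

Separating the two paths means the generator returns as soon as the text is handed off, so a caller is never blocked waiting for synthesis to finish.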