tts

AWS Polly text-to-speech service implementation.

This module provides integration with Amazon Polly for text-to-speech synthesis, supporting multiple languages, voices, and SSML features.

pipecat.services.aws.tts.language_to_aws_language(language: Language) → str | None

Convert a Language enum to AWS Polly language code.

Parameters: language – The Language enum value to convert.
Returns: The corresponding AWS Polly language code, or None if the language is not supported.
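The lookup-or-None contract described above can be sketched as follows. This is a minimal illustration, not pipecat's implementation: the `Language` enum here is a hypothetical stand-in with a handful of members, the mapping table is an illustrative subset, and mapping bare `en` to `en-US` is an assumption.

```python
from enum import Enum
from typing import Optional


class Language(Enum):
    """Hypothetical stand-in for pipecat's Language enum (small subset)."""

    EN = "en"
    EN_US = "en-US"
    FR = "fr"
    JA = "ja"
    TLH = "tlh"  # deliberately left out of the table below


# Illustrative subset only; the table shipped with pipecat may differ.
_POLLY_CODES = {
    Language.EN: "en-US",  # assumption: bare "en" falls back to US English
    Language.EN_US: "en-US",
    Language.FR: "fr-FR",
    Language.JA: "ja-JP",
}


def to_polly_language(language: Language) -> Optional[str]:
    """Return the Polly language code, or None when there is no mapping."""
    return _POLLY_CODES.get(language)
```

Callers are expected to handle the `None` case explicitly, for example by falling back to a default language before calling Polly.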

class pipecat.services.aws.tts.AWSPollyTTSSettings

Bases: TTSSettings

Settings for AWSPollyTTSService.

Parameters:

engine – TTS engine to use (‘standard’, ‘neural’, etc.).
pitch – Voice pitch adjustment (for standard engine only).
rate – Speech rate adjustment.
volume – Voice volume adjustment.
lexicon_names – List of pronunciation lexicons to apply.

engine: str | None | _NotGiven

pitch: str | None | _NotGiven

rate: str | None | _NotGiven

volume: str | None | _NotGiven

lexicon_names: list[str] | None | _NotGiven
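One way to picture how the prosody fields could reach Polly is as an SSML `<prosody>` wrapper around the text. The sketch below is an assumption about the general technique, not the service's actual code: `build_ssml` is a hypothetical helper, and it drops `pitch` for non-standard engines to mirror the "standard engine only" note above.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_ssml(
    text: str,
    engine: str = "standard",
    pitch: Optional[str] = None,
    rate: Optional[str] = None,
    volume: Optional[str] = None,
) -> str:
    """Wrap text in <speak>, adding <prosody> only for the fields that are set."""
    attrs = []
    # Pitch is only applied for the standard engine in this sketch.
    if pitch is not None and engine == "standard":
        attrs.append(f"pitch={quoteattr(pitch)}")
    if rate is not None:
        attrs.append(f"rate={quoteattr(rate)}")
    if volume is not None:
        attrs.append(f"volume={quoteattr(volume)}")

    body = escape(text)  # escape &, <, > so the text cannot break the markup
    if attrs:
        body = f"<prosody {' '.join(attrs)}>{body}</prosody>"
    return f"<speak>{body}</speak>"
```

With `engine="neural"`, a call such as `build_ssml("Hi", engine="neural", pitch="+10%", rate="slow")` produces a prosody tag carrying only the rate attribute.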

class pipecat.services.aws.tts.AWSPollyTTSService

Bases: TTSService

AWS Polly text-to-speech service.

Provides text-to-speech synthesis using Amazon Polly with support for multiple languages, voices, SSML features, and voice customization options including prosody controls.

Settings: alias of AWSPollyTTSSettings

Bases: BaseModel

Input parameters for AWS Polly TTS configuration.

Deprecated since version 0.0.105: Use AWSPollyTTSService.Settings directly via the settings parameter instead.

Parameters:

engine – TTS engine to use (‘standard’, ‘neural’, etc.).
language – Language for synthesis. Defaults to English.
pitch – Voice pitch adjustment (for standard engine only).
rate – Speech rate adjustment.
volume – Voice volume adjustment.
lexicon_names – List of pronunciation lexicons to apply.

engine: str | None

language: Language | None

pitch: str | None

rate: str | None

volume: str | None

lexicon_names: list[str] | None

Initializes the AWS Polly TTS service.

Parameters:

api_key – AWS secret access key. If None, uses AWS_SECRET_ACCESS_KEY environment variable.
aws_access_key_id – AWS access key ID. If None, uses AWS_ACCESS_KEY_ID environment variable.
aws_session_token – AWS session token for temporary credentials.
region – AWS region for Polly service. Defaults to ‘us-east-1’.
voice_id – Voice ID to use for synthesis. Defaults to ‘Joanna’. Deprecated since version 0.0.105: Use settings=AWSPollyTTSService.Settings(voice=...) instead.
sample_rate – Audio sample rate. If None, uses service default.
params – Additional input parameters for voice customization. Deprecated since version 0.0.105: Use settings=AWSPollyTTSService.Settings(...) instead.
settings – Runtime-updatable settings. When provided alongside deprecated parameters, settings values take precedence.
**kwargs – Additional arguments passed to parent TTSService class.
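The precedence rule for `settings` versus the deprecated parameters can be illustrated with a small merge over "not given" sentinels. Everything here is a hypothetical stand-in (`SketchSettings`, `NOT_GIVEN`, `merge`); it only shows the layering idea of defaults, then deprecated arguments, then `settings`, and is not the library's code.

```python
from dataclasses import dataclass, fields, replace
from typing import Any


class _NotGiven:
    """Sentinel type: distinguishes 'not provided' from an explicit None."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


@dataclass
class SketchSettings:
    """Hypothetical stand-in for the service settings (three fields only)."""

    voice: Any = NOT_GIVEN
    engine: Any = NOT_GIVEN
    rate: Any = NOT_GIVEN


def merge(base: SketchSettings, override: SketchSettings) -> SketchSettings:
    """Fields set on override win; NOT_GIVEN fields fall back to base."""
    given = {
        f.name: getattr(override, f.name)
        for f in fields(override)
        if getattr(override, f.name) is not NOT_GIVEN
    }
    return replace(base, **given)


# Layering: defaults, then deprecated arguments, then settings (highest priority).
defaults = SketchSettings(voice="Joanna", engine=None, rate=None)
from_deprecated = SketchSettings(voice="Matthew", rate="slow")
from_settings = SketchSettings(voice="Amy")
resolved = merge(merge(defaults, from_deprecated), from_settings)
```

In this sketch `resolved.voice` comes from `from_settings`, while `resolved.rate` survives from the deprecated layer because `settings` did not provide it. An explicit `None` still counts as given and overrides the lower layers.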

can_generate_metrics() → bool

Check if this service can generate processing metrics.

Returns: True, as the AWS Polly service supports metrics generation.

language_to_service_language(language: Language) → str | None

Convert a Language enum to AWS Polly language format.

Parameters: language – The language to convert.
Returns: The AWS Polly-specific language code, or None if the language is not supported.

async run_tts(text: str, context_id: str) → AsyncGenerator[Frame, None]

Generate speech from text using AWS Polly.

Parameters:

text – The text to synthesize into speech.
context_id – The context ID for tracking audio frames.

Yields:

Frame – Audio frames containing the synthesized speech.
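The async-generator shape of `run_tts` can be sketched without any AWS dependency. In the example below, `AudioChunk`, `fake_synthesize`, `run_tts_sketch`, and the chunk size are all hypothetical stand-ins; a real implementation would call Polly and yield pipecat frame types instead.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List


@dataclass
class AudioChunk:
    """Hypothetical stand-in for an audio frame."""

    audio: bytes
    sample_rate: int
    context_id: str


async def fake_synthesize(text: str) -> bytes:
    """Placeholder for the Polly request; returns deterministic fake audio."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return text.encode("utf-8") * 4


async def run_tts_sketch(
    text: str, context_id: str, sample_rate: int = 16000, chunk_size: int = 8
) -> AsyncGenerator[AudioChunk, None]:
    """Synthesize once, then yield the audio in fixed-size chunks."""
    audio = await fake_synthesize(text)
    for start in range(0, len(audio), chunk_size):
        yield AudioChunk(audio[start : start + chunk_size], sample_rate, context_id)


async def collect(text: str, context_id: str) -> List[AudioChunk]:
    """Drain the generator into a list, as a consumer would with `async for`."""
    return [frame async for frame in run_tts_sketch(text, context_id)]


frames = asyncio.run(collect("hello", "ctx-1"))
```

Each yielded chunk carries the same `context_id`, which is what lets downstream consumers associate audio with the request that produced it.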