tts

AWS Polly text-to-speech service implementation.

This module provides integration with Amazon Polly for text-to-speech synthesis, supporting multiple languages, voices, and SSML features.

pipecat.services.aws.tts.language_to_aws_language(language: Language) → str | None

Convert a Language enum to AWS Polly language code.

Parameters:

language – The Language enum value to convert.

Returns:

The corresponding AWS Polly language code, or None if not supported.
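The lookup described above can be pictured as a table with a None fallback. The following is a minimal sketch, not pipecat's implementation: the `Language` stand-in enum, the `_POLLY_CODES` table, and the name `sketch_language_to_polly_code` are all assumptions made for illustration, and the real mapping may cover different languages.

```python
from enum import Enum
from typing import Optional


class Language(str, Enum):
    """Stand-in for pipecat's Language enum (hypothetical subset)."""

    EN = "en"
    EN_GB = "en-GB"
    FR = "fr"
    SW = "sw"  # left unmapped here to show the None path


# Hypothetical table; the actual one in pipecat is not shown in this reference.
_POLLY_CODES = {
    Language.EN: "en-US",
    Language.EN_GB: "en-GB",
    Language.FR: "fr-FR",
}


def sketch_language_to_polly_code(language: Language) -> Optional[str]:
    """Return the Polly language code, or None if the language is unmapped."""
    return _POLLY_CODES.get(language)
```

Returning None rather than raising lets a caller decide whether to fall back to a default language or report the unsupported value.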

class pipecat.services.aws.tts.AWSPollyTTSSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any] = <factory>, voice: str | None | _NotGiven = <factory>, language: Language | str | None | _NotGiven = <factory>, engine: str | None | _NotGiven = <factory>, pitch: str | None | _NotGiven = <factory>, rate: str | None | _NotGiven = <factory>, volume: str | None | _NotGiven = <factory>, lexicon_names: list[str] | None | _NotGiven = <factory>)

Bases: TTSSettings

Settings for AWSPollyTTSService.

Parameters:
  • engine – TTS engine to use (‘standard’, ‘neural’, etc.).

  • pitch – Voice pitch adjustment (for standard engine only).

  • rate – Speech rate adjustment.

  • volume – Voice volume adjustment.

  • lexicon_names – List of pronunciation lexicons to apply.
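To make the pitch, rate, and volume fields more concrete, the sketch below shows one way such values could be expressed as an SSML `<prosody>` element. It is an illustration under assumptions, not the service's actual code: the function name `build_prosody_ssml` is hypothetical, and the only behavior taken from this reference is that pitch applies to the standard engine only.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_prosody_ssml(
    text: str,
    engine: str = "standard",
    pitch: Optional[str] = None,
    rate: Optional[str] = None,
    volume: Optional[str] = None,
) -> str:
    """Wrap text in <speak>/<prosody> tags, skipping attributes that are unset."""
    attrs = []
    # Per the parameter notes above, pitch only applies to the standard engine.
    if pitch is not None and engine == "standard":
        attrs.append(f"pitch={quoteattr(pitch)}")
    if rate is not None:
        attrs.append(f"rate={quoteattr(rate)}")
    if volume is not None:
        attrs.append(f"volume={quoteattr(volume)}")

    body = escape(text)  # escape &, <, > so the text cannot break the markup
    if attrs:
        body = f"<prosody {' '.join(attrs)}>{body}</prosody>"
    return f"<speak>{body}</speak>"
```

For example, `build_prosody_ssml("Hello", pitch="high", volume="loud")` produces a prosody element carrying both attributes, while the same call with `engine="neural"` drops the pitch attribute.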

engine: str | None | _NotGiven
pitch: str | None | _NotGiven
rate: str | None | _NotGiven
volume: str | None | _NotGiven
lexicon_names: list[str] | None | _NotGiven

class pipecat.services.aws.tts.AWSPollyTTSService(*, api_key: str | None = None, aws_access_key_id: str | None = None, aws_session_token: str | None = None, region: str | None = None, voice_id: str | None = None, sample_rate: int | None = None, params: InputParams | None = None, settings: AWSPollyTTSSettings | None = None, **kwargs)

Bases: TTSService

AWS Polly text-to-speech service.

Provides text-to-speech synthesis using Amazon Polly with support for multiple languages, voices, SSML features, and voice customization options including prosody controls.

Settings

alias of AWSPollyTTSSettings

class InputParams(*, engine: str | None = None, language: Language | None = Language.EN, pitch: str | None = None, rate: str | None = None, volume: str | None = None, lexicon_names: list[str] | None = None)

Bases: BaseModel

Input parameters for AWS Polly TTS configuration.

Deprecated since version 0.0.105: Use AWSPollyTTSService.Settings directly via the settings parameter instead.

Parameters:
  • engine – TTS engine to use (‘standard’, ‘neural’, etc.).

  • language – Language for synthesis. Defaults to English.

  • pitch – Voice pitch adjustment (for standard engine only).

  • rate – Speech rate adjustment.

  • volume – Voice volume adjustment.

  • lexicon_names – List of pronunciation lexicons to apply.

engine: str | None
language: Language | None
pitch: str | None
rate: str | None
volume: str | None
lexicon_names: list[str] | None

__init__(*, api_key: str | None = None, aws_access_key_id: str | None = None, aws_session_token: str | None = None, region: str | None = None, voice_id: str | None = None, sample_rate: int | None = None, params: InputParams | None = None, settings: AWSPollyTTSSettings | None = None, **kwargs)

Initializes the AWS Polly TTS service.

Parameters:
  • api_key – AWS secret access key. If None, uses the AWS_SECRET_ACCESS_KEY environment variable.

  • aws_access_key_id – AWS access key ID. If None, uses the AWS_ACCESS_KEY_ID environment variable.

  • aws_session_token – AWS session token for temporary credentials.

  • region – AWS region for the Polly service. Defaults to ‘us-east-1’.

  • voice_id

    Voice ID to use for synthesis. Defaults to ‘Joanna’.

    Deprecated since version 0.0.105: Use settings=AWSPollyTTSService.Settings(voice=...) instead.

  • sample_rate – Audio sample rate. If None, uses the service default.

  • params

    Additional input parameters for voice customization.

    Deprecated since version 0.0.105: Use settings=AWSPollyTTSService.Settings(...) instead.

  • settings – Runtime-updatable settings. When provided alongside deprecated parameters, settings values take precedence.

  • **kwargs – Additional arguments passed to the parent TTSService class.
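The precedence rule for settings can be sketched as a three-step merge: start from defaults, apply the deprecated argument, then let any explicitly given settings field win. Everything in this block is a hypothetical stand-in (`SketchSettings`, `resolve_settings`, the local `_NotGiven` sentinel, and the None default for engine); only the ‘Joanna’ default voice and the precedence order come from the parameter descriptions above.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


class _NotGiven:
    """Sentinel type meaning "the caller did not set this field"."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


@dataclass
class SketchSettings:
    """Hypothetical stand-in for AWSPollyTTSSettings with two fields."""

    voice: Any = NOT_GIVEN
    engine: Any = NOT_GIVEN


def resolve_settings(
    voice_id: Optional[str] = None,
    settings: Optional[SketchSettings] = None,
) -> SketchSettings:
    """Start from defaults, apply the deprecated argument, then let settings win."""
    resolved = SketchSettings(voice="Joanna", engine=None)
    if voice_id is not None:
        resolved.voice = voice_id  # deprecated path
    if settings is not None:
        for f in fields(settings):
            value = getattr(settings, f.name)
            if not isinstance(value, _NotGiven):
                setattr(resolved, f.name, value)  # explicit settings take precedence
    return resolved
```

A sentinel distinct from None matters here because None can be a meaningful value for a field, so it cannot also signal "not provided".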

can_generate_metrics() → bool

Check if this service can generate processing metrics.

Returns:

True, as the AWS Polly service supports metrics generation.

language_to_service_language(language: Language) → str | None

Convert a Language enum to AWS Polly language format.

Parameters:

language – The language to convert.

Returns:

The AWS Polly-specific language code, or None if not supported.

async run_tts(text: str, context_id: str) → AsyncGenerator[Frame, None]

Generate speech from text using AWS Polly.

Parameters:
  • text – The text to synthesize into speech.

  • context_id – The context ID for tracking audio frames.

Yields:

Frame – Audio frames containing the synthesized speech.
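Because run_tts is an async generator, callers consume it with `async for` rather than awaiting a single result. The sketch below imitates that shape without calling AWS: `SketchAudioFrame`, `sketch_run_tts`, `collect`, the 16000 Hz sample rate, and the chunk size are all assumptions for illustration, and the "audio" is just the UTF-8 bytes of the input text.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator


@dataclass
class SketchAudioFrame:
    """Hypothetical stand-in for a pipecat audio frame."""

    audio: bytes
    sample_rate: int
    context_id: str


async def sketch_run_tts(
    text: str, context_id: str, chunk_size: int = 4
) -> AsyncGenerator[SketchAudioFrame, None]:
    """Yield fixed-size chunks of fake audio, tagged with the context ID."""
    fake_pcm = text.encode("utf-8")  # placeholder for bytes returned by Polly
    for start in range(0, len(fake_pcm), chunk_size):
        yield SketchAudioFrame(fake_pcm[start:start + chunk_size], 16000, context_id)
        await asyncio.sleep(0)  # give other tasks a chance to run


async def collect(text: str, context_id: str) -> list:
    """Drain the generator the way a caller would, with `async for`."""
    return [frame async for frame in sketch_run_tts(text, context_id)]


frames = asyncio.run(collect("hello world", "ctx-1"))
```

Yielding frames as they become available, instead of returning one large buffer, lets downstream processors start playback before synthesis has finished.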