tts
Async text-to-speech service implementations.
- pipecat.services.asyncai.tts.language_to_async_language(language: Language) -> str | None
Convert a Language enum to Async language code.
- Parameters:
language – The Language enum value to convert.
- Returns:
The corresponding Async language code, or None if not supported.
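The contract described above (a code string for a supported language, None otherwise) can be sketched with a small stand-in. This is an illustration only: the `Language` members and the mapping table below are hypothetical placeholders, not pipecat's real enum and not the set of languages Async actually supports.

```python
from enum import Enum
from typing import Optional


class Language(Enum):
    """Hypothetical stand-in for pipecat's Language enum."""

    EN = "en"
    FR = "fr"
    XX = "xx"  # placeholder for a language the service does not support


# Hypothetical mapping table; the real supported set may differ.
_SUPPORTED = {Language.EN: "en", Language.FR: "fr"}


def to_service_language(language: Language) -> Optional[str]:
    """Return the service language code, or None if unsupported."""
    return _SUPPORTED.get(language)


code = to_service_language(Language.XX)
if code is None:
    # Callers should handle the unsupported case explicitly,
    # here by falling back to an assumed default.
    code = "en"
```

The point of the sketch is the None check: a caller that passes the result straight to the service without handling None would send an invalid language value.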
- class pipecat.services.asyncai.tts.AsyncAITTSSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any] = <factory>, voice: str | None | _NotGiven = <factory>, language: Language | str | None | _NotGiven = <factory>)
Bases: TTSSettings
Settings for AsyncAITTSService and AsyncAIHttpTTSService.
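The field types `str | None | _NotGiven` suggest a sentinel that separates "not provided" from an explicit None. The sketch below shows one plausible way such a sentinel supports partial updates. Everything in it (`SketchSettings`, `NOT_GIVEN`, `apply_update`) is a hypothetical stand-in written for this illustration; pipecat's own merge logic is not shown in this reference and may work differently.

```python
from dataclasses import dataclass, field, fields
from typing import Union


class _NotGiven:
    """Hypothetical sentinel type meaning 'this field was not provided'."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


@dataclass
class SketchSettings:
    """Stand-in that mirrors the documented field names only."""

    model: Union[str, None, _NotGiven] = field(default_factory=lambda: NOT_GIVEN)
    voice: Union[str, None, _NotGiven] = field(default_factory=lambda: NOT_GIVEN)
    language: Union[str, None, _NotGiven] = field(default_factory=lambda: NOT_GIVEN)
    extra: dict = field(default_factory=dict)


def apply_update(current: SketchSettings, update: SketchSettings) -> SketchSettings:
    """Copy only the fields that were explicitly given in `update`."""
    merged = SketchSettings(**{f.name: getattr(current, f.name) for f in fields(current)})
    for f in fields(update):
        value = getattr(update, f.name)
        if f.name == "extra":
            # Merge extras key by key instead of replacing the whole dict.
            merged.extra = {**current.extra, **update.extra}
        elif not isinstance(value, _NotGiven):
            # An explicit None is a real value and overrides the current one.
            setattr(merged, f.name, value)
    return merged


base = SketchSettings(model="async_flash_v1.0", voice="voice-a", language="en")
merged = apply_update(base, SketchSettings(voice="voice-b", language=None))
```

In this sketch the update leaves `model` untouched because it was never given, while `language=None` is applied as a deliberate reset.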
- class pipecat.services.asyncai.tts.AsyncAITTSService(*, api_key: str, voice_id: str | None = None, version: str = 'v1', url: str = 'wss://api.async.com/text_to_speech/websocket/ws', model: str | None = None, sample_rate: int | None = None, encoding: str = 'pcm_s16le', container: str = 'raw', params: InputParams | None = None, settings: AsyncAITTSSettings | None = None, aggregate_sentences: bool | None = None, text_aggregation_mode: TextAggregationMode | None = None, **kwargs)
Bases: WebsocketTTSService
Async TTS service with WebSocket streaming.
Provides text-to-speech using Async’s streaming WebSocket API.
- Settings
alias of AsyncAITTSSettings
- class InputParams(*, language: Language | None = None)
Bases: BaseModel
Input parameters for Async TTS configuration.
Deprecated since version 0.0.105: Use AsyncAITTSService.Settings directly via the settings parameter instead.
- Parameters:
language – Language to use for synthesis.
- __init__(*, api_key: str, voice_id: str | None = None, version: str = 'v1', url: str = 'wss://api.async.com/text_to_speech/websocket/ws', model: str | None = None, sample_rate: int | None = None, encoding: str = 'pcm_s16le', container: str = 'raw', params: InputParams | None = None, settings: AsyncAITTSSettings | None = None, aggregate_sentences: bool | None = None, text_aggregation_mode: TextAggregationMode | None = None, **kwargs)
Initialize the Async TTS service.
- Parameters:
api_key – Async API key.
voice_id – UUID of the voice to use for synthesis. See docs for a full list: https://docs.async.com/list-voices-16699698e0
Deprecated since version 0.0.105: Use settings=AsyncAITTSService.Settings(voice=...) instead.
version – Async API version.
url – WebSocket URL for Async TTS API.
model – TTS model to use (e.g., “async_flash_v1.0”).
Deprecated since version 0.0.105: Use settings=AsyncAITTSService.Settings(model=...) instead.
sample_rate – Audio sample rate.
encoding – Audio encoding format.
container – Audio container format.
params – Additional input parameters for voice customization.
Deprecated since version 0.0.105: Use settings=AsyncAITTSService.Settings(...) instead.
settings – Runtime-updatable settings. When provided alongside deprecated parameters, settings values take precedence.
aggregate_sentences – Deprecated since version 0.0.104: Use text_aggregation_mode instead.
text_aggregation_mode – How to aggregate text before synthesis.
**kwargs – Additional arguments passed to the parent service.
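Following the deprecation notes above, a constructor call that avoids `voice_id`, `model`, and `params` might look like the sketch below. It relies only on the signature documented on this page. The helper name `build_async_tts` is hypothetical, and the pipecat import is done lazily inside the function so the snippet loads even where pipecat is not installed.

```python
def build_async_tts(api_key: str, voice_id: str):
    """Create an AsyncAITTSService using the non-deprecated settings argument."""
    # Lazy import: requires the pipecat package with the asyncai service available.
    from pipecat.services.asyncai.tts import AsyncAITTSService

    return AsyncAITTSService(
        api_key=api_key,
        settings=AsyncAITTSService.Settings(
            voice=voice_id,  # replaces the deprecated voice_id=... argument
            model="async_flash_v1.0",  # example model name quoted in the parameter list
        ),
    )
```

Putting the voice and model in `settings` keeps them in the runtime-updatable settings object, which takes precedence if deprecated parameters are also passed.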
- can_generate_metrics() -> bool
Check if this service can generate processing metrics.
- Returns:
True, as the Async service supports metrics generation.
- language_to_service_language(language: Language) -> str | None
Convert a Language enum to Async language format.
- Parameters:
language – The language to convert.
- Returns:
The Async-specific language code, or None if not supported.
- async start(frame: StartFrame)
Start the Async TTS service.
- Parameters:
frame – The start frame containing initialization parameters.
- async cancel(frame: CancelFrame)
Cancel the Async TTS service.
- Parameters:
frame – The cancel frame.
- async flush_audio(context_id: str | None = None)
Flush any pending audio.
- Parameters:
context_id – The specific context to flush. If None, falls back to the currently active context.
- async push_frame(frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM)
Push a frame downstream with special handling for stop conditions.
- Parameters:
frame – The frame to push.
direction – The direction to push the frame.
- async on_audio_context_interrupted(context_id: str)
Close the Async AI context when the bot is interrupted.
- async on_audio_context_completed(context_id: str)
Close the Async AI context after all audio has been played.
Async AI does not send a server-side signal when a context is exhausted, so Pipecat must explicitly close it with close_context: True to free server-side resources.
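As a rough illustration of the explicit close described above, the sketch below serializes a close message as JSON. Only the `close_context: True` flag comes from this page; the `context_id` key, the overall message shape, and the helper name are assumptions made for the example, and the actual wire format may differ.

```python
import json


def build_close_context_message(context_id: str) -> str:
    """Serialize a hypothetical 'close this context' WebSocket message."""
    # "close_context": True is taken from the method description;
    # the "context_id" key is an assumed field name for illustration.
    return json.dumps({"context_id": context_id, "close_context": True})


message = build_close_context_message("ctx-123")
```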
- async run_tts(text: str, context_id: str) -> AsyncGenerator[Frame | None, None]
Generate speech from text using the Async API WebSocket endpoint.
- Parameters:
text – The text to synthesize into speech.
context_id – The context ID for tracking audio frames.
- Yields:
Frame – Audio frames containing the synthesized speech.
- class pipecat.services.asyncai.tts.AsyncAIHttpTTSService(*, api_key: str, voice_id: str | None = None, aiohttp_session: ClientSession, model: str | None = None, url: str = 'https://api.async.com', version: str = 'v1', sample_rate: int | None = None, encoding: str = 'pcm_s16le', container: str = 'raw', params: InputParams | None = None, settings: AsyncAITTSSettings | None = None, **kwargs)
Bases: TTSService
HTTP-based Async TTS service.
Provides text-to-speech using Async’s HTTP streaming API for simpler, non-WebSocket integration. Suitable for use cases where a streaming WebSocket connection is not required or desired.
- Settings
alias of AsyncAITTSSettings
- class InputParams(*, language: Language | None = None)
Bases: BaseModel
Input parameters for Async API.
Deprecated since version 0.0.105: Use AsyncAIHttpTTSService.Settings directly via the settings parameter instead.
- Parameters:
language – Language to use for synthesis.
- __init__(*, api_key: str, voice_id: str | None = None, aiohttp_session: ClientSession, model: str | None = None, url: str = 'https://api.async.com', version: str = 'v1', sample_rate: int | None = None, encoding: str = 'pcm_s16le', container: str = 'raw', params: InputParams | None = None, settings: AsyncAITTSSettings | None = None, **kwargs)
Initialize the Async TTS service.
- Parameters:
api_key – Async API key.
voice_id – ID of the voice to use for synthesis.
Deprecated since version 0.0.105: Use settings=AsyncAIHttpTTSService.Settings(voice=...) instead.
aiohttp_session – An aiohttp session for making HTTP requests.
model – TTS model to use (e.g., “async_flash_v1.0”).
Deprecated since version 0.0.105: Use settings=AsyncAIHttpTTSService.Settings(model=...) instead.
url – Base URL for Async API.
version – API version string for Async API.
sample_rate – Audio sample rate.
encoding – Audio encoding format.
container – Audio container format.
params – Additional input parameters for voice customization.
Deprecated since version 0.0.105: Use settings=AsyncAIHttpTTSService.Settings(...) instead.
settings – Runtime-updatable settings. When provided alongside deprecated parameters, settings values take precedence.
**kwargs – Additional arguments passed to the parent TTSService.
- can_generate_metrics() -> bool
Check if this service can generate processing metrics.
- Returns:
True, as the Async HTTP service supports metrics generation.
- language_to_service_language(language: Language) -> str | None
Convert a Language enum to Async language format.
- Parameters:
language – The language to convert.
- Returns:
The Async-specific language code, or None if not supported.
- async start(frame: StartFrame)
Start the Async HTTP TTS service.
- Parameters:
frame – The start frame containing initialization parameters.
- async run_tts(text: str, context_id: str) -> AsyncGenerator[Frame | None, None]
Generate speech from text using Async’s HTTP streaming API.
- Parameters:
text – The text to synthesize into speech.
context_id – The context ID for tracking audio frames.
- Yields:
Frame – Audio frames containing the synthesized speech.