tts

Piper TTS service implementation.

class pipecat.services.piper.tts.PiperTTSSettings

Bases: TTSSettings

Settings for PiperTTSService.

class pipecat.services.piper.tts.PiperTTSService(*, voice_id: str | None = None, download_dir: Path | None = None, force_redownload: bool = False, use_cuda: bool = False, settings: PiperTTSSettings | None = None, **kwargs)

Bases: TTSService

Piper TTS service implementation.

Provides local text-to-speech synthesis using Piper voice models. Automatically downloads voice models if not already present and resamples audio output to match the configured sample rate.
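To make the resampling step concrete, here is a minimal sketch that converts mono 16-bit PCM from one sample rate to another by linear interpolation. It is an illustration only: the function name `resample_pcm16` is hypothetical, and this is not claimed to be the algorithm PiperTTSService uses internally (production resamplers typically apply proper filtering).

```python
from array import array


def resample_pcm16(samples: array, src_rate: int, dst_rate: int) -> array:
    """Resample mono 16-bit PCM by linear interpolation (illustrative only)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) == 0:
        # Nothing to do: return a copy so the caller's buffer is not shared.
        return array("h", samples)

    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = array("h")
    for i in range(out_len):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)  # clamp at the final sample
        frac = pos - int(pos)
        value = samples[left] * (1.0 - frac) + samples[right] * frac
        out.append(int(round(value)))
    return out
```

For example, doubling the rate of a four-sample ramp inserts the midpoints between neighbouring samples and repeats the final sample at the end.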

Settings: alias of PiperTTSSettings

__init__(*, voice_id: str | None = None, download_dir: Path | None = None, force_redownload: bool = False, use_cuda: bool = False, settings: PiperTTSSettings | None = None, **kwargs)

Initialize the Piper TTS service.

Parameters:

voice_id – Piper voice model identifier (e.g. en_US-ryan-high). Deprecated since version 0.0.105: use settings=PiperTTSService.Settings(voice=...) instead.
download_dir – Directory for storing voice model files. Defaults to the current working directory.
force_redownload – Re-download the voice model even if it already exists.
use_cuda – Use CUDA for GPU-accelerated inference.
settings – Runtime-updatable settings. When provided alongside deprecated parameters, settings values take precedence.
**kwargs – Additional arguments passed to the parent TTSService.
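The parameters above can be combined as in the following sketch, which passes the voice through settings rather than the deprecated voice_id. The helper name `build_piper_tts` and the `piper_models` directory are assumptions made for illustration; only the constructor arguments and `PiperTTSService.Settings(voice=...)` come from this page.

```python
from pathlib import Path


def build_piper_tts(voice: str = "en_US-ryan-high", model_dir: Path = Path("piper_models")):
    """Create a PiperTTSService using the non-deprecated settings argument."""
    # Imported lazily so this module still loads where pipecat is not installed.
    from pipecat.services.piper.tts import PiperTTSService

    return PiperTTSService(
        settings=PiperTTSService.Settings(voice=voice),
        download_dir=model_dir,   # hypothetical directory; omit to use the working directory
        force_redownload=False,   # reuse a previously downloaded voice model
        use_cuda=False,           # set to True for GPU-accelerated inference
    )
```

Calling `build_piper_tts()` requires pipecat with Piper support installed. If the voice model is not already in `model_dir`, the service downloads it.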

can_generate_metrics() → bool

Check if this service can generate processing metrics.

Returns: True, as the Piper service supports metrics generation.

async run_tts(text: str, context_id: str) → AsyncGenerator[Frame, None]

Generate speech from text using Piper.

Parameters:

text – The text to convert to speech.
context_id – Unique identifier for this TTS context.

Yields:

Frame – Audio frames containing the synthesized speech and status frames.
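Because run_tts is an async generator, its frames are consumed with `async for`. The sketch below shows that pattern using a stand-in generator, `fake_run_tts`, which yields plain strings instead of real Frame objects. Both `fake_run_tts` and `collect_frames` are hypothetical names. In a typical pipecat application the pipeline calls run_tts, so application code rarely calls it directly.

```python
import asyncio
from typing import Any, AsyncGenerator, List


async def collect_frames(frames: AsyncGenerator[Any, None]) -> List[Any]:
    """Drain an async generator of frames into a list, preserving order."""
    return [frame async for frame in frames]


async def fake_run_tts(text: str, context_id: str) -> AsyncGenerator[str, None]:
    """Stand-in for run_tts: yields placeholder strings, not pipecat Frame objects."""
    yield "started"        # placeholder for a status frame
    yield f"audio:{text}"  # placeholder for an audio frame
    yield "stopped"        # placeholder for a status frame


def demo() -> List[str]:
    """Run the stand-in generator to completion and return what it yielded."""
    return asyncio.run(collect_frames(fake_run_tts("hello", "ctx-1")))
```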

class pipecat.services.piper.tts.PiperHttpTTSSettings

Bases: TTSSettings

Settings for PiperHttpTTSService.

class pipecat.services.piper.tts.PiperHttpTTSService(*, base_url: str, aiohttp_session: ClientSession, voice_id: str | None = None, settings: PiperHttpTTSSettings | None = None, **kwargs)

Bases: TTSService

Piper HTTP TTS service implementation.

Provides integration with Piper’s HTTP TTS server for text-to-speech synthesis. Supports streaming audio generation with configurable sample rates and automatic WAV header removal.
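To illustrate what WAV header removal involves, the sketch below uses the standard-library `wave` module to pull the raw PCM payload and format fields out of a complete WAV file held in memory. It assumes the whole response has been buffered. The service itself streams audio, and this is not claimed to be how it strips the header. `strip_wav_header` and `make_wav` are hypothetical helper names.

```python
import io
import wave


def strip_wav_header(wav_bytes: bytes) -> tuple[bytes, int, int]:
    """Return (raw PCM bytes, sample rate, channel count) from a complete WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        pcm = wav.readframes(wav.getnframes())
        return pcm, wav.getframerate(), wav.getnchannels()


def make_wav(pcm: bytes, sample_rate: int = 22050) -> bytes:
    """Build a mono 16-bit WAV in memory, used here only to exercise the function."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

The sample rate read from the header is what a caller would compare against the configured output rate to decide whether resampling is needed.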

Settings: alias of PiperHttpTTSSettings

__init__(*, base_url: str, aiohttp_session: ClientSession, voice_id: str | None = None, settings: PiperHttpTTSSettings | None = None, **kwargs)

Initialize the Piper HTTP TTS service.

Parameters:

base_url – Base URL for the Piper TTS HTTP server.
aiohttp_session – aiohttp ClientSession for making HTTP requests.
voice_id – Piper voice model identifier (e.g. en_US-ryan-high). Deprecated since version 0.0.105: use settings=PiperHttpTTSService.Settings(voice=...) instead.
settings – Runtime-updatable settings. When provided alongside deprecated parameters, settings values take precedence.
**kwargs – Additional arguments passed to the parent TTSService.
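A possible way to wire these parameters together is sketched below. The helper name `build_piper_http_tts` and the default URL `http://localhost:5000` are assumptions for illustration, so substitute the address of your own Piper HTTP server. The caller creates the aiohttp ClientSession and must keep it open for as long as the service is in use.

```python
def build_piper_http_tts(
    session,
    base_url: str = "http://localhost:5000",  # hypothetical server address
    voice: str = "en_US-ryan-high",
):
    """Create a PiperHttpTTSService bound to an existing aiohttp ClientSession."""
    # Imported lazily so this module still loads where pipecat is not installed.
    from pipecat.services.piper.tts import PiperHttpTTSService

    return PiperHttpTTSService(
        base_url=base_url,
        aiohttp_session=session,
        settings=PiperHttpTTSService.Settings(voice=voice),
    )
```

A typical call site would open the session with `async with aiohttp.ClientSession() as session:` and build the service inside that block, so the session outlives every request the service makes.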

can_generate_metrics() → bool

Check if this service can generate processing metrics.

Returns: True, as the Piper service supports metrics generation.

async run_tts(text: str, context_id: str) → AsyncGenerator[Frame, None]

Generate speech from text using Piper’s HTTP API.

Parameters:

text – The text to convert to speech.
context_id – Unique identifier for this TTS context.

Yields:

Frame – Audio frames containing the synthesized speech and status frames.