tts

Mistral text-to-speech service implementation.

This module provides integration with Mistral’s Voxtral TTS API for generating speech from text input using HTTP streaming with Server-Sent Events.
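As a rough illustration of the Server-Sent Events transport mentioned above, the sketch below extracts `data:` payloads from a stream of text lines following the standard SSE framing rules. The helper name `iter_sse_data` is hypothetical and is not part of Pipecat or the Mistral client; the actual service may parse the stream differently.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event (hypothetical helper)."""
    buffer: list[str] = []
    for raw_line in lines:
        line = raw_line.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event that is not terminated by a blank line is discarded.
```

Multiple `data:` lines within one event are joined with newlines, so a payload split across lines arrives as a single string.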

class pipecat.services.mistral.tts.MistralTTSSettings

Bases: TTSSettings

Settings for MistralTTSService.

Parameters:

model – TTS model identifier.
voice – Voice identifier.
language – Language for speech synthesis.

class pipecat.services.mistral.tts.MistralTTSService(*, api_key: str | None = None, sample_rate: int | None = None, settings: MistralTTSSettings | None = None, **kwargs)

Bases: TTSService

Mistral Text-to-Speech service using the Voxtral TTS API.

This service uses Mistral’s streaming TTS API to generate PCM-encoded audio at 24 kHz. The API streams base64-encoded float32 PCM chunks via Server-Sent Events; each chunk is decoded and converted to int16 for the Pipecat pipeline.
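The float32-to-int16 conversion described above can be sketched with the standard library alone. This is a minimal sketch, not the service's actual code: it assumes little-endian samples in the range [-1.0, 1.0] and a scale factor of 32767, and the function name `float32_b64_to_int16` is hypothetical.

```python
import base64
import struct


def float32_b64_to_int16(chunk_b64: str) -> bytes:
    """Decode a base64 float32 PCM chunk into int16 PCM bytes (hypothetical helper)."""
    raw = base64.b64decode(chunk_b64)
    count = len(raw) // 4  # four bytes per float32 sample
    # Assumption: samples are little-endian float32.
    samples = struct.unpack(f"<{count}f", raw[: count * 4])
    converted = []
    for sample in samples:
        # Clip to [-1.0, 1.0] so out-of-range input cannot overflow int16.
        clipped = max(-1.0, min(1.0, sample))
        converted.append(int(round(clipped * 32767.0)))
    return struct.pack(f"<{count}h", *converted)
```

Clipping before scaling keeps samples slightly outside the nominal range from wrapping around, which would otherwise be audible as clicks.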

Settings: alias of MistralTTSSettings

MISTRAL_SAMPLE_RATE = 24000

__init__(*, api_key: str | None = None, sample_rate: int | None = None, settings: MistralTTSSettings | None = None, **kwargs)

Initialize Mistral TTS service.

Parameters:

api_key – Mistral API key for authentication.
sample_rate – Output audio sample rate in Hz. If this differs from Mistral’s native 24 kHz (MISTRAL_SAMPLE_RATE), the audio is resampled to the requested rate.
settings – Runtime-updatable settings.
**kwargs – Additional keyword arguments passed to TTSService.
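To make the resampling note for sample_rate concrete, here is a deliberately simple linear-interpolation resampler over int16 sample values. It only illustrates the idea of converting from the native 24 kHz to another rate; the function `resample_linear` is hypothetical, and the resampler Pipecat actually uses is likely more sophisticated (for example, with proper low-pass filtering).

```python
def resample_linear(samples: list[int], src_rate: int, dst_rate: int) -> list[int]:
    """Resample int16 sample values by linear interpolation (illustrative only)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        # Weighted average of the two neighbouring source samples.
        out.append(int(round(samples[lo] * (1.0 - frac) + samples[hi] * frac)))
    return out
```

For example, downsampling from 24000 Hz to 16000 Hz advances 1.5 source samples per output sample, so every other output value falls halfway between two inputs.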

can_generate_metrics() → bool

Check if this service can generate processing metrics.

Returns: True, as the Mistral TTS service supports metrics generation.

async run_tts(text: str, context_id: str) → AsyncGenerator[Frame, None]

Generate speech from text using Mistral’s TTS API.

Parameters:

text – The text to synthesize into speech.
context_id – The context ID for tracking audio frames.

Yields:

Frame – Audio frames containing the synthesized speech data.