tts
xAI text-to-speech service implementation.
Provides two TTS services against xAI’s voice API:
- XAIHttpTTSService uses the batch HTTP endpoint at https://api.x.ai/v1/tts.
- XAITTSService uses the streaming WebSocket endpoint at wss://api.x.ai/v1/tts.
See https://docs.x.ai/developers/rest-api-reference/inference/voice.
- pipecat.services.xai.tts.language_to_xai_language(language: Language) → str | None[source]
Convert a Language enum to xAI language code.
- Parameters:
language – The Language enum value to convert.
- Returns:
The corresponding xAI language code, or None if not supported.
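The contract described here (a language code for supported languages, None otherwise) can be illustrated with a minimal sketch. The Language enum and the mapping table below are stand-ins invented for illustration; they are not Pipecat's actual Language enum or xAI's real list of supported languages, and to_service_language is a hypothetical name.

```python
from enum import Enum
from typing import Optional


class Language(Enum):
    """Stand-in for Pipecat's Language enum (hypothetical subset)."""

    EN = "en"
    EN_US = "en-US"
    FR = "fr"
    XX = "xx"  # placeholder for a language the service does not support


# Hypothetical mapping; the real supported set is defined by xAI.
_LANGUAGE_MAP = {
    Language.EN: "en",
    Language.EN_US: "en",
    Language.FR: "fr",
}


def to_service_language(language: Language) -> Optional[str]:
    """Return the service language code, or None if unsupported."""
    return _LANGUAGE_MAP.get(language)
```

Returning None rather than raising lets a caller decide whether to fall back to a default language or surface an error.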
- class pipecat.services.xai.tts.XAITTSSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any]=<factory>, voice: str | None | _NotGiven = <factory>, language: Language | str | None | _NotGiven = <factory>)[source]
Bases:
TTSSettings
Settings for XAIHttpTTSService.
- class pipecat.services.xai.tts.XAIHttpTTSService(*, api_key: str, base_url: str = 'https://api.x.ai/v1/tts', sample_rate: int | None = None, encoding: str | None = 'pcm', aiohttp_session: ClientSession | None = None, settings: XAITTSSettings | None = None, **kwargs)[source]
Bases:
TTSService
xAI HTTP text-to-speech service.
The service requests raw PCM audio so emitted TTSAudioRawFrame objects match Pipecat’s downstream expectations without extra decoding.
- Settings
alias of
XAITTSSettings
- __init__(*, api_key: str, base_url: str = 'https://api.x.ai/v1/tts', sample_rate: int | None = None, encoding: str | None = 'pcm', aiohttp_session: ClientSession | None = None, settings: XAITTSSettings | None = None, **kwargs)[source]
Initialize the xAI TTS service.
- Parameters:
api_key – xAI API key for authentication.
base_url – xAI TTS endpoint. Defaults to https://api.x.ai/v1/tts.
sample_rate – Audio sample rate. If None, uses the default.
encoding – Output encoding format. Defaults to “pcm”.
aiohttp_session – Optional shared aiohttp session.
settings – Runtime-updatable settings.
**kwargs – Additional keyword arguments passed to TTSService.
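Because the default encoding is raw PCM, the returned bytes can be reasoned about without a decoder. As a rough sketch, the helper below estimates the duration of a PCM buffer. It assumes 16-bit mono samples, which this reference does not state, and pcm_duration_seconds is a hypothetical name rather than part of Pipecat.

```python
def pcm_duration_seconds(
    audio: bytes,
    sample_rate: int,
    num_channels: int = 1,
    sample_width: int = 2,
) -> float:
    """Estimate the playback duration of a raw PCM buffer.

    Assumes fixed-width integer samples; the default of 2 bytes per
    sample (16-bit) and 1 channel is an assumption for illustration.
    """
    bytes_per_second = sample_rate * num_channels * sample_width
    if bytes_per_second <= 0:
        raise ValueError("sample_rate, num_channels and sample_width must be positive")
    return len(audio) / bytes_per_second
```

For example, under these assumptions 48,000 bytes at a 24,000 Hz sample rate correspond to one second of audio.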
- class pipecat.services.xai.tts.XAIWebsocketTTSSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any]=<factory>, voice: str | None | _NotGiven = <factory>, language: Language | str | None | _NotGiven = <factory>)[source]
Bases:
TTSSettings
Settings for XAITTSService (WebSocket streaming).
- class pipecat.services.xai.tts.XAITTSService(*, api_key: str, base_url: str = 'wss://api.x.ai/v1/tts', sample_rate: int | None = None, codec: str = 'pcm', settings: XAIWebsocketTTSSettings | None = None, **kwargs)[source]
Bases:
InterruptibleTTSService
xAI streaming text-to-speech service.
Connects to xAI’s WebSocket TTS endpoint and streams audio chunks back as they are synthesized. Text can be sent incrementally via text.delta messages, and each utterance is terminated with text.done. The server responds with audio.delta chunks followed by an audio.done message.
Audio parameters (voice, language, codec, sample rate, bit rate) are passed as query string parameters on the WebSocket URL; changing any of them at runtime reconnects the WebSocket.
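The response side of this exchange can be sketched as a small accumulator that groups audio.delta chunks into utterances and closes each one at audio.done. The wire format is not described in this reference, so the sketch takes already-parsed (event_type, payload) tuples; that tuple shape and the name collect_utterances are assumptions for illustration, not the service's internals.

```python
from typing import Iterable, List, Tuple

# Hypothetical parsed event: (event type, decoded audio bytes).
Event = Tuple[str, bytes]


def collect_utterances(events: Iterable[Event]) -> List[bytes]:
    """Group audio.delta chunks into utterances ended by audio.done."""
    utterances: List[bytes] = []
    current = bytearray()
    for event_type, payload in events:
        if event_type == "audio.delta":
            # Audio arrives incrementally; append each chunk in order.
            current.extend(payload)
        elif event_type == "audio.done":
            # The server signals that the utterance is complete.
            utterances.append(bytes(current))
            current.clear()
        # Any other event type is ignored in this sketch.
    return utterances
```

A streaming service would normally forward each chunk as soon as it arrives rather than buffering; the buffering here only makes the message ordering easy to see.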
- Settings
alias of
XAIWebsocketTTSSettings
- __init__(*, api_key: str, base_url: str = 'wss://api.x.ai/v1/tts', sample_rate: int | None = None, codec: str = 'pcm', settings: XAIWebsocketTTSSettings | None = None, **kwargs)[source]
Initialize the xAI WebSocket TTS service.
- Parameters:
api_key – xAI API key for authentication.
base_url – xAI TTS WebSocket endpoint. Defaults to wss://api.x.ai/v1/tts.
sample_rate – Output audio sample rate in Hz. If None, uses the pipeline default.
codec – Output audio codec. One of pcm, wav, mulaw, alaw. Defaults to pcm so emitted TTSAudioRawFrame objects need no decoding downstream.
settings – Runtime-updatable settings.
**kwargs – Additional arguments passed to parent InterruptibleTTSService.
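Since the audio parameters travel in the WebSocket URL's query string, a change to any of them yields a different URL, which is why a reconnect is needed. The sketch below shows one way that could look using only the standard library. The query parameter names (voice, language, codec, sample_rate), the helper names, and the voice values are assumptions for illustration; the actual names are defined by xAI's API.

```python
from typing import Optional
from urllib.parse import urlencode


def build_tts_url(
    base_url: str,
    *,
    voice: Optional[str] = None,
    language: Optional[str] = None,
    codec: str = "pcm",
    sample_rate: Optional[int] = None,
) -> str:
    """Append the audio parameters that are set as a query string."""
    # Parameter names are hypothetical, chosen to mirror the prose above.
    params = {
        "voice": voice,
        "language": language,
        "codec": codec,
        "sample_rate": sample_rate,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}?{query}" if query else base_url


def needs_reconnect(current_url: str, new_url: str) -> bool:
    """A changed URL means the existing connection no longer matches."""
    return current_url != new_url
```

Comparing whole URLs keeps the reconnect check independent of which individual parameter changed.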
- language_to_service_language(language: Language) → str | None[source]
Convert a Language enum to xAI language format.
- async start(frame: StartFrame)[source]
Start the xAI WebSocket TTS service.
- async cancel(frame: CancelFrame)[source]
Cancel the xAI WebSocket TTS service.