stt

Deepgram speech-to-text service for AWS SageMaker.

This module provides a Pipecat STT service that connects to Deepgram models deployed on AWS SageMaker endpoints. It uses HTTP/2 bidirectional streaming for low-latency real-time transcription and supports interim results, multiple languages, and other Deepgram features.

Bases: DeepgramSTTSettings

Settings for the Deepgram SageMaker STT service.

Inherits all fields from DeepgramSTTService.Settings.

class pipecat.services.deepgram.sagemaker.stt.DeepgramSageMakerSTTService(*, endpoint_name: str, region: str, encoding: str = 'linear16', channels: int = 1, multichannel: bool = False, sample_rate: int | None = None, mip_opt_out: bool | None = None, live_options: LiveOptions | None = None, settings: DeepgramSageMakerSTTSettings | None = None, ttfs_p99_latency: float | None = 0.35, **kwargs)

Bases: STTService

Deepgram speech-to-text service for AWS SageMaker.

Provides real-time speech recognition using Deepgram models deployed on AWS SageMaker endpoints. It uses HTTP/2 bidirectional streaming for low-latency transcription and supports interim results, speaker diarization, and multiple languages.

Requirements:

AWS credentials configured (via environment variables, the AWS CLI, or instance metadata)
A deployed SageMaker endpoint running a Deepgram model: https://developers.deepgram.com/docs/deploy-amazon-sagemaker

Example:

stt = DeepgramSageMakerSTTService(
    endpoint_name="my-deepgram-endpoint",
    region="us-east-2",
    settings=DeepgramSageMakerSTTService.Settings(
        model="nova-3",
        language="en",
        interim_results=True,
        punctuate=True,
    ),
)

Settings: alias of DeepgramSageMakerSTTSettings

__init__(*, endpoint_name: str, region: str, encoding: str = 'linear16', channels: int = 1, multichannel: bool = False, sample_rate: int | None = None, mip_opt_out: bool | None = None, live_options: LiveOptions | None = None, settings: DeepgramSageMakerSTTSettings | None = None, ttfs_p99_latency: float | None = 0.35, **kwargs)

Initialize the Deepgram SageMaker STT service.

Parameters:

endpoint_name – Name of the SageMaker endpoint with Deepgram model deployed (e.g., “my-deepgram-nova-3-endpoint”).
region – AWS region where the endpoint is deployed (e.g., “us-east-2”).
encoding – Audio encoding format. Defaults to “linear16”.
channels – Number of audio channels. Defaults to 1.
multichannel – Transcribe each audio channel independently. Defaults to False.
sample_rate – Audio sample rate in Hz. If None, uses the pipeline sample rate.
mip_opt_out – Opt out of the Deepgram model improvement program.
live_options – Legacy configuration options. Deprecated since version 0.0.105: use settings=DeepgramSageMakerSTTService.Settings(...) for runtime-updatable fields and direct init parameters for connection-level config.
settings – Runtime-updatable settings. When provided alongside live_options, settings values take precedence (applied after the live_options merge).
ttfs_p99_latency – P99 latency, in seconds, from the end of speech to the final transcript. Defaults to 0.35; override it with a value measured for your deployment. See https://github.com/pipecat-ai/stt-benchmark
**kwargs – Additional arguments passed to the parent STTService.
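The precedence described for live_options and settings can be pictured as two overlays applied in order. The following is a minimal sketch using only the standard library, not the Pipecat implementation: SketchSettings, overlay, resolve, and the default values are hypothetical stand-ins, and it assumes that fields left as None do not override earlier values.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class SketchSettings:
    """Hypothetical stand-in for a few runtime-updatable fields."""

    model: Optional[str] = None
    language: Optional[str] = None
    interim_results: Optional[bool] = None


def overlay(base: SketchSettings, override: Optional[SketchSettings]) -> SketchSettings:
    """Return base with every non-None field of override applied on top."""
    if override is None:
        return base
    changes = {
        f.name: getattr(override, f.name)
        for f in fields(override)
        if getattr(override, f.name) is not None
    }
    return replace(base, **changes)


def resolve(
    defaults: SketchSettings,
    live_options: Optional[SketchSettings] = None,
    settings: Optional[SketchSettings] = None,
) -> SketchSettings:
    """Apply the legacy options first, then settings, so settings win."""
    merged = overlay(defaults, live_options)
    return overlay(merged, settings)


# Assumed defaults for illustration only; the real defaults may differ.
defaults = SketchSettings(model="nova-3", language="en", interim_results=True)
legacy = SketchSettings(model="nova-2", language="es")  # plays the role of live_options
explicit = SketchSettings(language="fr")  # plays the role of settings

resolved = resolve(defaults, live_options=legacy, settings=explicit)
print(resolved)
```

In this sketch the language comes from settings, the model comes from the legacy options because settings leaves it unset, and interim_results falls back to the assumed default.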

can_generate_metrics() → bool

Check if this service can generate processing metrics.

Returns: True, as the Deepgram SageMaker service supports metrics generation.

async start(frame: StartFrame)

Start the Deepgram SageMaker STT service.

Parameters: frame – The start frame containing initialization parameters.

async stop(frame: EndFrame)

Stop the Deepgram SageMaker STT service.

Parameters: frame – The end frame.

async cancel(frame: CancelFrame)

Cancel the Deepgram SageMaker STT service.

Parameters: frame – The cancel frame.

async run_stt(audio: bytes) → AsyncGenerator[Frame | None, None]

Send audio data to Deepgram for transcription.

Parameters: audio – Raw audio bytes to transcribe.
Yields: None. Transcription results are delivered through the bidirectional (BiDi) stream callbacks, not through this generator.
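The split between sending audio and receiving transcripts can be sketched with plain asyncio. This is an illustrative model only, not the service's code: SketchStream, its send and receive_loop methods, the empty-bytes end marker, and the free function run_stt are hypothetical stand-ins for the real stream client and method.

```python
import asyncio
from typing import AsyncGenerator, Callable, List, Optional, Tuple


class SketchStream:
    """Hypothetical stand-in for a bidirectional stream client."""

    def __init__(self, on_transcript: Callable[[str], None]) -> None:
        self._on_transcript = on_transcript
        self._chunks: asyncio.Queue = asyncio.Queue()

    async def send(self, audio: bytes) -> None:
        """Queue one audio chunk for the (simulated) remote model."""
        await self._chunks.put(audio)

    async def receive_loop(self) -> None:
        """Deliver one fake transcript per chunk through the callback."""
        while True:
            chunk = await self._chunks.get()
            if chunk == b"":  # empty chunk is this sketch's end-of-stream marker
                break
            self._on_transcript(f"transcript for {len(chunk)} bytes")


async def run_stt(stream: SketchStream, audio: bytes) -> AsyncGenerator[Optional[str], None]:
    """Send audio and yield None; results arrive via the callback instead."""
    await stream.send(audio)
    yield None


async def main() -> Tuple[List[Optional[str]], List[str]]:
    transcripts: List[str] = []
    stream = SketchStream(transcripts.append)
    receiver = asyncio.create_task(stream.receive_loop())

    yielded: List[Optional[str]] = []
    for chunk in (b"\x00" * 320, b"\x00" * 640):
        async for item in run_stt(stream, chunk):
            yielded.append(item)

    await stream.send(b"")  # signal end of stream to the receive loop
    await receiver
    return yielded, transcripts


yielded, transcripts = asyncio.run(main())
print(yielded, transcripts)
```

The generator itself only ever produces None, while the transcripts accumulate through the callback that the receive loop invokes.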

async process_frame(frame: Frame, direction: FrameDirection)

Process frames with Deepgram SageMaker-specific handling.

Parameters:

frame – The frame to process.
direction – The direction of frame processing.