stt
Deepgram Flux speech-to-text service for AWS SageMaker (HTTP/2 BiDi transport).
- class pipecat.services.deepgram.flux.sagemaker.stt.DeepgramFluxSageMakerSTTSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any]=<factory>, language: Language | str | None | _NotGiven = <factory>, eager_eot_threshold: float | None | _NotGiven = <factory>, eot_threshold: float | None | _NotGiven = <factory>, eot_timeout_ms: int | None | _NotGiven = <factory>, keyterm: list | _NotGiven = <factory>, min_confidence: float | None | _NotGiven = <factory>, language_hints: list[Language] | None | _NotGiven = <factory>)[source]
Bases: DeepgramFluxSTTSettings
Settings for the Deepgram Flux SageMaker STT service.
Inherits all fields from DeepgramFluxSTTSettings.
- class pipecat.services.deepgram.flux.sagemaker.stt.DeepgramFluxSageMakerSTTService(*, endpoint_name: str, region: str, encoding: str = 'linear16', sample_rate: int | None = None, mip_opt_out: bool | None = None, tag: list | None = None, should_interrupt: bool = True, settings: DeepgramFluxSageMakerSTTSettings | None = None, **kwargs)[source]
Bases: DeepgramFluxSTTBase
Deepgram Flux speech-to-text service for AWS SageMaker.
Provides real-time speech recognition using Deepgram Flux models deployed on AWS SageMaker endpoints. Uses HTTP/2 bidirectional streaming for low-latency transcription with advanced turn detection (StartOfTurn, EndOfTurn, EagerEndOfTurn, TurnResumed).
Unlike the Nova-based SageMaker STT service, Flux handles turn detection natively, so no external VAD is needed for turn boundaries. Use ExternalUserTurnStrategies in your pipeline.
Requirements:
AWS credentials configured (via environment variables, AWS CLI, or instance metadata)
A deployed SageMaker endpoint running a Deepgram Flux model
Event handlers available:
on_connected: Called when the SageMaker session is established
on_disconnected: Called when the session is closed
on_connection_error: Called on connection failure
on_start_of_turn: Deepgram Flux detected start of speech
on_end_of_turn: Deepgram Flux detected end of turn
on_eager_end_of_turn: Deepgram Flux predicted end of turn
on_turn_resumed: User resumed speaking after EagerEndOfTurn
on_update: Interim transcript update during a turn
Example:
    stt = DeepgramFluxSageMakerSTTService(
        endpoint_name="my-deepgram-flux-endpoint",
        region="us-east-2",
        settings=DeepgramFluxSageMakerSTTService.Settings(
            model="flux-general-en",
            eot_threshold=0.7,
            eager_eot_threshold=0.5,
        ),
    )
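The turn-related event handlers listed above can be attached after the service is constructed. The following is a minimal sketch, assuming the service exposes Pipecat's usual `event_handler(name)` decorator. `TURN_EVENTS` and `register_turn_logging` are hypothetical names introduced here for illustration. The arguments each event passes after the service instance are not listed in this reference, so the handlers accept `*args` and `**kwargs` and only log them.

```python
import logging
from typing import Any

logger = logging.getLogger("flux_turns")

# Event names taken from the "Event handlers available" list above.
TURN_EVENTS = (
    "on_start_of_turn",
    "on_end_of_turn",
    "on_eager_end_of_turn",
    "on_turn_resumed",
)


def _make_logging_handler(event_name: str):
    """Build an async handler that logs whatever the event passes along."""

    async def handler(service: Any, *args: Any, **kwargs: Any) -> None:
        # The payload shape is an unknown here, so it is logged verbatim.
        logger.info("%s: args=%r kwargs=%r", event_name, args, kwargs)

    return handler


def register_turn_logging(stt: Any) -> list[str]:
    """Attach a logging handler for each turn event (hypothetical helper).

    Assumes `stt.event_handler(name)` returns a decorator, as is usual
    for Pipecat services.
    """
    registered = []
    for name in TURN_EVENTS:
        stt.event_handler(name)(_make_logging_handler(name))
        registered.append(name)
    return registered
```

Calling `register_turn_logging(stt)` once, before the pipeline starts, is enough. Swap the logging call for application logic once the payload of each event is confirmed against the library source.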
- Settings
alias of DeepgramFluxSageMakerSTTSettings
- __init__(*, endpoint_name: str, region: str, encoding: str = 'linear16', sample_rate: int | None = None, mip_opt_out: bool | None = None, tag: list | None = None, should_interrupt: bool = True, settings: DeepgramFluxSageMakerSTTSettings | None = None, **kwargs)[source]
Initialize the Deepgram Flux SageMaker STT service.
- Parameters:
endpoint_name – Name of the SageMaker endpoint on which a Deepgram Flux model is deployed (e.g., “my-deepgram-flux-endpoint”).
region – AWS region where the endpoint is deployed (e.g., “us-east-2”).
encoding – Audio encoding format. Defaults to “linear16”.
sample_rate – Audio sample rate in Hz. If None, uses the pipeline sample rate.
mip_opt_out – Opt out of the Deepgram Model Improvement Program.
tag – Tags to label requests for identification during usage reporting.
should_interrupt – Whether to interrupt the bot when Flux detects that the user is speaking. Defaults to True.
settings – Runtime-updatable settings.
**kwargs – Additional arguments passed to the parent STTService.
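The constructor parameters above are keyword-only, so it can be convenient to assemble them in one place. The following is a rough sketch, assuming deployment values come from environment variables. `flux_sagemaker_kwargs_from_env`, `FLUX_SAGEMAKER_ENDPOINT`, and `FLUX_SAMPLE_RATE` are hypothetical names chosen for this example and are not part of Pipecat. `sample_rate` is left out when unset so that the pipeline sample rate applies, as described above.

```python
from __future__ import annotations

import os
from typing import Any, Mapping


def flux_sagemaker_kwargs_from_env(env: Mapping[str, str] | None = None) -> dict[str, Any]:
    """Collect constructor keyword arguments from environment variables.

    The variable names FLUX_SAGEMAKER_ENDPOINT and FLUX_SAMPLE_RATE are
    assumptions made for this sketch; use whatever your deployment defines.
    """
    env = os.environ if env is None else env

    endpoint = env.get("FLUX_SAGEMAKER_ENDPOINT")
    if not endpoint:
        raise ValueError("FLUX_SAGEMAKER_ENDPOINT must name a deployed endpoint")

    kwargs: dict[str, Any] = {
        "endpoint_name": endpoint,
        # Falls back to the region used in the example above.
        "region": env.get("AWS_REGION", "us-east-2"),
        "encoding": "linear16",  # documented default
        "should_interrupt": True,  # documented default
    }

    # Only set sample_rate when given; None means "use the pipeline rate".
    if env.get("FLUX_SAMPLE_RATE"):
        kwargs["sample_rate"] = int(env["FLUX_SAMPLE_RATE"])

    return kwargs


# Usage (requires pipecat and AWS credentials, so it is left commented out):
# from pipecat.services.deepgram.flux.sagemaker.stt import (
#     DeepgramFluxSageMakerSTTService,
# )
# stt = DeepgramFluxSageMakerSTTService(**flux_sagemaker_kwargs_from_env())
```

Keeping the lookup in a plain function makes it possible to check the configuration without contacting AWS or importing the service.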