llm
Anthropic AI service integration for Pipecat.
This module provides LLM services and context management for Anthropic’s Claude models, including support for function calling, vision, and prompt caching features.
- class pipecat.services.anthropic.llm.AnthropicThinkingConfig(*, type: Literal['enabled', 'disabled'] | str, budget_tokens: int | None = None)[source]
Bases: BaseModel
Configuration for extended thinking.
- Parameters:
type – Type of thinking mode (currently only “enabled” or “disabled”).
budget_tokens – Maximum number of tokens for thinking. With today’s models, the minimum is 1024. Currently required when type is “enabled”, not allowed when “disabled”.
- type: Literal['enabled', 'disabled'] | str
- budget_tokens: int | None
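The constraints described for budget_tokens can be sketched as a small standalone check. This is only an illustration of the rules stated above, not the validation pipecat or the Anthropic API performs; `ThinkingConfigSketch` and `check_thinking_config` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_BUDGET_TOKENS = 1024  # minimum stated in the parameter description above


@dataclass
class ThinkingConfigSketch:
    """Hypothetical stand-in mirroring the two documented fields."""

    type: str
    budget_tokens: Optional[int] = None


def check_thinking_config(cfg: ThinkingConfigSketch) -> List[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    problems: List[str] = []
    if cfg.type == "enabled":
        if cfg.budget_tokens is None:
            problems.append("budget_tokens is required when type is 'enabled'")
        elif cfg.budget_tokens < MIN_BUDGET_TOKENS:
            problems.append(f"budget_tokens must be at least {MIN_BUDGET_TOKENS}")
    elif cfg.type == "disabled":
        if cfg.budget_tokens is not None:
            problems.append("budget_tokens is not allowed when type is 'disabled'")
    # The signature also accepts other strings for type, so they are not checked here.
    return problems
```

Running a check like this before building the service can surface a misconfigured budget early, although the API remains the authority on what it accepts.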
- class pipecat.services.anthropic.llm.AnthropicLLMSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any]=<factory>, system_instruction: str | None | _NotGiven = <factory>, temperature: float | None | _NotGiven | NotGiven = <factory>, max_tokens: int | None | _NotGiven = <factory>, top_p: float | None | _NotGiven | NotGiven = <factory>, top_k: int | None | _NotGiven | NotGiven = <factory>, frequency_penalty: float | None | _NotGiven = <factory>, presence_penalty: float | None | _NotGiven = <factory>, seed: int | None | _NotGiven = <factory>, filter_incomplete_user_turns: bool | None | _NotGiven = <factory>, user_turn_completion_config: UserTurnCompletionConfig | None | _NotGiven = <factory>, enable_prompt_caching: bool | _NotGiven = <factory>, thinking: AnthropicLLMService.ThinkingConfig | _NotGiven | NotGiven = <factory>)[source]
Bases: LLMSettings
Settings for AnthropicLLMService.
- Parameters:
enable_prompt_caching – Whether to enable prompt caching.
thinking – Extended thinking configuration.
- enable_prompt_caching: bool | _NotGiven
- temperature: float | None | _NotGiven | NotGiven
- top_k: int | None | _NotGiven | NotGiven
- top_p: float | None | _NotGiven | NotGiven
- thinking: AnthropicLLMService.ThinkingConfig | _NotGiven | NotGiven
- classmethod from_mapping(settings)[source]
Convert a plain dict to settings, coercing thinking dicts.
For backward compatibility, a thinking value that is a plain dict is converted to an AnthropicLLMService.ThinkingConfig.
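The coercion described for from_mapping might look roughly like the following. This is a minimal sketch using stand-in dataclasses rather than the pipecat classes; `SettingsSketch` and `ThinkingConfigSketch` are hypothetical, and how the real method treats unknown keys or unset fields is not covered here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping, Optional


@dataclass
class ThinkingConfigSketch:
    """Hypothetical stand-in for the thinking configuration."""

    type: str
    budget_tokens: Optional[int] = None


@dataclass
class SettingsSketch:
    """Hypothetical stand-in holding a few of the documented settings fields."""

    enable_prompt_caching: Optional[bool] = None
    thinking: Optional[ThinkingConfigSketch] = None
    extra: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_mapping(cls, settings: Mapping[str, Any]) -> "SettingsSketch":
        values = dict(settings)  # copy so the caller's mapping is not mutated
        thinking = values.get("thinking")
        if isinstance(thinking, dict):
            # A plain dict is coerced into the typed config object.
            values["thinking"] = ThinkingConfigSketch(**thinking)
        return cls(**values)
```

A value that is already a config object passes through unchanged, so older dict-based call sites and newer typed ones can share one code path.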
- class pipecat.services.anthropic.llm.AnthropicLLMService(*, api_key: str, model: str | None = None, params: InputParams | None = None, settings: AnthropicLLMSettings | None = None, client=None, retry_timeout_secs: float | None = 5.0, retry_on_timeout: bool | None = False, **kwargs)[source]
Bases: LLMService
LLM service for Anthropic’s Claude models.
Provides inference capabilities with Claude models including support for function calling, vision processing, streaming responses, and prompt caching. Can use custom clients like AsyncAnthropicBedrock and AsyncAnthropicVertex.
- Settings
alias of AnthropicLLMSettings
- adapter_class
alias of AnthropicLLMAdapter
- ThinkingConfig
alias of AnthropicThinkingConfig
- class InputParams(**data: Any)[source]
Bases: BaseModel
Input parameters for Anthropic model inference.
Deprecated since version 0.0.105: Use AnthropicLLMService.Settings instead. Pass settings directly via the settings parameter of AnthropicLLMService.
- Parameters:
enable_prompt_caching – Whether to enable the prompt caching feature.
max_tokens – Maximum tokens to generate. Must be at least 1.
temperature – Sampling temperature between 0.0 and 1.0.
top_k – Top-k sampling parameter.
top_p – Top-p sampling parameter between 0.0 and 1.0.
thinking – Extended thinking configuration. Enabling extended thinking causes the model to spend more time “thinking” before responding. It also causes this service to emit LLMThinking*Frames during response generation. Extended thinking is disabled by default.
extra – Additional parameters to pass to the API.
- enable_prompt_caching: bool | None
- max_tokens: int | None
- temperature: float | None
- top_k: int | None
- top_p: float | None
- thinking: AnthropicLLMService.ThinkingConfig | None
- extra: dict[str, Any] | None
- __init__(*, api_key: str, model: str | None = None, params: InputParams | None = None, settings: AnthropicLLMSettings | None = None, client=None, retry_timeout_secs: float | None = 5.0, retry_on_timeout: bool | None = False, **kwargs)[source]
Initialize the Anthropic LLM service.
- Parameters:
api_key – Anthropic API key for authentication.
model – Model name to use. Deprecated since version 0.0.105: Use settings=AnthropicLLMService.Settings(model=...) instead.
params – Optional model parameters for inference. Deprecated since version 0.0.105: Use settings=AnthropicLLMService.Settings(...) instead.
settings – Runtime-updatable settings for this service. When both deprecated parameters and settings are provided, settings values take precedence.
client – Optional custom Anthropic client instance.
retry_timeout_secs – Request timeout in seconds for retry logic.
retry_on_timeout – Whether to retry the request once if it times out.
**kwargs – Additional arguments passed to parent LLMService.
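The precedence rule for the deprecated model and params arguments versus settings can be illustrated with plain dictionaries. This is a sketch under assumptions, not the service's merge logic: `NOT_GIVEN` and `resolve_settings` are hypothetical, and the sketch assumes that a field left unset in settings falls back to the deprecated value.

```python
from typing import Any, Dict

NOT_GIVEN = object()  # hypothetical sentinel meaning "caller did not set this field"


def resolve_settings(
    deprecated: Dict[str, Any],
    settings: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge two sources of configuration, letting explicit settings values win."""
    merged = dict(deprecated)  # start from the values given via deprecated arguments
    for key, value in settings.items():
        if value is not NOT_GIVEN:
            merged[key] = value  # an explicitly set settings value takes precedence
    return merged
```

For example, merging `{"model": "model-a", "max_tokens": 1024}` with settings that set only `model` keeps the deprecated `max_tokens` and replaces the model. The model names here are placeholders.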
- can_generate_metrics() → bool[source]
Check if this service can generate usage metrics.
- Returns:
True, as Anthropic provides detailed token usage metrics.
- async run_inference(context: LLMContext, max_tokens: int | None = None, system_instruction: str | None = None) → str | None[source]
Run a one-shot, out-of-band (i.e. out-of-pipeline) inference with the given LLM context.
- Parameters:
context – The LLM context containing conversation history.
max_tokens – Optional maximum number of tokens to generate. If provided, overrides the service’s default max_tokens setting.
system_instruction – Optional system instruction to use for this inference. If provided, overrides any system instruction in the context.
- Returns:
The LLM’s response as a string, or None if no response is generated.
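The override behavior of the two optional arguments can be summarized in a few lines. This is only a sketch of the rule as documented: `effective_request_options` is a hypothetical helper, and the real method resolves these values internally.

```python
from typing import Optional, Tuple


def effective_request_options(
    default_max_tokens: int,
    context_system: Optional[str],
    max_tokens: Optional[int] = None,
    system_instruction: Optional[str] = None,
) -> Tuple[int, Optional[str]]:
    """Pick the values a one-shot inference would use under the documented rules."""
    # A per-call max_tokens overrides the service default.
    chosen_max = max_tokens if max_tokens is not None else default_max_tokens
    # A per-call system instruction overrides the one carried by the context.
    chosen_system = (
        system_instruction if system_instruction is not None else context_system
    )
    return chosen_max, chosen_system
```

Because the method may return None, callers should check the result before using it as a string.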
- async process_frame(frame: Frame, direction: FrameDirection)[source]
Process incoming frames and route them appropriately.
Handles various frame types including context frames, message frames, vision frames, and settings updates.
- Parameters:
frame – The frame to process.
direction – The direction of frame processing.