vision_service
Vision service implementation.
Provides base classes and implementations for computer vision services that can analyze images and generate textual descriptions or answers to questions about visual content.
- class pipecat.services.vision_service.VisionService(*, settings: VisionSettings | None = None, **kwargs)[source]
Bases: AIService
Base class for vision services.
Provides common functionality for vision services that process images and generate textual responses. Handles image frame processing and integrates with the AI service infrastructure for metrics and lifecycle management.
- __init__(*, settings: VisionSettings | None = None, **kwargs)[source]
Initialize the vision service.
- Parameters:
settings – The runtime-updatable settings for the vision service.
**kwargs – Additional arguments passed to the parent AIService.
- abstractmethod async run_vision(frame: UserImageRawFrame) → AsyncGenerator[Frame, None][source]
Process the given vision image and generate results.
This method must be implemented by subclasses to provide actual computer vision functionality such as image description, object detection, or visual question answering.
- Parameters:
frame – The image frame to process.
- Yields:
Frame – Frames containing the vision analysis results, typically TextFrame objects with descriptions or answers.
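The subclassing contract described above can be illustrated with a minimal, self-contained sketch. It does not import pipecat: Frame, TextFrame, UserImageRawFrame and VisionServiceSketch below are simplified stand-ins written for this example (their fields are assumptions), and ImageSizeDescriber and collect are hypothetical names. The only point is the shape of the method: an abstract async generator that a subclass overrides to yield result frames.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import AsyncGenerator, List, Tuple


# Stand-in frame types for illustration only (NOT the real pipecat classes).
@dataclass
class Frame:
    pass


@dataclass
class TextFrame(Frame):
    text: str = ""


@dataclass
class UserImageRawFrame(Frame):
    image: bytes = b""               # raw pixel data (assumed field)
    size: Tuple[int, int] = (0, 0)   # (width, height) (assumed field)


class VisionServiceSketch(ABC):
    """Stand-in for the abstract base; models only run_vision."""

    @abstractmethod
    async def run_vision(self, frame: UserImageRawFrame) -> AsyncGenerator[Frame, None]:
        # The bare yield makes this an async generator; subclasses override it.
        yield


class ImageSizeDescriber(VisionServiceSketch):
    """Hypothetical subclass: 'describes' an image by its dimensions."""

    async def run_vision(self, frame: UserImageRawFrame) -> AsyncGenerator[Frame, None]:
        width, height = frame.size
        # A real service would call a vision model here.
        yield TextFrame(text=f"{width}x{height} image, {len(frame.image)} bytes")


async def collect(service: VisionServiceSketch, frame: UserImageRawFrame) -> List[Frame]:
    """Drain the async generator into a list of result frames."""
    return [result async for result in service.run_vision(frame)]


if __name__ == "__main__":
    image_frame = UserImageRawFrame(image=b"\x00" * 12, size=(2, 2))
    for result in asyncio.run(collect(ImageSizeDescriber(), image_frame)):
        print(result)
```

Because run_vision is a generator rather than a coroutine returning one value, a subclass can yield several frames per image, for example one TextFrame per streamed chunk of a description.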
- async process_frame(frame: Frame, direction: FrameDirection)[source]
Process frames, handling vision image frames for analysis.
Automatically processes UserImageRawFrame objects by calling run_vision and handles metrics tracking. Other frames are passed through unchanged.
- Parameters:
frame – The frame to process.
direction – The direction of frame processing.
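The dispatch behaviour described for process_frame might look roughly like the following sketch. It is not the library's implementation: the frame classes, the FrameDirection enum, and DispatchSketch are stand-ins defined here, the pushed list replaces the real downstream pipeline, and the elapsed-time field is only a placeholder for the metrics tracking mentioned above. The sketch also assumes that the image frame itself is consumed rather than forwarded, which the text does not state.

```python
import asyncio
import time
from dataclasses import dataclass
from enum import Enum
from typing import AsyncGenerator, List, Optional, Tuple


# Stand-ins for illustration only (NOT the real pipecat classes).
class FrameDirection(Enum):
    DOWNSTREAM = 1
    UPSTREAM = 2


@dataclass
class Frame:
    pass


@dataclass
class TextFrame(Frame):
    text: str = ""


@dataclass
class UserImageRawFrame(Frame):
    image: bytes = b""
    size: Tuple[int, int] = (0, 0)


class DispatchSketch:
    """Hypothetical processor showing the image / non-image branching."""

    def __init__(self) -> None:
        self.pushed: List[Tuple[Frame, FrameDirection]] = []  # stands in for the pipeline
        self.vision_calls = 0
        self.last_processing_seconds: Optional[float] = None  # placeholder metric

    async def push_frame(self, frame: Frame, direction: FrameDirection) -> None:
        self.pushed.append((frame, direction))

    async def run_vision(self, frame: UserImageRawFrame) -> AsyncGenerator[Frame, None]:
        width, height = frame.size
        yield TextFrame(text=f"{width}x{height} image")

    async def process_frame(self, frame: Frame, direction: FrameDirection) -> None:
        if isinstance(frame, UserImageRawFrame):
            # Image frames: time the analysis and push every result it yields.
            started = time.perf_counter()
            self.vision_calls += 1
            async for result in self.run_vision(frame):
                await self.push_frame(result, direction)
            self.last_processing_seconds = time.perf_counter() - started
        else:
            # Any other frame passes through unchanged.
            await self.push_frame(frame, direction)


async def run_demo() -> DispatchSketch:
    processor = DispatchSketch()
    await processor.process_frame(TextFrame("hi"), FrameDirection.DOWNSTREAM)
    await processor.process_frame(UserImageRawFrame(size=(4, 3)), FrameDirection.DOWNSTREAM)
    return processor


if __name__ == "__main__":
    for frame, direction in asyncio.run(run_demo()).pushed:
        print(direction.name, frame)
```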