Asset Version: 1.0.0
Last Published: Nov 28, 2025
The Google Gemini Image Understanding Component connects your app to Google’s multimodal AI capabilities to analyze and interpret image-based content. It lets you upload images (JPEG, PNG, etc.) from either the device camera or the gallery, along with an optional text prompt. The AI model then generates a response based on the visual elements, objects, text, and contextual details present in the image.
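To make the image-plus-prompt flow concrete, here is a minimal sketch of how a request body for the public Gemini `generateContent` REST endpoint could be assembled from a base64-encoded image and an optional prompt. The helper name `buildImageRequest`, the type names, and the two-entry MIME type list are assumptions for illustration only; the component's actual internals are not documented here and may differ.

```typescript
// Hypothetical helper, not part of the component: names are illustrative.
type Part =
  | { text: string }
  | { inline_data: { mime_type: string; data: string } };

interface GenerateContentRequest {
  contents: { parts: Part[] }[];
}

// Assumption: only the two formats named in the description are accepted here.
const SUPPORTED_MIME_TYPES: string[] = ["image/jpeg", "image/png"];

function buildImageRequest(
  base64Image: string,
  mimeType: string,
  prompt?: string
): GenerateContentRequest {
  if (SUPPORTED_MIME_TYPES.indexOf(mimeType) === -1) {
    throw new Error(`Unsupported image type: ${mimeType}`);
  }

  // The image is sent inline as base64 data.
  const parts: Part[] = [
    { inline_data: { mime_type: mimeType, data: base64Image } },
  ];

  // The text prompt is optional; skip it when empty or whitespace-only.
  if (prompt !== undefined && prompt.trim().length > 0) {
    parts.push({ text: prompt.trim() });
  }

  return { contents: [{ parts }] };
}
```

The returned object would be serialized with `JSON.stringify` and posted to the model's `generateContent` endpoint, typically through a backend service so the API key stays off the device.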
Requirements
- HCL Volt MX Iris
- HCL Volt MX Foundry
Features
- Full-Context Visual Analysis: Understands entire images holistically, including layout, composition, and spatial relationships, to derive context-aware insights beyond basic object or text detection.
- Multi-Modal Intelligence: Processes both visual and textual elements such as charts, diagrams, graphs, infographics, and handwritten content, enabling deeper semantic interpretation.
- Structured Output Generation: Delivers results as JSON or plain text, suitable for integration into analytics, automation, or reporting systems.
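As an illustration of consuming the JSON or plain-text output, the sketch below reads the model's text from a response shaped like the public Gemini `generateContent` response (`candidates[0].content.parts[].text`) and tries to parse it as JSON, falling back to plain text. The function name `parseModelOutput` and the `ParsedOutput` type are hypothetical, and the component may expose its results through a different interface.

```typescript
// Hypothetical helper, not part of the component: names are illustrative.
interface GeminiResponse {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

type ParsedOutput =
  | { kind: "json"; value: unknown }
  | { kind: "text"; value: string };

function parseModelOutput(response: GeminiResponse): ParsedOutput {
  // Concatenate the text of every part in the first candidate.
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  const raw = parts
    .map((p) => p.text ?? "")
    .join("")
    .trim();

  try {
    const value: unknown = JSON.parse(raw);
    // Only objects and arrays count as structured output; bare scalars
    // such as "42" or "true" are treated as plain text.
    if (typeof value === "object" && value !== null) {
      return { kind: "json", value };
    }
  } catch {
    // Not valid JSON: fall through to plain text.
  }
  return { kind: "text", value: raw };
}
```

Downstream code can branch on `kind` to route structured results to analytics or automation and show plain-text results directly to the user.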