Google DeepMind introduced the Gemini 2.5 Computer Use model, a specialized AI model designed to control software user interfaces. It lets agents interact with web and mobile UIs by clicking, typing, scrolling, filling in forms, and navigating pages, much as a human user would. The model works in a loop: it receives a screenshot of the current UI along with the task context and returns a proposed action; the client application executes that action and sends back a fresh screenshot of the resulting UI state; the cycle then repeats until the task is finished. The model is optimized for browser control, with an emphasis on low latency. It also includes built-in safety controls, so risky or high-stakes actions either require user confirmation or are blocked. The model is now in public preview via the Gemini API, available in Google AI Studio and Vertex AI.
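
The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Gemini API: `Action`, `FakeBrowser`, `stub_model`, and `run_agent` are hypothetical names invented for this example, the "screenshot" is a text summary rather than an image, and the stub policy stands in for the real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One UI action proposed by the model (hypothetical shape)."""
    kind: str                         # "type", "click", or "done"
    target: str = ""
    text: str = ""
    needs_confirmation: bool = False  # set for risky or high-stakes actions


@dataclass
class FakeBrowser:
    """Stand-in for a real browser; it only records what was done to it."""
    fields: dict = field(default_factory=dict)
    submitted: bool = False

    def screenshot(self) -> str:
        # A real client would capture image bytes; a text summary suffices here.
        return f"fields={sorted(self.fields.items())} submitted={self.submitted}"

    def execute(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def stub_model(goal: str, screenshot: str) -> Action:
    """Hypothetical policy standing in for the model: fill the form, then submit."""
    if "email" not in screenshot:
        return Action("type", target="email", text="a@example.com")
    if "submitted=False" in screenshot:
        # Submitting is treated as high-stakes, so it is flagged for confirmation.
        return Action("click", target="submit", needs_confirmation=True)
    return Action("done")


def run_agent(
    goal: str,
    browser: FakeBrowser,
    model: Callable[[str, str], Action],
    confirm: Callable[[Action], bool],
    max_steps: int = 10,
) -> List[str]:
    """Observe, ask the model for an action, execute it client-side, repeat."""
    history: List[str] = []
    for _ in range(max_steps):
        action = model(goal, browser.screenshot())  # observe and propose
        if action.kind == "done":
            break
        if action.needs_confirmation and not confirm(action):
            history.append(f"blocked:{action.kind}:{action.target}")
            break                                   # user declined: stop here
        browser.execute(action)                     # the client acts, not the model
        history.append(f"{action.kind}:{action.target}")
    return history


if __name__ == "__main__":
    browser = FakeBrowser()
    print(run_agent("sign up", browser, stub_model, confirm=lambda a: True))
    print(browser.screenshot())
```

In this sketch the confirmation check sits in the client loop, so a declined action never reaches the browser. A real integration would replace `stub_model` with a call to the model and `FakeBrowser` with a browser automation layer.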

