Google is introducing Gemini 3.1 Flash-Lite as the fastest and cheapest model in the Gemini 3 line, aimed at developers who need to serve high request volumes at low per-request cost. It is rolling out in preview through the Gemini API in Google AI Studio and through Vertex AI for enterprise users. Pricing is $0.25 per million input tokens and $1.50 per million output tokens, and Google says the model delivers much lower latency and faster output than Gemini 2.5 Flash. Google positions it as a practical option for high-volume jobs such as translation, content moderation, and image sorting, and for other real-time tasks where speed and cost matter most. Developers can also adjust how much the model "thinks" per task, so the same model can take on heavier work when needed, such as building interfaces, dashboards, simulations, and multi-step business tasks.
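To make the quoted rates concrete, here is a small sketch of what they imply at volume. The per-million-token prices come from the announcement; the example token counts and the `request_cost` helper are illustrative assumptions, not part of any official SDK.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float = 0.25, output_rate: float = 1.50) -> float:
    """Estimated USD cost of one request at the quoted rates:
    $0.25 per million input tokens, $1.50 per million output tokens."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Hypothetical moderation call: 2,000-token prompt, 100-token verdict.
single = request_cost(2_000, 100)   # 0.00065 USD per call
at_scale = 1_000_000 * single       # 650.00 USD per million calls
print(f"${single:.5f} per call, ${at_scale:,.2f} per million calls")
```

At those rates, even a million short moderation calls a day stays in the hundreds of dollars, which is the kind of budget math behind the "high-volume jobs" positioning.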