Google is introducing Gemini 3.1 Flash-Lite as the fastest and cheapest model in the Gemini 3 line, aimed at developers who need to serve high request volumes at low per-request cost. It is rolling out in preview through the Gemini API in Google AI Studio and through Vertex AI for enterprise users. Pricing is $0.25 per million input tokens and $1.50 per million output tokens, and Google says the model delivers much lower latency and faster output than Gemini 2.5 Flash. Google positions it as a practical option for high-volume jobs such as translation, content moderation, and image sorting, and for other real-time tasks where speed and cost matter most. Developers can also adjust how much the model "thinks" per task, so the same model can take on heavier work when needed, such as building interfaces, dashboards, simulations, and multi-step business tasks.
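To make the quoted rates concrete, here is a small sketch of what they imply at volume. The per-million-token prices come from the announcement; the example token counts and the `request_cost` helper are illustrative assumptions, not part of any official SDK.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float = 0.25, output_rate: float = 1.50) -> float:
    """Estimated USD cost of one request at the quoted rates:
    $0.25 per million input tokens, $1.50 per million output tokens."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Hypothetical moderation call: 2,000-token prompt, 100-token verdict.
single = request_cost(2_000, 100)   # 0.00065 USD per call
at_scale = 1_000_000 * single       # 650.00 USD per million calls
print(f"${single:.5f} per call, ${at_scale:,.2f} per million calls")
```

At those rates, even a million short moderation calls a day stays in the hundreds of dollars, which is the kind of budget math behind the "high-volume jobs" positioning.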