Gemini 2.5: Updates to our family of thinking models

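This article details the latest updates to Google's Gemini 2.5 model family. It announces that Gemini 2.5 Pro and Gemini 2.5 Flash are generally available and stable, unchanged from their most recent previews. A new model, Gemini 2.5 Flash-Lite, launches in preview with the lowest latency and cost in the family, designed for high-throughput tasks such as classification and summarization. The article explains why Gemini 2.5 models are described as “thinking models” with an adjustable thinking budget. It also outlines the updated pricing for Gemini 2.5 Flash and highlights the strong demand for and usage of Gemini 2.5 Pro, particularly for coding and agentic tasks, including its integration with popular developer tools. Finally, it lists deprecation dates for older preview models to guide migration.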

Today we are excited to share updates across the board to our Gemini 2.5 model family:

Gemini 2.5 Pro is generally available and stable (no changes from the 06-05 preview)

Gemini 2.5 Flash is generally available and stable (no changes from the 05-20 preview, see pricing updates below)

Gemini 2.5 Flash-Lite is now available in preview

Gemini 2.5 models are thinking models, capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy. Each model exposes a thinking budget, giving developers control over when and how much the model “thinks” before generating a response.

Overview of our family of Gemini 2.5 thinking models

Introducing Gemini 2.5 Flash-Lite

Today, we’re introducing 2.5 Flash-Lite in preview with the lowest latency and cost in the 2.5 model family. It’s designed as a cost-effective upgrade from our previous 1.5 and 2.0 Flash models. It also offers better performance across most evals, with a lower time to first token and higher decode throughput in tokens per second. This model is great for high-throughput tasks like classification or summarization at scale.

Gemini 2.5 Flash-Lite is a reasoning model whose thinking budget can be controlled dynamically through an API parameter. Because Flash-Lite is optimized for cost and speed, “thinking” is off by default, unlike our other models. 2.5 Flash-Lite also supports all of our native tools, such as Grounding with Google Search, Code Execution, and URL Context, in addition to function calling.
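As a rough illustration of controlling the thinking budget per request, the sketch below only builds a JSON request body and does not call any service. The field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) are an assumption based on the Gemini REST API's naming and are not given in this post; the helper `build_request_body` is hypothetical. Check the API reference before relying on either.

```python
import json
from typing import Optional


def build_request_body(prompt: str, thinking_budget: Optional[int] = None) -> dict:
    """Build a generateContent-style JSON body (hypothetical helper).

    thinking_budget=None sends no thinking settings, so the model default
    applies (per the post, thinking is off by default on Flash-Lite).
    An integer is passed through as the requested thinking budget.
    """
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    if thinking_budget is not None:
        # Assumed field names, not confirmed by this post.
        body["generationConfig"] = {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        }
    return body


# High-throughput classification: keep the default (no thinking settings sent).
classify = build_request_body("Label this review as positive or negative: ...")

# A harder request: allow a bounded amount of thinking.
reason = build_request_body("Plan a three-step data migration ...", thinking_budget=512)

print(json.dumps(reason, indent=2))
```

Keeping the budget as an optional argument lets the same code path serve both cheap bulk calls and occasional requests that benefit from reasoning.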

Benchmarks for Gemini 2.5 Flash-Lite

Updates to Gemini 2.5 Flash and pricing

Over the last year, our research teams have continued to push the Pareto frontier with our Flash model series. When 2.5 Flash was initially announced, we had not yet finalized the capabilities for 2.5 Flash-Lite. We also launched with separate “thinking” and “non-thinking” prices, which confused developers.

With the stable version of Gemini 2.5 Flash rolling out (which is the same 05-20 model preview we made available at Google I/O), and the incredible performance of 2.5 Flash, we are updating the pricing for 2.5 Flash:

$0.30 / 1M input tokens (up from $0.15)

$2.50 / 1M output tokens (down from $3.50)

We removed the thinking vs. non-thinking price difference

We kept a single price tier regardless of input token size

While we strive to maintain consistent pricing between preview and stable releases to minimize disruption, this is a specific adjustment reflecting Flash’s exceptional value, still offering the best cost-per-intelligence available.
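To see how the change affects a given workload, here is a small sketch that uses only the per-token prices quoted above. The function names are hypothetical, and the comparison uses only the old $0.15 input and $3.50 output figures; the earlier preview also had a separate non-thinking price that is not quoted here and is not modeled.

```python
# USD per 1M tokens, as quoted in this post.
NEW_PRICE = {"input": 0.30, "output": 2.50}
OLD_PRICE = {"input": 0.15, "output": 3.50}


def cost_usd(input_tokens: int, output_tokens: int, price: dict) -> float:
    """Total cost in USD for a workload at the given per-1M-token prices."""
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def new_is_cheaper(input_tokens: int, output_tokens: int) -> bool:
    """True if the new price is lower than the quoted old price for this workload."""
    return cost_usd(input_tokens, output_tokens, NEW_PRICE) < cost_usd(
        input_tokens, output_tokens, OLD_PRICE
    )


# Output-heavy workload: 2M input tokens, 500k output tokens.
print(cost_usd(2_000_000, 500_000, NEW_PRICE), cost_usd(2_000_000, 500_000, OLD_PRICE))

# Input-heavy workload: 10M input tokens, 1M output tokens.
print(new_is_cheaper(10_000_000, 1_000_000))
```

Against those quoted figures, input rises by $0.15 per 1M tokens and output falls by $1.00 per 1M tokens, so the new price comes out lower whenever output tokens exceed 15% of input tokens.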

And with Gemini 2.5 Flash-Lite, we now have an even lower-cost option (with or without thinking) for cost- and latency-sensitive use cases that require less model intelligence.

Pricing updates for our Gemini Flash family

If you are using the Gemini 2.5 Flash Preview 04-17, the existing preview pricing will remain in effect until its planned deprecation on July 15, 2025, at which point that model endpoint will be turned off. You can transition to the generally available model “gemini-2.5-flash”, or switch to 2.5 Flash-Lite Preview as a lower-cost option.

Continued growth of Gemini 2.5 Pro

Growth in demand for Gemini 2.5 Pro continues to be the steepest we have seen for any of our models. To allow more customers to build on this model in production, we are making the 06-05 version of the model stable, with the same Pareto frontier price point as before.

We expect Pro to shine where you need the highest intelligence and the broadest capabilities, such as coding and agentic tasks. Gemini 2.5 Pro is at the heart of many of the most loved developer tools.

Top developer tools using Gemini 2.5 Pro

If you are using 2.5 Pro Preview 05-06, the model will remain available until June 19, 2025, and will then be turned off. If you are using 2.5 Pro Preview 06-05, you can simply update your model string to “gemini-2.5-pro”.
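The migration notes for Flash and Pro can be collected into a small lookup, sketched below. The preview model ids are assumed spellings, since this post names only the preview versions and the stable strings “gemini-2.5-flash” and “gemini-2.5-pro”. Mapping the 05-06 Pro preview to “gemini-2.5-pro” is also an assumption. The helper names are hypothetical.

```python
from datetime import date
from typing import Optional

# preview id (assumed spelling) -> (stable id, shutoff date quoted in the post)
MIGRATIONS = {
    "gemini-2.5-flash-preview-04-17": ("gemini-2.5-flash", date(2025, 7, 15)),
    "gemini-2.5-pro-preview-05-06": ("gemini-2.5-pro", date(2025, 6, 19)),
    # The post gives no shutoff date for the 06-05 preview.
    "gemini-2.5-pro-preview-06-05": ("gemini-2.5-pro", None),
}


def stable_model(model: str) -> str:
    """Map a known preview id to its stable replacement; pass other ids through."""
    entry = MIGRATIONS.get(model)
    return entry[0] if entry else model


def days_until_shutoff(model: str, today: date) -> Optional[int]:
    """Days from `today` to the quoted shutoff date, or None if none is quoted."""
    entry = MIGRATIONS.get(model)
    if entry is None or entry[1] is None:
        return None
    return (entry[1] - today).days


configured = "gemini-2.5-flash-preview-04-17"
print(stable_model(configured), days_until_shutoff(configured, date(2025, 7, 1)))
```

Routing every model string through one function like `stable_model` keeps the rename in a single place rather than scattered across call sites.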

We can’t wait to see even more domains benefit from the intelligence of 2.5 Pro and look forward to sharing more about scaling beyond Pro in the near future.
