GPT‑5-Codex: an upgrade to Codex

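This post announces that OpenAI's GPT-5-Codex is in a "pre-release" stage. It is a fine-tuned GPT-5 variant designed for AI-assisted programming. It is not yet available via the API, but it is already integrated into OpenAI's VS Code extension, the Codex CLI and the newly named Codex Cloud agent. The author suggests thinking of Codex as OpenAI's brand name for its coding models. Key features of GPT-5-Codex include specific training for code review, dynamic adjustment of thinking time based on task complexity, a notable improvement on a proprietary code refactoring evaluation (from 33.9% to 51.3%), and stronger user preference when creating mobile websites. It also generates more accurate and relevant code comments. The post also mentions a new Codex Cloud feature that automates GitHub code review, and cites a third-party video review by Theo Browne, who noted some limitations in how the model uses the Codex CLI search tool. A playful test asking for an SVG of a pelican riding a bicycle produced mixed results.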

GPT‑5-Codex and upgrades to Codex. OpenAI half-released a new model today: GPT‑5-Codex, a fine-tuned GPT-5 variant explicitly designed for their various AI-assisted programming tools.

I say half-released because it's not yet available via their API, but they "plan to make GPT‑5-Codex available in the API soon".

I wrote about the confusing array of OpenAI products that share the name Codex a few months ago. This new model adds yet another, though at least "GPT-5-Codex" (using two hyphens) is unambiguous enough not to add much more to the confusion.

At this point it's best to think of Codex as OpenAI's brand name for their coding family of models and tools.

The new model is already integrated into their VS Code extension, the Codex CLI and their Codex Cloud asynchronous coding agent. I'd been calling that last one "Codex Web" but I think Codex Cloud is a better name since it can also be accessed directly from their iPhone app.

Codex Cloud also has a new feature: you can configure it to automatically run code review against specific GitHub repositories (I found that option on chatgpt.com/codex/settings/code-review) and it will create a temporary container to use as part of those reviews. Here's the relevant documentation.

Some documented features of the new GPT-5-Codex model:

Specifically trained for code review, which directly supports their new code review feature.
"GPT‑5-Codex adapts how much time it spends thinking more dynamically based on the complexity of the task." Simple tasks (like "list files in this directory") should run faster. Large, complex tasks should run for much longer - OpenAI report Codex crunching for seven hours in some cases!
Increased score on their proprietary "code refactoring evaluation" from 33.9% for GPT-5 (high) to 51.3% for GPT-5-Codex (high). It's hard to evaluate this without seeing the details of the eval but it does at least illustrate that refactoring performance is something they've focused on here.
"GPT‑5-Codex also shows significant improvements in human preference evaluations when creating mobile websites" - in the past I've habitually prompted models to "make it mobile-friendly", maybe I don't need to do that any more.
"We find that comments by GPT‑5-Codex are less likely to be incorrect or unimportant" - fewer unimportant comments in code is definitely an improvement!
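
As a quick sanity check on those refactoring numbers, here's a minimal sketch of the arithmetic. The function names are my own, and the only inputs are the two scores OpenAI quoted; since the eval itself is proprietary, this says nothing about what the scores actually measure.

```python
def percentage_point_gain(before: float, after: float) -> float:
    """Absolute difference between two percentage scores, in points."""
    return after - before


def relative_gain(before: float, after: float) -> float:
    """Improvement expressed as a percentage of the starting score."""
    return (after - before) / before * 100


# Scores quoted by OpenAI for their proprietary code refactoring eval.
gpt5_high = 33.9
gpt5_codex_high = 51.3

points = percentage_point_gain(gpt5_high, gpt5_codex_high)
relative = relative_gain(gpt5_high, gpt5_codex_high)

print(f"{points:.1f} percentage points")
print(f"{relative:.1f}% relative improvement")
```

That works out to a gain of about 17.4 percentage points, or roughly a 51% relative improvement over GPT-5 (high) on that one eval.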

Theo Browne has a video review of the model and accompanying features. He was generally impressed but noted that it was surprisingly bad at using the Codex CLI search tool to navigate code. Hopefully that's something they can fix with a system prompt update.

Finally, can it draw a pelican riding a bicycle? Without API access I instead got Codex Cloud to have a go by prompting:

Generate an SVG of a pelican riding a bicycle, save as pelican.svg

Here's the result:

It's a bit messy - the pelican is quite good and the bicycle is quite good, but the pelican is stood overlapping the bicycle, not riding it.
