Veo 3: New in the Gemini API

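This article introduces Veo 3, Google's latest high-fidelity video generation model, now available in paid preview through the Gemini API and Vertex AI. Veo 3 is the first Google video model to combine high-fidelity video output with native audio; it supports text-to-video, with image-to-video to follow. Its features include synchronized sound, cinematic quality, and realistic physics simulation. The article highlights early developer adoption, with examples from Cartwheel (which uses Veo 3 for 3D animation) and Volley (which uses it for in-game cutscenes). It also provides a getting-started guide covering pricing ($0.75 per second), a Python code example, and links to the documentation and a starter app in Google AI Studio. Google AI subscribers can access Veo 3 through the Gemini app and Flow, and enterprise customers through Vertex AI. Finally, it emphasizes responsible AI development and notes that all generated videos carry a digital SynthID watermark.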

Starting today, we're bringing Veo 3 to developers in paid preview via the Gemini API and Vertex AI.

First unveiled at Google I/O 2025, Veo 3 has already been used by people around the world to generate tens of millions of high-quality videos (along with some fun and interesting new video trends). It is our first video model to incorporate high-fidelity video outputs and native audio, first with text-to-video and soon with image-to-video.

Developers are already experimenting with Veo 3, discovering how the model can help them brainstorm content, rapidly iterate, and be more efficient.

Cartwheel developed a system that takes 2D videos of humans and translates them into fully production-ready 3D animation on rigged characters. Cartwheel uses Veo 3 to generate realistic, fluid human actions, which it then turns into 3D animations for customers.

Volley uses Veo 3 to produce in-game video cut-scenes that advance the story. With Veo 3, Volley designers can rapidly iterate on the game to deliver the best possible output for an upcoming RPG called Wit's End.

Veo 3 capabilities

Veo 3 is designed to handle a range of video generation tasks, from cinematic narratives to dynamic character animations. With Veo 3, you can create more immersive experiences by not only generating stunning visuals, but also audio like dialogue and sound effects.

Synchronized Sound: Natively generates rich audio—dialogue, effects, and music—and synchronizes it with video in a single pass.

Cinematic Quality: Produces stunning, high-definition video that captures creative nuances in your prompt, from intricate textures to subtle lighting effects.

Realistic Physics: Simulates real-world physics for authentic motion, from natural character movement to the accurate flow of water and casting of shadows.

Let’s take a look at some examples.

Prompt: Fluffy Characters Stop Motion: Inside a brightly colored, cozy kitchen made of felt and yarn. Professor Nibbles, a plump, fluffy hamster with oversized glasses, nervously stirs a bubbling pot on a miniature stove, muttering, "Just a little more... 'essence of savory,' as the recipe calls for." The camera is a mid-shot, capturing his frantic stirring. Suddenly, the pot emits a loud "POP!" followed by a comical "whoosh" sound, and a geyser of iridescent green slime erupts, covering the entire kitchen. Professor Nibbles shrieks, "Oh, dear! Not again!" and scurries away, leaving a trail of tiny, panicked squeaks.

Prompt: The sequence begins with an extreme close-up of a single gear, slowly turning and reflecting harsh sunlight. The camera gradually pulls back in a continuous movement, revealing this is but one component of a colossal, mechanical heart half-buried in a desolate, rust-colored desert. A sweeping aerial shot establishes its enormous scale and isolation in the barren landscape. The camera descends to capture pipes hissing steam and the rhythmic thumping that echoes across the empty plains. A subtle shake effect synchronizes with each massive heartbeat. A lateral tracking shot discovers tiny, robed figures scurrying across the metallic surface. The camera follows one such figure in a detailed tracking shot as they perform meticulous maintenance, polishing brass valves and tightening immense bolts. A complex movement circles the entire structure, capturing different maintenance teams working in precarious positions across its rusted exterior. The final shot begins tight on the meticulous work of one tiny figure before executing a dramatic pull-out that reveals the true scale of the heart and the minuscule size of its caretakers, tending to the vital organ of an unseen, sleeping giant that extends beyond the frame.
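Both prompts above mix a scene description, camera direction, quoted dialogue, and sound cues. One way to keep such prompts organized in code is to assemble them from named parts. The sketch below is only an illustration: the `build_prompt` helper and its parameters are hypothetical, and the layout it produces is not a format the model requires.

```python
def build_prompt(scene, camera=None, dialogue=(), sound_effects=()):
    """Assemble a single text prompt from a scene, camera note, dialogue, and sound cues.

    `dialogue` is an iterable of (speaker, line) pairs; `sound_effects` is an
    iterable of short descriptions. All names here are illustrative.
    """
    parts = [scene.strip()]
    if camera:
        parts.append(camera.strip())
    for speaker, line in dialogue:
        # Quote spoken lines so they read as dialogue, as in the examples above.
        parts.append(f'{speaker} says, "{line}"')
    for effect in sound_effects:
        parts.append(f"Sound: {effect}.")
    return " ".join(parts)


if __name__ == "__main__":
    print(
        build_prompt(
            scene="Inside a brightly colored, cozy kitchen made of felt and yarn.",
            camera="The camera is a mid-shot.",
            dialogue=[("Professor Nibbles", "Oh, dear! Not again!")],
            sound_effects=["a loud pop followed by a comical whoosh"],
        )
    )
```

The resulting string can be passed as the `prompt` argument of the generation call shown later in this post.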

Explore these examples and more with Veo 3 in Google AI Studio, available as an SDK template and interactive Starter App to remix, copy and extend. The Starter App and its sample code offer a convenient way for Paid Tier users to rapidly prototype with Veo 3 and more on the Gemini API, directly from Google AI Studio.

Click the Key button in the top right of the AI Studio Build interface to select a Google Cloud Project with billing enabled to use the Paid Tier in AI Studio apps. See the FAQs for more.

Get started with Veo 3 in the Gemini API

Veo 3 will be priced at $0.75 per second for video and audio output. Additionally, Veo 3 Fast will be available soon, offering a faster and more cost-effective option for video creation.
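For budgeting, the per-second price translates into a simple estimate. The sketch below assumes cost scales linearly with the seconds of output generated and ignores any other charges; the `estimate_cost` helper, the clip length, and the clip count are illustrative assumptions, not part of the Gemini API.

```python
from decimal import Decimal

# Price quoted in this post for Veo 3 video and audio output, in USD per second.
PRICE_PER_SECOND_USD = Decimal("0.75")


def estimate_cost(duration_seconds, num_videos=1, price_per_second=PRICE_PER_SECOND_USD):
    """Return the estimated cost in USD, assuming linear per-second billing."""
    if duration_seconds < 0 or num_videos < 0:
        raise ValueError("duration_seconds and num_videos must be non-negative")
    # Convert via str() so float inputs such as 7.5 do not carry binary rounding noise.
    return Decimal(str(duration_seconds)) * num_videos * price_per_second


if __name__ == "__main__":
    # Hypothetical workload: four clips of 8 seconds each.
    print(estimate_cost(8, num_videos=4))
```

Using `Decimal` rather than floats keeps currency arithmetic exact, which matters once estimates are summed over many generations.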

Here’s a basic Python example to create a video:

import time
from google import genai
from google.genai import types

# The client reads the API key from the environment (for example, GEMINI_API_KEY).
client = genai.Client()

# Start video generation. This returns a long-running operation, not the video itself.
operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",
    prompt="a close-up shot of a golden retriever playing in a field of sunflowers",
    config=types.GenerateVideosConfig(
        negative_prompt="barking, woofing",
    ),
)

# Poll every 20 seconds until the video(s) have been generated.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Download the first generated video and save it to a local file.
generated_video = operation.result.generated_videos[0]
client.files.download(file=generated_video.video)
generated_video.video.save("veo3_video.mp4")

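The polling loop in the example above waits indefinitely. In a service you may want an upper bound on the wait. The sketch below is a generic helper, not part of the SDK: `wait_until_done` and its parameters are hypothetical names, and it assumes only that the operation exposes a `done` attribute and that a refresh callable returns an updated operation, as `client.operations.get` does in the example.

```python
import time


def wait_until_done(operation, refresh, interval_seconds=20.0, timeout_seconds=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until `operation.done` is true, or raise TimeoutError after the deadline.

    `refresh` takes the current operation and returns an updated one.
    `sleep` and `clock` are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while not operation.done:
        if clock() >= deadline:
            raise TimeoutError(f"operation not done after {timeout_seconds} seconds")
        sleep(interval_seconds)
        operation = refresh(operation)
    return operation
```

With the client from the example, the loop could then be replaced by `operation = wait_until_done(operation, client.operations.get)`. The 600-second default is an arbitrary placeholder; choose a limit that matches the generation times you observe.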

Building responsibly with Veo 3 in the Gemini API

All videos generated by Veo 3 models will continue to include a digital SynthID watermark. To get started, check out the documentation, cookbook, and a Veo 3 starter app in Google AI Studio:

Read the documentation

Veo cookbook

Try the Veo 3 starter app (paid tier only)

In addition to being available via the Gemini API in Google AI Studio, Veo 3 is also available to Google AI subscribers in the Gemini app and Flow, and to enterprise customers via Vertex AI.
