Gemini app finally supports audio files

Posted by qimuai on 2025-9-9 09:01 · Views: 208 · First-hand compilation

Source: https://www.theverge.com/ai-artificial-intelligence/774008/gemini-audio-new-languages-notebooklm-reports

Summary:

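Several of Google's Gemini-powered products received major updates on Monday: the Gemini app now accepts audio files, AI Mode in Search adds five languages, and NotebookLM gains several new report formats.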

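According to a post on X by Josh Woodward, vice president of Google Labs and Gemini, audio file support was the "#1 request" for the Gemini app. Free users are limited to 10 minutes of audio and five prompts per day, while paid AI Pro and AI Ultra users can upload audio up to three hours long. Every prompt accepts up to 10 files across various formats, including files inside ZIP archives.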

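Thanks to the integration of Gemini 2.5 with Search, AI Mode in Google Search now supports five more languages: Hindi, Indonesian, Japanese, Korean, and Brazilian Portuguese. According to the company blog, the expansion lets more people ask complex questions in their preferred language while exploring the web more deeply.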

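NotebookLM, Google's AI research tool, can now generate reports in more than 80 languages. From uploaded material it produces standard report formats such as study guides, briefing docs, and blog posts, and it can also create flashcards and quizzes. Users can adjust a report's structure, tone, and style. The feature is expected to be fully available by the end of this week.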

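This is the latest in a dense run of AI feature releases from Google over the past month. In August, Gemini began automatically recalling user details and preferences from past conversations, and free users gained access to Vids, Workspace's video generation software. In September, Photos upgraded to the Veo 3 video generator, letting free users turn their still pictures into silent four-second videos.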

Translation:

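On Monday, Google made three major updates to its Gemini product line: the Gemini app now handles audio files, Search adds support for five new languages, and NotebookLM can generate reports in forms such as blog posts, study guides, and quizzes.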

Gemini app finally supports audio files
Search also adds five new languages, and NotebookLM can generate reports in different tones and styles.

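In a post on X, Josh Woodward, vice president of Google Labs and Gemini, said audio file compatibility was the "#1 request" for the Gemini app. Free users are capped at 10 minutes of audio and five prompts per day, while AI Pro or AI Ultra users can upload audio files up to three hours long. On every tier, a prompt accepts up to 10 files across various formats, including files inside ZIP archives.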

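According to the company blog, the integration of Gemini 2.5 with Search has brought five new language options to Google's AI Mode: Hindi, Indonesian, Japanese, Korean, and Brazilian Portuguese. "With this expansion, more people can now use AI Mode to ask complex questions in their preferred language, while exploring the web more deeply."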

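The Gemini-powered NotebookLM is also being updated and can now generate a range of report formats in more than 80 languages from a user's uploaded documents and media. (While the Gemini app is only now getting audio support, NotebookLM already had that capability; it is positioned as a research tool that finds patterns across files of different formats.)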

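The standard report formats include study guides, briefing docs, and blog posts. The latest update also adds flashcards and quizzes. Users can define a custom format and adjust a report's structure, tone, and style. In a reply to a post on X, the company said the feature "should be 100%" available by the end of this week.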

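Google has rolled out AI features at a rapid pace over the past month. In August, Gemini began automatically recalling user details and preferences from past conversations, and free users gained access to Vids, Workspace's video generation software. In September, Photos upgraded to the latest video generator, Veo 3, letting free users create silent four-second videos from still pictures.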

English source:

Google made three major updates to its Gemini-powered products on Monday: The Gemini app now accepts audio files; Search can handle five new languages; and NotebookLM creates reports in the form of blog posts, study guides, quizzes, and more.
Gemini app finally expands to audio files
Also, Search can now accept five new languages and NotebookLM can create reports in various tones or styles.
According to a Monday post on X by Josh Woodward, vice president of Google Labs and Gemini, audio file compatibility was the “#1 request” to the Gemini app.
Free Gemini users max out at 10 minutes of audio, and five free prompts each day. AI Pro or AI Ultra users, meanwhile, can upload audio up to three hours in length. All Gemini prompts accommodate up to 10 files across various file formats, including within ZIP files.
Additionally, Google Search’s AI Mode has rolled out five new language options: Hindi, Indonesian, Japanese, Korean, and Brazilian Portuguese, thanks to the integration of Gemini 2.5 with Search, according to a company blog: “With this expansion, more people can now use AI Mode to ask complex questions in their preferred language, while exploring the web more deeply.”
The Gemini-powered NotebookLM software is also getting an update in the form of new report styles in over 80 languages based on a user’s uploaded documents, files, and other media.
(While the Gemini app is just now getting audio, NotebookLM already had that capability, positioning itself as a research tool that helps find patterns in files of different formats.)
The standard report formats include study guides, briefing docs, and blog posts. Flashcards and quizzes are also included in the latest NotebookLM update. Users can also create their own format and adjust the structure, tone, and style of the report. The feature “should be 100%” available by the end of this week, according to a company comment to a post on X.
Google has been on a dizzying spree of AI-related features in the past month. Gemini began automatically recalling user details and preferences from past conversations in August, the same month that free users gained access to Workspace’s video generation software Vids. In September, Photos upgraded to the latest video generation software Veo 3 and gave free users the ability to create silent 4-second long videos from their own personal, still pictures.
