
The Gemini app finally supports audio files.

Posted by qimuai · Compiled first-hand


Source: https://www.theverge.com/ai-artificial-intelligence/774008/gemini-audio-new-languages-notebooklm-reports

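Summary:

Several of Google's Gemini-powered products received significant upgrades on Monday: the Gemini app now supports audio files, Search added support for five new languages, and the note-taking tool NotebookLM gained several new report formats.

Josh Woodward, vice president of Google Labs and Gemini, wrote on X that audio file support was the "#1 request" from Gemini app users. Free users are limited to 10 minutes of audio and five prompts per day, while paying AI Pro and AI Ultra subscribers can upload audio up to three hours long. Every prompt accepts up to 10 files in a variety of formats, including files inside ZIP archives.

Thanks to the integration of Gemini 2.5 with Search, Google Search's AI Mode now supports five more languages: Hindi, Indonesian, Japanese, Korean, and Brazilian Portuguese. According to the company blog, the expansion lets more people ask complex questions in their preferred language and explore the web more deeply.

The NotebookLM update centers on report generation. Drawing on the documents and files a user uploads, the tool can now produce customized reports in more than 80 languages, in formats such as study guides, briefing docs, and blog posts. It also adds flashcards and quizzes. Users can adjust a report's structure, tone, and style. The company says the feature should be "100%" available by the end of this week.

Notably, although the Gemini app is only now getting audio support, NotebookLM already had that capability as part of its positioning as a research tool for analyzing files in multiple formats. Google's AI products have shipped updates rapidly over the past month: in August, Gemini began recalling details from past conversations and free users gained access to Vids, Workspace's video generation software; in September, Photos upgraded to the video generation software Veo 3, letting free users turn still pictures into silent 4-second videos.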

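Translation:

Google made three major updates to its Gemini product line on Monday: the Gemini app now supports audio file uploads; Search added support for five new languages; and NotebookLM can generate reports in forms such as blog posts, study guides, and quizzes.

The Gemini app finally opens up to audio files
In addition, Search added support for five new languages, and NotebookLM can generate reports in different tones and styles.

Josh Woodward, vice president of Google Labs and Gemini, posted on X that audio file compatibility was the "#1 request" from Gemini app users. Free Gemini users are capped at 10 minutes of audio and five prompts per day, while AI Pro or AI Ultra users can upload audio files up to three hours long. All tiers accept up to 10 files in various formats, including files inside ZIP archives.

According to the company blog, the integration of Gemini 2.5 with Search has brought five new language options to Google Search's AI Mode: Hindi, Indonesian, Japanese, Korean, and Brazilian Portuguese. "With this expansion, more people can now use AI Mode to ask complex questions in their preferred language, while exploring the web more deeply."

The Gemini-powered NotebookLM software was updated at the same time. It can now generate a range of report formats in over 80 languages, based on the documents, files, and other media a user uploads. (While the Gemini app is only now getting audio support, NotebookLM already had that capability, positioning itself as a research tool that helps users find patterns across files of different formats.)

The standard report formats include study guides, briefing docs, and blog posts. The latest update also adds flashcards and quizzes. Users can define their own format and adjust a report's structure, tone, and style. In a reply to a post on X, the company said the feature "should be 100%" available by the end of this week.

Google has rolled out AI features at a rapid pace over the past month. In August, Gemini began automatically recalling user details and preferences from past conversations, and free users gained access to Vids, Workspace's video generation software. In September, Photos upgraded to the latest video generation software, Veo 3, letting free users create silent 4-second videos from their still pictures.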

English source:

Google made three major updates to its Gemini-powered products on Monday: The Gemini app now accepts audio files; Search can handle five new languages; and NotebookLM creates reports in the form of blog posts, study guides, quizzes, and more.
Gemini app finally expands to audio files
Also, Search can now accept five new languages and NotebookLM can create reports in various tones or styles.
According to a Monday post on X by Josh Woodward, vice president of Google Labs and Gemini, audio file compatibility was the “#1 request” to the Gemini app.
Free Gemini users max out at 10 minutes of audio, and five free prompts each day. AI Pro or AI Ultra users, meanwhile, can upload audio up to three hours in length. All Gemini prompts accommodate up to 10 files across various file formats, including within ZIP files.
Additionally, Google Search’s AI Mode has rolled out five new language options: Hindi, Indonesian, Japanese, Korean, and Brazilian Portuguese, thanks to the integration of Gemini 2.5 with Search, according to a company blog: “With this expansion, more people can now use AI Mode to ask complex questions in their preferred language, while exploring the web more deeply.”
The Gemini-powered NotebookLM software is also getting an update in the form of new report styles in over 80 languages based on a user’s uploaded documents, files, and other media.
(While the Gemini app is just now getting audio, NotebookLM already had that capability, positioning itself as a research tool that helps find patterns in files of different formats.)
The standard report formats include study guides, briefing docs, and blog posts. Flashcards and quizzes are also included in the latest NotebookLM update. Users can choose the self-created format and adjust the structure, tone, and style of the report. The feature “should be 100%” available by the end of this week, according to a company comment to a post on X.
Google has been on a dizzying spree of AI-related features in the past month. Gemini began automatically recalling user details and preferences from past conversations in August, the same month that free users gained access to Workspace’s video generation software Vids. In September, Photos upgraded to the latest video generation software Veo 3 and gave free users the ability to create silent 4-second long videos from their own personal, still pictures.
