Google's new Gemini Pro model sets benchmark records again.

Source: https://techcrunch.com/2026/02/19/googles-new-gemini-pro-model-has-record-benchmark-scores-again/
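Summary:
On Thursday, Google officially launched Gemini 3.1 Pro, the latest version of its Gemini Pro large language model. The model is currently in preview and will be generally available soon. Observers regard the upgrade as a major leap in performance, a marked improvement over the previous model released last November.
Google also published results from several third-party benchmarks. On evaluations such as "Humanity's Last Exam," the new model far outperformed the earlier version. Brendan Foody, CEO of AI startup Mercor, noted that Gemini 3.1 Pro now tops the APEX-Agents leaderboard, part of Mercor's APEX system for evaluating professional tasks, a sign that AI agents are rapidly improving at real-world knowledge work.
Competition among AI models is intensifying, with tech companies releasing large language models focused on agentic tasks and multi-step reasoning. Besides Google, OpenAI and Anthropic have also recently released new-generation models, and the pace of iteration across the industry continues to accelerate.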
Original English text:
On Thursday, Google released the newest version of Gemini Pro, its powerful LLM. The model, 3.1, is currently available as a preview and will be generally released soon, the company said.
Google’s new model may be one of the most powerful LLMs yet. Onlookers have noted that Gemini 3.1 Pro appears to be a big step up from its predecessor, Gemini 3 — which, upon its release in November, was already considered a highly capable AI tool.
On Thursday, Google also shared statistics from independent benchmarks — such as one called Humanity’s Last Exam — that showed it performing significantly better than its previous version.
Gemini 3.1 Pro was also praised by Brendan Foody, the CEO of AI startup Mercor, whose benchmarking system, APEX, is designed to measure how well new AI models perform real professional tasks. “Gemini 3.1 Pro is now at the top of the APEX-Agents leaderboard,” Foody said in a social media post, adding that the model’s impressive results show “how quickly agents are improving at real knowledge work.”
The release comes as the AI model wars are heating up, and tech companies continue to release increasingly powerful LLMs designed for agentic work and multi-step reasoning. Other major names — including OpenAI and Anthropic — have recently released new models as well.
Article link: https://www.qimuai.cn/?post=3368