
The AI Industry’s Scaling Obsession Is Headed for a Cliff

Published by qimuai · First-hand compilation



Source: https://www.wired.com/story/the-ai-industrys-scaling-obsession-is-headed-for-a-cliff/

Summary:

A new study from MIT suggests that giant AI models may soon hit a performance ceiling. By mapping compute scaling laws against the ongoing curve of model-efficiency improvements, the researchers conclude that over the next decade the headroom for performance gains from compute-hungry frontier models will gradually narrow, while efficient models running on ordinary hardware will become increasingly competitive.
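To make that comparison concrete, here is a minimal, purely illustrative sketch in Python. It assumes a Chinchilla-style compute-to-loss power law with made-up constants, an assumed 4x-per-year growth in frontier training compute, and an assumed 3x-per-year effective-compute gain from algorithmic efficiency on fixed hardware; none of these numbers come from the MIT study.

    # Purely illustrative: not the MIT study's model or parameters.
    def loss(effective_compute, l0=1.7, a=126.0, alpha=0.10):
        # Hypothetical power law: diminishing returns as effective compute grows.
        return l0 + a * effective_compute ** -alpha

    frontier_compute = 1e25   # assumed starting FLOPs for a frontier training run
    modest_compute = 1e21     # assumed fixed compute budget on modest hardware
    frontier_growth = 4.0     # assumed 4x more raw compute per year
    efficiency_gain = 3.0     # assumed 3x effective compute per year from better algorithms

    for year in range(11):
        frontier = loss(frontier_compute * frontier_growth ** year)
        modest = loss(modest_compute * efficiency_gain ** year)
        print(f"year {year:2d}  frontier {frontier:.2f}  modest {modest:.2f}  gap {modest - frontier:.2f}")

Because both curves approach the same irreducible loss, even a frontier budget whose compute grows faster sees its absolute lead shrink; that diminishing-returns effect is what the researchers weigh against measured efficiency trends.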

Neil Thompson, a computer scientist involved in the study, predicts: “In the next five to ten years, this narrowing is very likely to become apparent.” DeepSeek’s release of a low-cost, high-performing model in January has already served as a wake-up call for an AI industry accustomed to burning through compute.

For now, frontier models from companies such as OpenAI still far outperform the lightweight models built in academic labs. But Hans Gundlach, the research scientist who led the analysis, notes that the predicted slowdown is especially pronounced for the reasoning models now in vogue, which depend heavily on extra computation at inference time. Thompson urges the industry to take algorithmic optimization seriously: “If you are willing to spend heavily on training these models, you should be allocating some of those resources to developing more efficient algorithms.”

The study lands as the US AI infrastructure frenzy keeps heating up. OpenAI recently announced a partnership with Broadcom to develop custom chips, with its president Greg Brockman declaring that “the world needs much more compute.” The finance world is already sounding the alarm: JPMorgan CEO Jamie Dimon has publicly warned that markets should be far more vigilant.

Notably, the infrastructure boom is not only a technology race; it also implies a reshaping of the industry landscape. OpenAI is leveraging its $500 billion valuation to build out compute infrastructure it controls, seeking to reduce its dependence on Microsoft and Nvidia. Experts warn, however, that an excessive focus on GPU build-outs could cause companies to miss new opportunities emerging from the academic frontier, such as alternatives to deep learning, novel chip architectures, and even quantum computing.

(Compiled from Wired’s AI Lab newsletter)


Original article:

A new study from MIT suggests the biggest and most computationally intensive AI models may soon offer diminishing returns compared to smaller models. By mapping scaling laws against continued improvements in model efficiency, the researchers found that it could become harder to wring leaps in performance from giant models whereas efficiency gains could make models running on more modest hardware increasingly capable over the next decade.
“In the next five to 10 years, things are very likely to start narrowing,” says Neil Thompson, a computer scientist and professor at MIT involved in the study.
Leaps in efficiency, like those seen with DeepSeek’s remarkably low-cost model in January, have already served as a reality check for the AI industry, which is accustomed to burning massive amounts of compute.
As things stand, a frontier model from a company like OpenAI is currently much better than a model trained with a fraction of the compute from an academic lab. While the MIT team’s prediction might not hold if, for example, new training methods like reinforcement learning produce surprising new results, they suggest that big AI firms will have less of an edge in the future.
Hans Gundlach, a research scientist at MIT who led the analysis, became interested in the issue due to the unwieldy nature of running cutting edge models. Together with Thompson and Jayson Lynch, another research scientist at MIT, he mapped out the future performance of frontier models compared to those built with more modest computational means. Gundlach says the predicted trend is especially pronounced for the reasoning models that are now in vogue, which rely more on extra computation during inference.
Thompson says the results show the value of honing an algorithm as well as scaling up compute. “If you are spending a lot of money training these models, then you should absolutely be spending some of it trying to develop more efficient algorithms, because that can matter hugely,” he adds.
The study is particularly interesting given today’s AI infrastructure boom (or should we say “bubble”?)—which shows little sign of slowing down.
OpenAI and other US tech firms have signed hundred-billion-dollar deals to build AI infrastructure in the United States. “The world needs much more compute,” OpenAI’s president, Greg Brockman, proclaimed this week as he announced a partnership between OpenAI and Broadcom for custom AI chips.
A growing number of experts are questioning the soundness of these deals. Roughly 60 percent of the cost of building a data center goes toward GPUs, which tend to depreciate quickly. Partnerships between the major players also appear circular and opaque.
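As a back-of-envelope illustration of why that depreciation matters (only the 60 percent GPU share comes from the article; the build cost and asset lifetimes below are assumptions chosen for the example):

    # Hypothetical numbers, for illustration only.
    total_capex = 10e9    # assumed $10B data-center build-out
    gpu_share = 0.60      # share of cost going to GPUs (figure cited in the article)
    gpu_life = 4          # assumed useful life of GPUs, in years
    shell_life = 20       # assumed life of buildings, power, and cooling, in years

    gpu_cost = total_capex * gpu_share
    shell_cost = total_capex * (1 - gpu_share)
    annual_cost = gpu_cost / gpu_life + shell_cost / shell_life
    print(f"annualized capital cost: ${annual_cost / 1e9:.1f}B "
          f"({gpu_cost / gpu_life / annual_cost:.0%} from GPUs)")

Under these assumptions, the rapidly depreciating GPUs account for the large majority of the annualized capital cost, which is why their short useful life dominates the economics.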
Jamie Dimon, the CEO of JP Morgan, is the latest big name in finance to issue a warning, telling the BBC last week: “The level of uncertainty should be higher in most people’s minds.”
The AI infrastructure gold rush is not entirely about building more capable models. OpenAI is effectively betting that demand for new generative AI tools will grow exponentially. The company may also be looking to decrease its dependence on Microsoft and Nvidia and turn its massive $500 billion valuation into infrastructure that it can design and customize.
Even so, it would seem prudent for the industry to use analyses like the one MIT has just published to explore how algorithms and hardware may evolve in the next few years.
The building boom now propping up much of the US economy may also have consequences for American innovation. By investing so heavily in GPUs and other chips specialized for deep learning, AI companies might miss new opportunities that could come from exploring ideas from the fringes of academia, like alternatives to deep learning, novel chip designs, and even approaches like quantum computing. That is, after all, where today’s AI breakthroughs came from.
Are you worried about the money being poured into new AI infrastructure? Send an email to ailab@wired.com to share your thoughts.
This is an edition of Will Knight’s AI Lab newsletter. Read previous newsletters here.

Wired: the AI frontier
