Cloudflare outage was not caused by a cyber attack

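Summary:
On November 18, the global network services provider Cloudflare suffered a major outage that left a large number of websites unreachable. In a post on the company's official blog, CEO Matthew Prince clarified that the failure was not caused by a cyber attack but by a technical problem triggered by a change to the permissions of its database systems.
According to Prince, the failure began when the feature configuration file used by the Bot Management system was generated incorrectly. That system uses a machine learning model to assign bot scores to network requests, helping site owners identify and block malicious crawlers, including automated crawlers that collect data for AI training. The abnormal change in the file's size caused the core proxy system to return HTTP 5xx error codes, taking down much of the traffic that depended on the module.
The incident is Cloudflare's most serious outage since 2019. Prince apologized to users on behalf of his team and confirmed that engineers fixed the system promptly after identifying the problem. Notably, the company launched its "pay per crawl" experiment in July, which lets websites open their content to AI bots in exchange for payment, and the Bot Management system involved in this outage is one of the core components behind that feature.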
English source:
Cloudflare outage was not caused by a cyber attack
Matthew Prince, the company's CEO, explained the problem that took major websites offline.
Cloudflare wrongly suspected that the widespread outage that took numerous websites offline on November 18 was caused by a DDoS attack, the company’s CEO has admitted. In a blog post breaking down what happened, Matthew Prince explained that once his team realized its mistake, it was able to fix the issue. “The issue was not caused, directly or indirectly, by a cyber attack or malicious activity of any kind,” he wrote. It was instead caused by a change to its database systems’ permissions, which led to an issue with a file used by its Bot Management system.
The company’s Bot Management system uses a machine learning model to score bots for every request they make when they crawl Cloudflare’s network. Its clients rely on those bot scores to decide whether to allow or to block specific bots from accessing their websites. One of the uses of bot scores is blocking AI companies’ bots so they can’t use a website’s content to train their LLMs. In July, Cloudflare launched an experiment called “pay per crawl,” which allows website owners to let an AI bot crawl their pages if they get paid for access.
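As a rough illustration of how a site might act on a per-request bot score, here is a minimal sketch. The `Request` type, the `decide` function and the `block_below` threshold are hypothetical names invented for this example, and the 1 to 99 scale (lower meaning more likely automated) is an assumption rather than something stated in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request record carrying a bot score."""
    path: str
    user_agent: str
    # Assumed scale: 1 (very likely automated) to 99 (very likely human).
    bot_score: int


def decide(request: Request, block_below: int = 30) -> str:
    """Return "block" if the score falls under the threshold, else "allow".

    The default threshold of 30 is illustrative only.
    """
    if not 1 <= request.bot_score <= 99:
        raise ValueError(f"bot score out of range: {request.bot_score}")
    return "block" if request.bot_score < block_below else "allow"


if __name__ == "__main__":
    crawler = Request("/articles/1", "ExampleAIBot/1.0", bot_score=5)
    visitor = Request("/articles/1", "Mozilla/5.0", bot_score=80)
    print(decide(crawler), decide(visitor))
```

A site owner who wants to be stricter or more lenient would only change `block_below`; the scoring itself happens upstream in the model.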
Prince said the model relies on a “feature” configuration file to predict whether a request was automated. The feature file is refreshed every few minutes, and a change in the underlying mechanism that generates the file changed its size, which triggered the error. “As a result, HTTP 5xx error codes were returned by the core proxy system that handles traffic processing for our customers, for any traffic that depended on the bots module,” Prince wrote.
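One common defensive pattern for a frequently refreshed configuration file is to validate each new version before swapping it in, and to keep the last known good version if validation fails. The sketch below shows that pattern only; it is not Cloudflare's code or its stated fix. The file format (one feature name per line), the `MAX_FEATURES` limit and all function names are assumptions made for illustration.

```python
MAX_FEATURES = 200  # hypothetical limit, chosen only for this example


class FeatureFileError(Exception):
    """Raised when a candidate feature file fails validation."""


def parse_feature_file(text: str) -> list[str]:
    """Parse one feature name per line, skipping blank lines (assumed format)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def validate(features: list[str], max_features: int = MAX_FEATURES) -> None:
    """Reject empty files and files with more entries than the limit."""
    if not features:
        raise FeatureFileError("feature file is empty")
    if len(features) > max_features:
        raise FeatureFileError(
            f"{len(features)} features exceeds limit of {max_features}"
        )


def refresh(current: list[str], new_text: str,
            max_features: int = MAX_FEATURES) -> list[str]:
    """Return the new feature list if it validates, else keep the current one."""
    try:
        candidate = parse_feature_file(new_text)
        validate(candidate, max_features)
    except FeatureFileError:
        # Keep serving with the last known good configuration.
        return current
    return candidate
```

With this shape, an oversized file leaves the running configuration untouched instead of surfacing as request errors, at the cost of serving slightly stale features until a valid file arrives.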
This recent event has been Cloudflare’s worst outage in years. The company said it hasn’t had an outage that has “caused the majority of core traffic to stop flowing through [its] network” since 2019. Prince apologized for the issue on behalf of his team.
On November 18 Cloudflare experienced a service outage, triggered by an issue with a Bot Management feature, impacting multiple Cloudflare services. Here's a detailed breakdown of what happened. https://t.co/7WArlr5ghI
— Cloudflare (@Cloudflare) November 18, 2025
Article link: https://www.qimuai.cn/?post=2120