Game Theory Explains How Algorithms Can Drive Up Prices

Source: https://www.wired.com/story/game-theory-explains-how-algorithms-can-drive-up-prices/
Summary:
[Cutting-edge technology challenges antitrust law: algorithmic "tacit collusion" drives up prices, leaving regulators in a bind]
Antitrust enforcement has long centered on stopping sellers from fixing prices in backroom deals. But new research shows that as AI pricing algorithms become widespread, algorithms can arrive at tacit collusion entirely through self-directed learning, with no human intervention, leaving consumers to face unreasonably high prices.
The hidden threat of algorithmic games
A groundbreaking 2019 study first showed that two simple algorithms, after repeated play in a simulated market, spontaneously develop a "price-war deterrent": when one side cuts prices, the other retaliates with a far larger cut. Strategies formed through trial-and-error learning ultimately let both sides sustain a high-price equilibrium, with an effect comparable to a human cartel enforced by threats.
The pricing trap of "harmless" algorithms
New work from the team of University of Pennsylvania computer scientist Aaron Roth finds that even "no-swap-regret" algorithms, which in theory guarantee competitive behavior, can still produce outcomes that harm consumers. When such an algorithm meets a "nonresponsive" opponent that prices according to fixed probabilities, the system spontaneously settles into a persistently high-price state. More worryingly, that state satisfies the game-theoretic conditions for an equilibrium, so neither participant has an incentive to change strategy.
Regulators face a knowledge gap
"Without an explicit threat or agreement, it is very hard for a regulator to declare prices abnormal," notes Rice University economist Mallesh Pai. The dilemma exposes the limits of the current legal framework in dealing with algorithmic pricing. Northwestern University's Jason Hartline proposes mandating verified fair algorithms, but critics counter that this cannot cover the complexity of human-versus-algorithm play.
The field still lacks an agreed definition of "reasonable pricing," even as real-world e-commerce platforms widely deploy pricing algorithms of all kinds. As research deepens, striking a balance between technological innovation and market fairness has become a central open problem for antitrust in the digital age.
(Drawn from Quanta Magazine; original title rendered from the Chinese as "When Pricing Algorithms Learn Unseen Tricks")
English source:
The original version of this story appeared in Quanta Magazine.
Imagine a town with two widget merchants. Customers prefer cheaper widgets, so the merchants must compete to set the lowest price. Unhappy with their meager profits, they meet one night in a smoke-filled tavern to discuss a secret plan: If they raise prices together instead of competing, they can both make more money. But that kind of intentional price-fixing, called collusion, has long been illegal. The widget merchants decide not to risk it, and everyone else gets to enjoy cheap widgets.
For well over a century, US law has followed this basic template: Ban those backroom deals, and fair prices should be maintained. These days, it’s not so simple. Across broad swaths of the economy, sellers increasingly rely on computer programs called learning algorithms, which repeatedly adjust prices in response to new data about the state of the market. These are often much simpler than the “deep learning” algorithms that power modern artificial intelligence, but they can still be prone to unexpected behavior.
So how can regulators ensure that algorithms set fair prices? Their traditional approach won’t work, as it relies on finding explicit collusion. “The algorithms definitely are not having drinks with each other,” said Aaron Roth, a computer scientist at the University of Pennsylvania.
Yet a widely cited 2019 paper showed that algorithms could learn to collude tacitly, even when they weren’t programmed to do so. A team of researchers pitted two copies of a simple learning algorithm against each other in a simulated market, then let them explore different strategies for increasing their profits. Over time, each algorithm learned through trial and error to retaliate when the other cut prices—dropping its own price by some huge, disproportionate amount. The end result was high prices, backed up by mutual threat of a price war.
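The mechanics of such an experiment are easy to set up in miniature. The sketch below is a toy in the spirit of those simulations, not the 2019 paper's actual model: two tabular Q-learning agents each pick a price every round, conditioning on the rival's previous price, in a winner-take-all market. All prices, costs, and learning parameters are invented for illustration, and nothing guarantees this particular toy converges to supra-competitive prices; it only shows the trial-and-error machinery.

```python
import random

# Toy sketch (NOT the 2019 paper's model): two tabular Q-learning agents
# price against each other. The cheaper seller sells the single unit of
# demand; ties split it. All numbers are illustrative.
PRICES = [1.0, 1.5, 2.0, 2.5]  # hypothetical price grid
COST = 0.5                     # hypothetical unit cost

def profits(p1, p2):
    """Winner-take-all demand: the lower price sells one unit."""
    if p1 < p2:
        return p1 - COST, 0.0
    if p2 < p1:
        return 0.0, p2 - COST
    half = (p1 - COST) / 2
    return half, half

def greedy(row):
    return max(range(len(row)), key=lambda i: row[i])

def train(rounds=50_000, eps=0.1, lr=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    n = len(PRICES)
    # Q[s][a]: estimated value of charging price a when the rival's
    # previous price was s. Remembering the rival's last move is what
    # makes learned retaliation possible at all.
    q1 = [[0.0] * n for _ in range(n)]
    q2 = [[0.0] * n for _ in range(n)]
    s1 = s2 = 0  # arbitrary initial memory
    for _ in range(rounds):
        # Epsilon-greedy exploration: usually exploit, sometimes explore.
        a1 = rng.randrange(n) if rng.random() < eps else greedy(q1[s1])
        a2 = rng.randrange(n) if rng.random() < eps else greedy(q2[s2])
        r1, r2 = profits(PRICES[a1], PRICES[a2])
        # Standard Q-learning update; each agent's next state is the
        # price the rival just charged.
        q1[s1][a1] += lr * (r1 + gamma * max(q1[a2]) - q1[s1][a1])
        q2[s2][a2] += lr * (r2 + gamma * max(q2[a1]) - q2[s2][a2])
        s1, s2 = a2, a1
    return q1, q2
```

Whether agents like these settle on high prices depends heavily on the discount factor, exploration schedule, and run length; the 2019 experiments observed retaliatory reward-punishment pricing emerging over many such runs.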
Implicit threats like this also underpin many cases of human collusion. So if you want to guarantee fair prices, why not just require sellers to use algorithms that are inherently incapable of expressing threats?
In a recent paper, Roth and four other computer scientists showed why this may not be enough. They proved that even seemingly benign algorithms that optimize for their own profit can sometimes yield bad outcomes for buyers. “You can still get high prices in ways that kind of look reasonable from the outside,” said Natalie Collina, a graduate student working with Roth who co-authored the new study.
Researchers don’t all agree on the implications of the finding—a lot hinges on how you define “reasonable.” But it reveals how subtle the questions around algorithmic pricing can get, and how hard it may be to regulate.
“Without some notion of a threat or an agreement, it’s very hard for a regulator to come in and say, ‘These prices feel wrong,’” said Mallesh Pai, an economist at Rice University. “That’s one reason why I think this paper is important.”
No Regrets
The recent paper studies algorithmic pricing through the lens of game theory, an interdisciplinary field at the border of economics and computer science that analyzes the mathematics of strategic competitions. It’s one way to explore the failures of pricing algorithms in a controlled setting.
“What we’re trying to do is create collusion in the lab,” said Joseph Harrington, a University of Pennsylvania economist who wrote an influential review paper on regulating algorithmic collusion and was not involved in the new research. “Once we do so, we want to figure out how to destroy collusion.”
To understand the key ideas, it helps to start with the simple game of rock-paper-scissors. A learning algorithm, in this context, can be any strategy that a player uses to choose a move in each round based on data from previous rounds. Players might try out different strategies over the course of the game. But if they’re playing well, they’ll ultimately converge to a state that game theorists call equilibrium. In equilibrium, each player’s strategy is the best possible response to the other’s strategy, so neither player has an incentive to change.
In rock-paper-scissors, the ideal strategy is simple: You should play a random move each round, choosing all three possibilities equally often. Learning algorithms shine if one player takes a different approach. In that case, choosing moves based on previous rounds can help the other player win more often than if they just played randomly.
Suppose, for instance, that after many rounds you realize that your opponent, a geologist, chose rock more than 50 percent of the time. If you’d played paper every round, you would have won more often. Game theorists refer to this painful realization as regret.
Researchers have devised simple learning algorithms that are always guaranteed to leave you with zero regret. Slightly more sophisticated learning algorithms called “no-swap-regret” algorithms also guarantee that whatever your opponent did, you couldn’t have done better by swapping all instances of any move with any other move (say, by playing paper every time you actually played scissors). In 2000, game theorists proved that if you pit two no-swap-regret algorithms against each other in any game, they’ll end up in a specific kind of equilibrium—one that would be the optimal equilibrium if they only played a single round. That’s an attractive property, because single-round games are much simpler than multi-round ones. In particular, threats don’t work because players can’t follow through.
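Both notions of regret are mechanical to compute in hindsight. The sketch below, using an invented ten-round history against the rock-favoring geologist, scores rock-paper-scissors as +1/0/−1 and measures ordinary (external) regret, versus the best single fixed move, alongside the single-swap regret described above:

```python
# Regret in hindsight for rock-paper-scissors. External regret compares
# your actual payoff to the best fixed move; swap regret compares it to
# the best edit that replaces every play of one move with another move,
# with the opponent's moves held fixed. The histories are made up.
BEATS = {'R': 'S', 'P': 'R', 'S': 'P'}  # key beats value

def payoff(mine, theirs):
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def total(moves, opp):
    return sum(payoff(m, o) for m, o in zip(moves, opp))

def external_regret(my_moves, opp_moves):
    best_fixed = max(total([a] * len(opp_moves), opp_moves) for a in 'RPS')
    return best_fixed - total(my_moves, opp_moves)

def swap_regret(my_moves, opp_moves):
    actual = total(my_moves, opp_moves)
    best = actual
    for src in 'RPS':            # replace every play of src ...
        for dst in 'RPS':        # ... with dst
            swapped = [dst if m == src else m for m in my_moves]
            best = max(best, total(swapped, opp_moves))
    return best - actual

# The geologist mostly plays rock; we cycled through all three moves.
opp = ['R', 'R', 'R', 'R', 'R', 'R', 'P', 'S', 'R', 'R']
me  = ['R', 'P', 'S', 'R', 'P', 'S', 'R', 'P', 'S', 'R']
print(external_regret(me, opp))  # 10: always-paper would have scored 7 vs our -3
print(swap_regret(me, opp))      # 6: the best single swap is scissors -> paper
```

A no-swap-regret algorithm guarantees that both of these quantities, averaged per round, shrink toward zero as play continues.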
In a 2024 paper, Jason Hartline, a computer scientist at Northwestern University, and two graduate students translated the classic results from the 2000 paper to a model of a competitive market, where players can set new prices every round. In that context, the results implied that dueling no-swap-regret algorithms would always end up with competitive prices when they reached equilibrium. Collusion was impossible.
However, no-swap-regret algorithms aren’t the only pricing game strategies in the world of online marketplaces. So what happens when a no-swap-regret algorithm faces a different benign-looking opponent?
The Price Is Wrong
According to game theorists, the best strategy to play against a no-swap-regret algorithm is simple: Start with a specific probability for each possible move, and then choose one move at random every round, no matter what your opponent does. The ideal assignment of probabilities for this “nonresponsive” approach depends on the specific game you’re playing.
In the summer of 2024, Collina and her colleague Eshwar Arunachaleswaran set out to find those optimal probabilities for a two-player pricing game. They found that the best strategy assigned strikingly high probabilities to very high prices, along with lower probabilities for a wide range of lower prices. If you’re playing against a no-swap-regret algorithm, this strange strategy will maximize your profit. “To me, it was a complete surprise,” Arunachaleswaran said.
Nonresponsive strategies look superficially innocuous. They can’t convey threats, because they don’t react to their opponents’ moves at all. But they can coax learning algorithms to raise their prices, and then reap profits by occasionally undercutting their competitors.
At first, Collina and Arunachaleswaran thought that this artificial scenario wasn’t relevant to the real world. Surely the player using the no-swap-regret algorithm would switch to a different algorithm after realizing that their competitor was profiting at their expense.
But as they studied the problem further and discussed it with Roth and two other colleagues, they realized their intuition was wrong. The two players in their scenario were already in a state of equilibrium. Their profits were nearly equal, and both were as high as possible as long as neither player switched to a different algorithm. Neither player would have an incentive to change strategy, so buyers would be stuck with high prices. What’s more, the precise probabilities weren’t that important. Many different choices led to high prices when pitted against a no-swap-regret algorithm. It’s an outcome you’d expect from collusion, but without any collusive behavior in sight.
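The flavor of this outcome shows up even in a toy model. The sketch below pits a Hedge-style multiplicative-weights learner — a simpler no-external-regret cousin of the paper's no-swap-regret algorithms, not the algorithm from the paper itself — against a nonresponsive opponent whose fixed distribution is skewed toward one very high price. All prices and weights are invented, and the learner is assumed to see the rival's price each round (full information).

```python
import math
import random

# Toy illustration (not the paper's model): a Hedge-style no-regret
# learner prices against a nonresponsive opponent drawing from a fixed,
# high-price-skewed distribution. Winner-take-all demand: the cheaper
# seller sells one unit; ties split it. All numbers are made up.
PRICES = [2.0, 3.0, 4.0, 5.0, 10.0]
OPP_W  = [0.1, 0.1, 0.1, 0.1, 0.6]   # opponent: mostly the high price

def profit(mine, theirs):
    if mine < theirs:
        return mine
    if mine > theirs:
        return 0.0
    return mine / 2

def run_hedge(rounds=5000, eta=0.05, seed=0):
    rng = random.Random(seed)
    weights = [1.0] * len(PRICES)
    for _ in range(rounds):
        opp = rng.choices(PRICES, weights=OPP_W, k=1)[0]
        # Full-information multiplicative-weights update: every candidate
        # price is credited with the profit it would have earned.
        for i, p in enumerate(PRICES):
            weights[i] *= math.exp(eta * profit(p, opp) / 10.0)
    total = sum(weights)
    return [w / total for w in weights]

probs = run_hedge()
# The learner's mass concentrates on a high price (5.0 in this toy), not
# the competitive bottom: against this skewed, unresponsive rival,
# undercutting all the way down earns less than pricing high.
```

Neither player then wants to move: the learner is already best-responding, and the nonresponsive player profits from its occasional undercutting, which is the equilibrium the researchers describe.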
It Pays to Be Dumb
So, what can regulators do? Roth admits he doesn’t have an answer. It wouldn’t make sense to ban no-swap-regret algorithms: If everyone uses one, prices will fall. But a simple nonresponsive strategy might be a natural choice for a seller on an online marketplace like Amazon, even if it carries the risk of regret.
“One way to have regret is just to be kind of dumb,” Roth said. “Historically, that hasn’t been illegal.”
As Hartline sees it, the problem of algorithmic collusion has a simple solution: Ban all pricing algorithms except the no-swap-regret algorithms that game theorists have long favored. There may be practical ways to do this: In their 2024 work, Hartline and his colleagues devised a method for checking if an algorithm has a no-swap-regret property without looking at its code.
Hartline acknowledged that his preferred solution wouldn’t prevent all bad outcomes when no-swap-regret algorithms compete with humans. But he argued that scenarios like the one in Roth’s paper aren’t cases of algorithmic collusion.
“Collusion is a two-way thing,” he said. “It fundamentally must be the case that there are actions a single player can do to not collude.”
Either way, the new work still leaves many open questions about how algorithmic pricing can go wrong in the real world.
“We still don’t understand nearly as much as we want,” Pai said. “It’s an important question for our time.”
Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.