
Contextual data is important for enterprise AI projects

Published by qimuai | First-hand compilation



Source: https://aibusiness.com/generative-ai/contextual-data-is-important-for-enterprise-ai

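Summary:

Enterprises face data challenges in deploying generative AI: context and data quality are key

At the recent Gartner Data & Analytics Summit 2026, several companies described a core challenge in deploying generative AI applications: building a high-quality data foundation, and in particular giving AI systems data that carries sufficient context, has become a prerequisite for getting value from AI.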

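Data discovery and governance problems
PPG Industries, a 140-year-old multinational, ran into poor data discoverability, unclear data ownership, scattered data sources (as many as 170) and a disconnected metadata system as it advanced its AI efforts. Bob Howden, the company's senior IT manager of data and analytics, described these problems at the summit; together they stood in the way of scaling its AI initiatives and improving return on investment. To address them, PPG brought in data catalog vendor Atlan to improve its data governance.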

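"Context" becomes the new focus for AI applications
Howden stressed that the biggest challenge in deploying AI today is not a shortage of data but getting that data understood in a richer context, so that AI can give reliable answers. Beena Ammanath, lead at the Global Deloitte AI Institute, added that as AI applications extend to streaming data and unstructured data (such as images and documents), enterprises need more varied data to make their AI robust.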

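Structured and unstructured data must align
Saravanan Balasubramaniam, senior vice president of AI and data delivery at Truist Financial, a large U.S. bank holding company, said enterprises must keep structured data (such as tabular data) and unstructured data consistent and in sync. Using a bank as an example, he said that if the same customer's name is recorded differently across documents (for example "William" in one and "Will" in another), the AI system may produce contradictory answers, which can even create legal risk.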
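
The "William" versus "Will" problem can be illustrated with a minimal Python sketch. This is not how Truist or any vendor named here resolves identities; the nickname table, the function names and the sample records are all hypothetical, and a production system would rely on far richer matching (dates of birth, account numbers, curated reference data).

```python
# Hypothetical nickname table; a real system would use curated reference data.
NICKNAMES = {"will": "william", "bill": "william", "liz": "elizabeth"}


def canonical_name(full_name: str) -> str:
    """Lowercase the name and map a known nickname in the first token to its formal form."""
    parts = full_name.strip().lower().split()
    if not parts:
        return ""
    parts[0] = NICKNAMES.get(parts[0], parts[0])
    return " ".join(parts)


def same_customer(name_a: str, name_b: str) -> bool:
    """Treat two names as the same person if their canonical forms match."""
    return canonical_name(name_a) == canonical_name(name_b)


def find_inconsistent_names(records):
    """Given (customer_id, name) pairs, report ids recorded under more than one raw name."""
    seen = {}
    for customer_id, name in records:
        seen.setdefault(customer_id, set()).add(name.strip())
    return {cid: sorted(names) for cid, names in seen.items() if len(names) > 1}


# Assumed sample data: the same customer id appears under two spellings.
records = [("c1", "William Smith"), ("c1", "Will Smith"), ("c2", "Ann Lee")]
inconsistent = find_inconsistent_names(records)
```

Here `find_inconsistent_names` flags documents that disagree about a customer's name before they reach an AI system, while `same_customer` shows one simple way such variants might be reconciled.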

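Data quality determines AI effectiveness
Paul Bell, global head of data trust and integrity at international sports betting company Entain PLC, said that when AI systems receive aligned data with the correct context, their grasp of that context may be "easier, quicker, smarter" than a human's. Conversely, as Balasubramaniam noted, when data lacks context or identities are not resolved, data quality problems will likely recur within the AI system or the agent.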

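Industry consensus
Enterprises broadly recognize that deploying generative AI successfully requires a solid data environment, in which the contextual richness of data and the alignment of structured and unstructured data have become key measures of data readiness.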

Chinese translation (rendered in English):

Enterprises stress that data provided to AI systems and agents must come with the right kind of context.

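ORLANDO -- As a 140-year-old business, Pittsburgh Plate Glass Industries has accumulated a rich and varied body of data across its operations. The company, which operates in multiple countries, faced challenges with data literacy and data discoverability and set out to address them in order to scale its AI productivity goals.

"We've had poor discoverability, unclear data product ownership," said Bob Howden, PPG's senior IT manager of data and analytics, at the Gartner Data & Analytics Summit 2026 on March 9. "We have gaps in the lineage; we have multiple sources. We have 170 sources in our database."

On top of these problems, PPG's metadata system was disconnected, which made it hard to scale its day-to-day AI initiatives and did nothing to help its goal of improving the return on existing data investments in 2026.

To resolve the situation, PPG ultimately chose data cataloging vendor Atlan.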

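PPG's journey reflects a broader trend among enterprises deploying AI applications: without a solid data environment, success is hard to come by. The challenge is leading enterprises to realize that they need not only a sound data environment but also data that carries context an AI system can interpret.

"The challenge I think we have today with AI is really the context," Howden said in an interview. "We have a lot of data. It used to be data inaccuracy was a problem but understanding that data in a larger context and making and providing AI the ability to give solid answers is probably our biggest challenge."

AI systems demand more context in their data because the types of data that AI agents use have changed.

"Now we're looking at versions of AI that need either streaming data or unstructured data and more access to data," said Beena Ammanath, lead at the Global Deloitte AI Institute. Structured data is data that can easily be entered into a spreadsheet, while unstructured data comes from images, written documents and other sources. "You need unstructured data to make your AI robust. AI and generative AI need more varieties of data."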

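Providing AI systems with data that carries context is important, but it is only the starting point, Saravanan Balasubramaniam, senior vice president of AI and data delivery at Truist Financial, one of the largest bank holding companies in the United States, said in an interview.

Enterprises also need to make sure their structured and unstructured data agree. As an example, he said that if a bank such as Truist gives the same client two different answers to the same question, it can lead to legal action.

"If I have to pull in multiple documents about the customer and get an answer, then the document should be clear, clean," Balasubramaniam said. "The identity resolution has to happen on the document level. The name cannot be William in one place and Will in another place."

He added that if structured and unstructured data are not aligned, and context or identity resolution is missing, data quality problems will recur within the AI system or the agent.

Giving an AI agent the right data, the appropriate context, and synchronization between structured and unstructured data could allow the system to understand faster than a human can.

"Their perception of the context might be easier, quicker, smarter," said Paul Bell, global head of data trust and integrity at international sports betting company Entain PLC, in an interview.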

English source:

Enterprises highlight the need for the right type of context when providing data to AI systems and agents.
ORLANDO -- As a 140-year-old business, Pittsburgh Plate Glass Industries is well supplied with a diverse array of data across its operations. With a presence in multiple countries, the company faced challenges with data literacy and data discoverability and sought to address them to scale its AI productivity goal.
"We've had poor discoverability, unclear data product ownership," said PPG's Bob Howden, senior IT manager of data and analytics, during a presentation at the Gartner Data & Analytics Summit 2026 on March 9. "We have gaps in the lineage; we have multiple sources. We have 170 sources in our database."
Along with these problems, PPG also had a disconnected metadata system that would have made it challenging to scale, to achieve its goal of accelerating its day-to-day AI initiatives, or to boost ROI on existing data in 2026.
To address the situation, PPG turned to data cataloging vendor Atlan.
PPG's journey highlights a trend in enterprises seeking to deploy AI applications: without a strong data environment, success is hard. This challenge is leading enterprises to realize that not only is a good data environment needed, but that data must also include context that the AI system can feed on.
"The challenge I think we have today with AI is really the context," Howden said in an interview. "We have a lot of data. It used to be data inaccuracy was a problem but understanding that data in a larger context and making and providing AI the ability to give solid answers is probably our biggest challenge."
The need for better context in the data on which AI systems depend arises because the types of data AI agents use have changed.
"Now we're looking at versions of AI that need either streaming data or unstructured data and more access to data," said Beena Ammanath, lead at the Global Deloitte AI Institute. Structured data includes data that can be easily entered into spreadsheets. Unstructured data is data from images, written documents, and other sources. "You need unstructured data to make your AI robust. AI and generative AI need more varieties of data."
While providing context to the data feeding the AI system is important, that is only the beginning, said Saravanan Balasubramaniam, senior vice president of AI and data delivery at Truist Financial, one of the country's biggest bank holding companies, during an interview.
Enterprises should also ensure that structured and unstructured data align, he said. He noted, for example, that if a bank, such as Truist, gives two different answers to the same client for the same question, it can lead to legal action.
"If I have to pull in multiple documents about the customer and get an answer, then the document should be clear, clean," Balasubramaniam said. "The identity resolution has to happen on the document level. The name cannot be William in one place and Will in another place."
He added that without alignment between structured and unstructured data and without context or identity resolution, data quality problems will likely recur within either the AI system or the agent.
Providing an AI agent with the right data, with the correct context and synchronization within structured and unstructured data, could lead the system to understand faster than a human can.
"Their perception of the context might be easier, quicker, smarter," said Paul Bell, global head of data trust and integrity at Entain PLC, an international sports betting and gambling company, in an interview.
