
How Google's AI can help transform health professions education

Published by qimuai · First-hand translation



Source: https://research.google/blog/how-googles-ai-can-help-transform-health-professions-education/

Summary:

Google AI and the transformation of medical education: personalized learning tools may help ease the global health workforce shortage

On August 27, 2025, Mike Schaekermann, Research Lead, and Paul Jhun, Medical Education Lead, at Google Research presented new progress on the company's AI-based medical education tools. With the World Health Organization projecting a shortfall of more than 11 million health workers worldwide by 2030, Google is exploring how AI can reshape the training of health professionals.

The latest results show that LearnLM, Google's family of models fine-tuned for learning, offers clear advantages in medical education. In a case study on teaching clinical reasoning, the team used interdisciplinary co-design workshops and user studies to build an AI tutoring prototype that emulates preceptor behaviors: it provides personalized feedback, manages cognitive load, and fosters critical thinking. Physician educators judged its teaching as behaving "more like a very good human tutor."

In a blinded quantitative study, medical students and physician educators compared LearnLM against a base model across 290 conversations. Physician educators preferred LearnLM on all five evaluation criteria, most notably for demonstrating pedagogy (+6.1%) and behaving like a very good human tutor (+6.8%); medical students rated LearnLM highest for enjoyability of interaction (+9.9%).

Notably, no real patient data was used in model development or evaluation, in keeping with data privacy principles. The Google team emphasizes that these AI tools are meant to assist, not replace, human educators, and that future work will focus on ensuring accuracy, mitigating bias, and supporting human-AI collaboration.

These capabilities are now available with Gemini 2.5 Pro. Google is partnering with medical education institutions worldwide to explore new approaches to cultivating health professionals in the AI era, aiming to accelerate clinical competency and ultimately improve healthcare outcomes globally.


English source:

How Google’s AI can help transform health professions education
August 27, 2025
Mike Schaekermann, Research Lead, and Paul Jhun, Medical Education Lead, Google Research
We explore the utility of Google’s AI models as helpful tools in medical learning environments. By employing a learner-centered and evaluation-driven approach, we seek to reimagine the future of education for health professionals.
The global health workforce is facing a critical shortage, with projections indicating a deficit exceeding 11 million healthcare workers by 2030. At Google, we are researching how AI can transform education for health professions to help close this gap with studies exploring how Google’s AI models can serve as effective personalized learning tools in medical learning environments.
Today we present two such studies. First, in “Generative AI for medical education: Insights from a case study with medical students and an AI tutor for clinical reasoning”, published at CHI 2025, we took a qualitative approach to understanding and designing for medical learners through interdisciplinary co-design workshops, rapid prototyping, and user studies. Next, in our latest update of “LearnLM: Improving Gemini for Learning”, we quantitatively assessed LearnLM — our Gemini-based family of models fine-tuned for learning — on medical education scenarios through preference ratings from both medical students and physician educators. Both studies revealed a strong interest in AI tools that can adapt to learners and incorporate preceptor-like behaviors, such as providing constructive feedback and promoting critical thinking. Physician educators rated LearnLM as demonstrating better pedagogy and behaving “more like a very good human tutor” compared to base models. These novel capabilities are now available with Gemini 2.5 Pro.
Understanding the medical learner
Employing a learner-centered approach has been critical in guiding our development of responsible AI tools that scale individualized learner pathways and augment competency-based approaches. Central to this approach, we first conducted formative user experience (UX) research to understand medical learner needs. Through a participatory design process, we began with a co-design workshop that convened an interdisciplinary panel of medical students, clinicians, medical educators, UX designers, and AI researchers to define opportunities for incorporating AI in this space. Insights from this session guided the development of an AI tutor prototype, explicitly designed to guide learners through clinical reasoning anchored on a synthetic clinical vignette.
We then evaluated the AI tutor prototype’s helpfulness in a qualitative user study with eight participants (4 medical students and 4 residents). The study aimed to elicit participant learning needs and challenges as well as their attitudes toward AI assistance in education. Each participant engaged in a 1-hour session with a UX researcher involving semi-structured interviews and interactive sessions with the prototype. All sessions were remote and conducted through video conferencing software. Participants accessed the prototype through a web link and shared their screen while interacting with the prototype.
Our thematic analysis of medical learner interviews revealed various challenges to acquiring clinical reasoning skills and the potential for generative AI in addressing these challenges. For example, medical learners expressed a significant interest in AI tools capable of adapting to unique individual learning styles and knowledge gaps. Participants also highlighted the importance of preceptor-like behaviors, such as managing cognitive load, providing constructive feedback, and encouraging questions and reflection.
Meeting medical learners where they are
Building on these insights, we conducted a blinded feasibility study with medical students and physician educators to quantitatively assess LearnLM's pedagogical qualities in medical education settings compared with Gemini 1.5 Pro as the base model. In collaboration with experts, we designed a set of 50 synthetic evaluation scenarios across a range of medical education subjects, from pre-clinical topics, such as platelet activation, to clinical topics, like neonatal jaundice, reflecting the core competencies and standards in medical education.
We recruited medical students from both preclinical and clinical phases of training to engage in interactive conversations with both LearnLM and the base model, in a randomized and blinded manner. Students used the evaluation scenarios to role-play as different types of learners across a range of learning goals and personas, generating 290 conversations for analysis. Each scenario provided learners with context to standardize the interaction as much as possible between both models, including a learning goal, grounding materials, a learner persona, a conversation plan, and the initial query used by the learner to start the conversation.
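The randomized, blinded pairing described above could be sketched as follows. This is a hypothetical illustration, not the study's actual tooling; the neutral labels `model_a`/`model_b`, the `assign_blinded` helper, and the fixed seed are assumptions introduced here.

```python
import random

# Hypothetical sketch of randomized, blinded model assignment: for each
# evaluation scenario, the two systems are shuffled behind neutral labels
# so neither students nor educators know which model they are rating.

def assign_blinded(scenarios, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility (assumption)
    assignments = []
    for scenario in scenarios:
        models = ["LearnLM", "base model"]
        rng.shuffle(models)  # randomize which system appears first
        assignments.append({
            "scenario": scenario,
            "model_a": models[0],  # shown to raters only as "Model A"
            "model_b": models[1],  # shown to raters only as "Model B"
        })
    return assignments

# One blinded conversation pair per synthetic scenario.
pairs = assign_blinded([f"scenario-{i:02d}" for i in range(50)])
print(len(pairs))  # → 50
```

Keeping the model-to-label mapping hidden until after rating is what makes the side-by-side comparisons blinded.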
Students then rated model behavior by comparing the two interactions for each scenario side-by-side across four criteria: (1) overall experience, (2) meeting learning needs, (3) enjoyability, and (4) understandability. Physician educators rated model behavior by reviewing conversation transcripts and scenario specifications. For each scenario, educators reviewed the transcripts from both learner-model conversations side-by-side, and provided preference ratings across five criteria: (1) demonstrating pedagogy, (2) behaving like a very good human tutor, (3) instruction following, (4) adapting to the learner, and (5) supporting the learning goal. We collected a median of three independent educator reviews per conversation pair. All preference ratings were done in a randomized and blinded manner using 7-point scales, which reflected a spectrum of preference strengths including the option to express no preference between the two models.
Physician educators consistently preferred LearnLM across all five of the comparison criteria. They judged LearnLM particularly positively in terms of demonstrating better pedagogy (on average, +6.1% on our rating scale) and for behaving “more like a very good human tutor” (+6.8%). When we simply look at whether educators expressed any preference one way or the other — regardless of its magnitude — LearnLM emerged as their choice in a clear majority of assessments across every criterion. Medical students indicated the strongest positive preference in terms of LearnLM being more enjoyable to interact with (on average, +9.9% on our rating scale). Student preferences were less pronounced for the other three comparison criteria, while directionally also favoring LearnLM.
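As a rough illustration of how such pairwise preference percentages might be computed (the report's exact analysis is not reproduced here, so the scale encoding below is an assumption): map each 7-point rating to a signed score from -3 (strongly prefer the base model) through 0 (no preference) to +3 (strongly prefer LearnLM), then express the mean as a percentage of the scale range.

```python
from statistics import mean

# Hypothetical sketch: aggregate 7-point pairwise preference ratings into
# a mean preference expressed as a percentage of the scale range.
# Encoding assumption: -3 = strongly prefer base model, 0 = no preference,
# +3 = strongly prefer LearnLM (not necessarily the report's encoding).

def preference_percent(ratings):
    """Mean rating as a percentage of the full scale range (6 points)."""
    scale_range = 6  # distance between the two extremes, -3 and +3
    return 100 * mean(ratings) / scale_range

# Example: mild preferences for LearnLM mixed with ties and one dissent.
ratings = [1, 0, 1, 2, 0, -1, 1, 0, 1, 0]
print(round(preference_percent(ratings), 1))  # → 8.3
```

Under this encoding, a small positive percentage corresponds to a consistent but mild preference, which is how figures like +6.1% can coexist with a clear majority of raters favoring one model.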
This study points to LearnLM’s potential to transform education and learning paradigms and scale a competent health workforce. None of the data used for model development or evaluation in this study included real patient data. See the tech report for modeling details.
Reimagining health professions education
We recently shared this research at the MedEd on the Edge conference at the Nobel Forum and facilitated a hands-on workshop with the international medical education community to explore these possibilities. We recognize the dual role of educators as both pedagogical experts and explorers in this rapidly evolving knowledge domain. Realizing a responsible future requires careful attention to challenges such as ensuring accuracy, mitigating bias, and maintaining the crucial role of human interaction and oversight. It underscores the need to re-evaluate competencies and entrustable professional activities, and for curricula that cultivate adaptive expertise, focusing not only on AI applications in education, but also on teaching foundational understanding of AI itself. At this convergence, generative AI can serve as a catalyst for the desired productive struggle to foster deeper understanding and critical thinking. As the journey has only just begun, below are a few examples of how Google’s AI can potentially transform health professions education.
Conclusion
This research continues to lay the groundwork toward the effective design and implementation of personalized learning experiences, offering an opportunity to accelerate clinical competency and ultimately improve health outcomes by reimagining health professions education. We are committed to partnering with the health professions education community to thoughtfully and responsibly prepare future healthcare professionals to thrive in an AI-augmented healthcare landscape.
Acknowledgements
The research described here is a joint effort across Google Research, Google for Health, Google DeepMind, and partnering teams. The following researchers contributed to this work: Kevin McKee, Dan Gillick, Irina Jurenka, Markus Kunesch, Kaiz Alarakyia, Miriam Schneider, Jenn Sturgeon, Maggie Shiels, Amy Wang, Roma Ruparel, Anna Iurchenko, Mahvish Nagda, Julie Anne Séguin, Divya Pandya, Patricia Strachan, Renee Wong, Renee Schneider, Viknesh Sounderajah, Pete Clardy, Garth Graham, Megan Jones Bell, Michael Howell, Jonathan Krause, Christopher Semturs, Dale Webster, Avinatan Hassidim, Joëlle Barral, Ronit Levavi Morad and Yossi Matias. Special thanks to participants who contributed to these studies.
