
动画师与AI研究者如何打造《亲爱的楼上邻居》

qimuai 发布 · 一手编译



内容来源:https://blog.google/innovation-and-ai/models-and-research/google-deepmind/dear-upstairs-neighbors/

内容总结:

动画短片《亲爱的楼上邻居》亮相圣丹斯电影节,探索AI与艺术家协作新范式

当地时间1月20日,由动画艺术家与谷歌DeepMind研究人员合作完成的实验性动画短片《亲爱的楼上邻居》,在2025年圣丹斯电影节故事论坛进行首映。该论坛专注于展示支持视觉叙事的前沿工具与技术。

影片讲述了一位名为Ada的年轻女性,因楼上邻居持续制造噪音而难以入眠,在现实与幻想的交织中展开一场寻求宁静的“奇幻斗争”。故事灵感源于该片导演、皮克斯前动画师Connie He的个人经历。

艺术驱动:确立独特视觉风格

创作团队始终坚持艺术主导。导演Connie He率先完成故事板,并由获奖美术指导Yingzong Xin负责概念艺术与角色设计。Ada的角色造型比例夸张、线条锐利,极具辨识度。其卧室以冷色调渲染宁静感;而随着剧情推进,她的幻觉场景则以粗粝笔触与霓虹色调呈现,画面风格随情绪波动向抽象表现主义演变。这些高度风格化的视觉表达是叙事的核心,也是传统动画技术难以实现的挑战。

技术破局:定制化AI工具赋能创作

为将独特的艺术构想转化为现实,研究团队开发了定制化AI工具链。通过使用团队自创的艺术素材对Veo和Imagen模型进行“微调”,AI得以深入理解并学习Ada的角色特征、两点透视等深层艺术概念,而不仅仅是色彩与纹理。例如,在Ada转头时,AI能智能调整发型轮廓,始终维持其标志性的二维剪影特征。

针对动画表演中细微的动作节奏、表情时机及精确构图,仅靠文本提示无法实现精准控制。团队受动画师工作方式启发,开发了"视频到视频"的全新工作流程:动画师可先用熟悉的工具(如Maya、TV Paint)制作粗稿动画,AI模型再将其转化为最终风格化视频,在精确控制与创意即兴之间取得平衡。

迭代优化:人机协同打磨细节

影片的每个镜头都经过多轮迭代,并非"一键生成"。团队建立了类似传统电影制作的"日审"反馈机制,并开发了局部细化工具,允许对视频特定区域进行可控调整。例如,为完善Ada头发的轮廓,研究人员可通过添加粗略遮罩提示,让AI在指定区域生成贴合整体画面的发簇。最终,所有镜头均利用Veo的超分辨率(upscaling)功能提升至4K分辨率,在增添丰富细节的同时,完整保留了艺术风格的每一处微妙之处。

共创价值:跨学科碰撞激发新可能

该项目证明了生成式AI与专业动画流程深度融合的潜力。艺术家通过直接参与实验研究获得了新的创作能力,并以其专业视角反哺技术发展;研究人员则通过扮演“技术艺术家”的角色,快速原型化解决方案,突破艺术与技术壁垒。团队表示,此次合作积累了宝贵经验,未来将继续致力于与专业艺术家和电影制作人共同构建服务于创作的生成式AI工具。

《亲爱的楼上邻居》作为一次前沿探索,为人机协同创作提供了具体范例,展现了在保持艺术控制力的前提下,AI如何帮助实现前所未有的视觉风格与叙事自由度。

中文翻译:

动画师与AI研究者如何创作《亲爱的楼上邻居》

今天,我们的动画短片《亲爱的楼上邻居》在圣丹斯电影节预映。这部作品将在圣丹斯学会故事论坛展映——该论坛专注于以艺术家为本、支持视觉叙事的工具与技术。

《亲爱的楼上邻居》讲述了年轻女子艾达的故事:她渴望安睡整夜,却被楼上邻居的喧闹持续侵扰。当她试图想象楼上究竟在发生什么时,现实逐渐滑向幻想,一场关于宁静与理智的史诗级战役就此展开。

这部影片是动画界资深创作者与谷歌DeepMind研究团队的共同成果。导演康妮·何(皮克斯前成员)带领的动画团队,与AI研究者们怀着共同目标携手合作:探索生成式工具如何融入艺术家的创作流程。

导演康妮·何基于亲身经历的邻里噪音困扰构思了这个故事。在她的故事板中,艾达的幻觉随着夜深而愈发荒诞离奇。

美术指导辛应宗为女主角艾达创造了独特造型:夸张的比例与棱角分明的形体语言,塑造出古怪又鲜明的形象。

艾达的面部极具表现力。角色设定稿由辛应宗绘制。

艾达的卧室采用冷色调渲染,传递出宁静、舒适与庇护感。场景设计由辛应宗完成。

艾达的幻觉场景以粗粝风格与霓虹色调构成,与她卧室的"现实世界"形成鲜明对比。概念设计由辛应宗创作。

写意画风随着剧情时刻变幻,通过色彩与质感的流动呈现艾达起伏的情绪。概念设计由辛应宗创作。

在最高潮的段落,抽象表现主义风格席卷整个场景。概念设计由辛应宗创作。

团队始终秉持着这样的创作理念:既要释放生成式AI的创作潜能,又要让动画艺术家保持艺术主导权,不因生成过程固有的不可预测性而妥协。为确立影片视觉基调,康妮先完成故事板创作,再邀请屡获殊荣的美术指导辛应宗操刀概念艺术与角色设计。团队承诺在镜头制作全过程中忠实贯彻这一艺术构想。

表现主义视觉风格是本片叙事核心——而这在传统动画制作中极难实现。我们原以为AI能填补技术鸿沟,但很快发现这些风格过于独特,设计选择又极为精确,研究者必须开发全新功能,才能实现定制化控制,让影片真正鲜活起来。

定制视觉风格

我们面临的第一个挑战,是让每个镜头既符合艾达的角色设定,又契合不同场景的写意画风。为实现高质量的画面一致性,研究者开发了定制化工具,让艺术家能够基于他们的艺术作品微调Veo和Imagen模型——仅需少量示例图像,就能让AI理解全新的视觉概念。
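原文没有透露微调的任何实现细节。下面用一段极简的 Python 玩具示例说明"从少量样例中学习一种视觉风格"的基本思路——这里用最小二乘拟合一个 3×3 色彩矩阵来代替真实的扩散模型微调;函数名均为示意性假设,与 Veo/Imagen 的实际机制无关:

```python
import numpy as np

# 玩具示例:从少量"原画 -> 风格化成图"图像对中,
# 估计一个线性风格变换(3x3 色彩矩阵)。
# 仅用于说明"少样本学习风格"的思路,并非 Veo/Imagen 的真实微调方式。

def fit_style_transform(sources, targets):
    """sources/targets: 若干 (H, W, 3) 浮点图像对;返回 3x3 矩阵 M,使 X @ M ≈ Y。"""
    X = np.concatenate([s.reshape(-1, 3) for s in sources])
    Y = np.concatenate([t.reshape(-1, 3) for t in targets])
    M, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
    return M

def apply_style(image, M):
    """把学到的"风格"应用到新图像上,输出裁剪到 [0, 1]。"""
    out = image.reshape(-1, 3) @ M
    return np.clip(out, 0.0, 1.0).reshape(image.shape)
```

真实系统学到的远不止逐像素色彩映射(正文提到连两点透视这类深层概念也被学会),此处仅取其"样例对 → 可复用变换"的骨架。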

经微调后Imagen生成的艾达形象。定制化模型帮助整个团队深入探索角色特质。

左图:辛应宗绘制的概念画。右图:经微调后Veo生成的风格化动画。Veo从概念艺术中学到的内容令我们惊喜:不仅是色彩与纹理等表层细节,更包括两点透视等深层艺术概念。

上图:艾达的角色设计严格遵循二维规则——标志性的蓬松发型与凌乱发髻必须始终构成她的剪影轮廓,绝不遮挡面部。下图左:3D建模的发型无法在所有角度都保持正确,因为立体形态会破坏二维规则。下图右:经艾达图像微调后的Veo完美化解了这一矛盾,随着头部转动自然调整形状,始终保持剪影准确性。

以画代言,而非以文生画

第二个挑战在于精确控制每个镜头的内容与动态。我们深知,仅凭文字提示永远无法精准掌控:艾达困倦手指的敲击节奏、面部表情的喜剧性时机,或是镜头转场的精确构图。我们需要向AI模型传达这种精微层面的创作意图。研究者从动画师的视觉沟通方式中汲取灵感——他们常通过绘画、草图或表演来呈现构想。我们开发了创新的视频到视频工作流,让动画师能用擅长工具创作粗略动画,直观传达创作意图。随后AI模型将这些动画转化为完全风格化的成片,在精确控制与创意即兴之间取得可调节的平衡。
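这一"视频到视频"流程可以概括为:以粗稿帧为条件生成,再在"忠实粗稿"与"模型即兴"之间调节。下面是一个假设性的 Python 骨架,`stylize_frame` 只是占位函数(真实流程由微调后的生成模型完成),`strength` 参数对应文中"精确控制与创意即兴之间可调节的平衡":

```python
import numpy as np

def stylize_frame(rough_frame, rng):
    # 占位:真实流程中这里是"以粗稿帧为条件的生成模型推理"
    noise = rng.standard_normal(rough_frame.shape) * 0.05
    return np.clip(rough_frame + noise, 0.0, 1.0)

def video_to_video(rough_frames, strength=0.5, seed=0):
    """strength=0 完全保留粗稿动作;strength=1 完全交给"模型"即兴。"""
    rng = np.random.default_rng(seed)
    return [(1.0 - strength) * f + strength * stylize_frame(f, rng)
            for f in rough_frames]
```

要点在于输入是逐帧的粗稿动画而非文字,动作节奏与构图由动画师的粗稿直接决定。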

使用经微调的Veo进行文生视频,虽能生成艾达的形象,但动作却随机失控且时常怪异。纯文字提示无法满足叙事性动画电影所需的精微控制。

为创造足以承载故事的细腻表演,动画师采用了传统方法。动画师本·奈特在Maya中为此场景制作了粗略3D动画,研究者安迪·科宁则运用微调后的Veo模型将其转化为最终效果。

视频到视频工作流让每位艺术家都能在舒适区使用最擅长的动画工具。动画师马蒂亚斯·布雷特霍尔茨用TV Paint创作了这段粗略2D动画,研究者福雷斯特·科尔则通过定制化ComfyUI工作流,运用微调版Imagen逐帧转化为最终画面。

动画师史蒂文·赵在Maya中完成艾达的动画并创作动态低多边形特效,研究者艾伦·蒋与导演康妮·何使用微调后的Veo和Imagen模型将这些元素转化为表现主义风格。不断变化的颜料质感以断奏式节奏强化了动作张力。

精益求精的迭代

即便通过微调与视频到视频工作流实现了精细控制,我们的每个最终镜头也绝非"一键生成"。正如所有电影制作,我们通过"日审片会"对每个镜头进行批判性复盘,经过多轮反馈打磨每个细节。为避免每次迭代都需从头生成,我们开发了局部优化工具,支持对视频特定区域进行可调控的精细化编辑。

为呈现艾达幻境中嚎叫的狗,我们从辛应宗的概念画出发,运用Veo图生视频功能赋予其生命。Veo的初始版本(未经微调)过于写实,因此我们采用微调版Veo使镜头更贴近目标视觉风格。视频到视频工作流让我们能在Veo与Premiere等传统工具间自由切换。

结合微调版Veo与视频到视频工作流,我们得以对狗的造型及周围写意效果进行迭代设计,以前所未有的自由度探索风格变体。

动画总监卡西迪·柯蒂斯在Maya中为此镜头制作粗略3D动画,研究者艾丽卡·吕通过微调Veo模型转化为最终效果。为优化艾达发型的剪影,吕添加了粗略遮罩标示需要更多发量的区域,并运用Veo即兴生成一缕完美融入画面的发丝。

最后,为适配大银幕放映,我们使用Veo的超分辨率功能将最终镜头提升至4K画质。在艺术家们的专业反馈指导下,研究者精心调整模型参数,在增添丰富细节的同时完整保留艺术风格的每一处精妙笔触。Veo 4K超分辨率模型已在Flow平台上线,并将于本月登陆Google AI Studio和Vertex AI,以满足电影制作人的实际需求。
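Veo 的 4K 超分辨率是学习型模型,会在放大时补充细节;此处仅用最近邻上采样示意分辨率的倍数关系(1080p 的宽高各乘 2 即得 3840×2160),函数为示意性写法,与 Veo 的实际算法无关:

```python
import numpy as np

def upscale_nearest(frame, factor=2):
    """(H, W, 3) -> (H*factor, W*factor, 3):每个像素复制成 factor x factor 的块。"""
    return np.kron(frame, np.ones((factor, factor, 1)))
```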

每个镜头都带来独特挑战。在整个制作过程中,这支跨学科团队开发出多种工作流,将手工动画的精确控制与生成式AI的风格弹性、扩展能力相结合。我们的AI模型不仅制造了令人捧腹的穿帮镜头,更时常以出乎意料的绝妙创意带来惊喜。通过日复一日凝聚匠心打磨每个镜头,我们收获了宝贵经验:艺术家们通过直接接触实验性研究获得全新创作能量,并运用专业技艺与艺术眼光助力技术发展;研究者们则化身技术艺术家,通过快速原型设计突破艺术与技术壁垒。我们满怀热忱,将继续与专业艺术家、电影人携手共建生成式AI的未来。

英文来源:

How animators and AI researchers made ‘Dear Upstairs Neighbors’
Today, our animated short film, “Dear Upstairs Neighbors,” previews at the Sundance Film Festival. The film will be showcased at the Sundance Institute’s Story Forum, a space focused on artist-first tools and technologies supporting visual storytelling.
“Dear Upstairs Neighbors” is the story of a young woman, Ada, who is desperate for a good night’s sleep but kept awake by her exceedingly noisy neighbors. As she struggles to imagine what could be causing the cacophony upstairs, reality drifts into fantasy, and an epic battle for peace and sanity ensues.
The film is a collaboration between animation veterans, including director and Pixar alum Connie He, and researchers at Google DeepMind, united by a shared goal of exploring how generative tools might fit in with artists' creative processes.
Director Connie He developed the story based on her personal experience with noisy neighbors. In her storyboards she envisioned a series of hallucinations that get more unhinged and ridiculous as the night progresses.
For our main character, Ada, production designer Yingzong Xin created a design that’s quirky and unique, with pushed proportions and an angular shape language.
Ada’s face is extremely expressive. Character model sheet by Yingzong Xin.
Ada’s bedroom is rendered in cool colors, conveying a sense of calm, comfort and sanctuary. Set design by Yingzong Xin.
Ada’s hallucinations have a rough style and neon palette that distinguishes them from the “real world” of her bedroom. Concept art by Yingzong Xin.
The painterly style changes from moment to moment, expressing Ada’s changing emotions through color and texture. Concept art by Yingzong Xin.
In the most intense moments, the abstract expressionist style grows to dominate the entire scene. Concept art by Yingzong Xin.
From the start, the team aspired to empower animation artists to benefit from the creative potential of generative AI without sacrificing artistic control to its inherent unpredictability. To define her vision for this film, Connie developed the storyboards, and enlisted award-winning production designer Yingzong Xin to create concept art and character designs. We committed to staying faithful to this artistic vision throughout shot production.
The expressionistic visual styles are central to the storytelling — and extremely difficult to achieve in traditional animation. We expected that AI could help fill the gap, but soon found that these styles were so unique, and our design choices so specific, that our researchers would have to develop new capabilities to provide the customization and control that we needed to bring the film to life.
Tune for new visual styles
Our first challenge was to produce shots consistent with Ada’s character design and the painterly styles that defined each scene. To achieve high quality and consistency, our researchers built tools that allowed our artists to fine-tune custom Veo and Imagen models on their artwork, teaching the models new visual concepts from just a few example images.
Images of Ada generated by Imagen after fine-tuning. The fine-tuned model helped the whole team explore Ada as a character.
Left: paintings by Yingzong Xin. Right: stylized animated video generated by Veo after fine-tuning. What Veo learned from our concept art surprised us: not just superficial details like color and texture, but deep artistic concepts like two-point perspective.
Top: Ada’s character design follows strictly two-dimensional rules: her characteristic hair poof and messy bun must always be part of her silhouette, never obscuring her face. Bottom left: a 3D sculpture of Ada’s hair can’t possibly look correct from every angle, because the solid form violates those 2D rules. Bottom right: Veo, after fine-tuning on images of Ada, seamlessly resolves the conflict, smoothly adapting the shapes to keep the silhouette correct as the head turns.
Show, don’t type
Another challenge was precisely controlling the content and motion of each shot. We knew that text prompting alone would never let us control the rhythm of Ada’s sleepy fingers typing, the comedic timing of her facial expressions, or the exact framing of a camera reveal. We needed a way to communicate that level of nuance and specificity to our AI models. Our researchers drew inspiration from how our animators communicate visually, by drawing, painting or acting out scenes. We developed novel video-to-video workflows, which allowed our animators to convey their intentions visually by creating rough animation in their tool of choice. Our models then transformed that animation into fully stylized videos that follow the input motion, with an adjustable balance between tight control and creative improvisation.
Using text-to-video with the fine-tuned Veo model produced scenes that looked like Ada, but their movement was random, uncontrolled, and often bizarre. Text alone can’t convey the nuance and specificity needed for narrative animated filmmaking.
To create a nuanced performance strong enough to carry the story, our animators used traditional methods. Animator Ben Knight created rough 3D animation for this scene in Maya, and researcher Andy Coenen used fine-tuned Veo models to transform it into the final look.
The video-to-video approach allowed each artist to work in their comfort zone, using their favorite animation tools. Animator Mattias Breitholtz created this rough 2D animation using TV Paint, and researcher Forrester Cole transformed it into the final look frame by frame, using fine-tuned versions of Imagen in a custom ComfyUI workflow.
Animator Steven Chao animated Ada and created dynamic low-poly effects in Maya, and researcher Ellen Jiang and director Connie He used fine-tuned Veo and Imagen models to transform these elements into the expressionist look. The staccato rhythm of the changing paint texture adds to the intensity of the action.
Iterate toward perfection
Even with the control provided by fine-tuning and video-to-video workflows, none of our final shots were created in a single “one-click” generation. Just as in any film production, we critiqued each shot in our “dailies” reviews, going through several rounds of feedback to get every detail right. To iterate on a shot without re-generating from scratch every time, we built tools for localized refinement, allowing us to edit specific regions of a video with an adjustable level of control.
To create Ada’s hallucination of a howling dog, we started with a concept painting by Yingzong Xin, and used Veo image-to-video to bring it to life. Veo’s first pass (without fine-tuning) was too photorealistic for our film; so we used the fine-tuned version of Veo to bring the shot closer to our intended visual style. The video-to-video workflow allowed us to switch freely between Veo and traditional tools like Premiere.
Using fine-tuned Veo with video-to-video workflows allowed us to iterate on the design of both the dog and the painterly effects around it, exploring stylistic variations with unprecedented freedom and control.
Supervising animator Cassidy Curtis created rough 3D animation for this shot in Maya, and researcher Erika Lu fine-tuned a Veo model to transform it into the final look. To improve the silhouette of Ada’s hair, Lu added a rough mask to indicate the region where more hair was needed, and used Veo to improvise an extra tuft of hair there that fits seamlessly into the rest of the shot.
Finally, to prepare our film for the big screen, we used Veo's upscaling capability to bring our final shots to 4K resolution. Guided by our artists' critique, our researchers carefully tuned the model's behavior to add rich detail that preserved every nuance of the artistic style. The Veo 4K upscaling model is available in Flow and coming to Google AI Studio and Vertex AI later this month to meet the real-world needs of filmmakers.
Each shot presented unique challenges, and over the course of production, our multi-disciplinary team developed several workflows combining the precise control of hand-crafted animation with the stylistic flexibility and scalability of generative AI. Not only did our AI models produce hilarious bloopers, they often surprised us with unexpectedly beautiful and creative solutions. We learned valuable lessons from coming together every day to produce each shot with fine-grained artistic intention and care. Our artists found new creative powers through direct access to experimental research, and used their craft and perspective to help shape its development. Our researchers gained hands-on experience as technical artists, rapidly prototyping solutions to break through artistic and technological barriers. We’re excited to continue our mission to build generative AI with and for professional artists and filmmakers.
