A Survey of Drivable Digital Human Reconstruction and Interaction Technologies

高珮 孙浩然 黄继峰

Cite this article: 高珮, 孙浩然, 黄继峰. 可驱动数字人重建与交互技术综述[J]. 中国传媒科技, 2025, (6): 64-67. doi: 10.19483/j.cnki.11-4653/n.2025.06.012

doi: 10.19483/j.cnki.11-4653/n.2025.06.012
Article information
    About the authors:

    高珮 (b. 1988), female, from Xianning, Hubei, works at 北京信通传媒有限责任公司; her research focuses on new media and the development of science and technology journals. 孙浩然 (b. 2004), male, from Leshan, Sichuan, is an undergraduate in the Electronic Information Engineering top-talent innovation class at Shenzhen University; his research focuses on 3D reconstruction and digital human reconstruction. 黄继峰 (b. 2000), male, from Pingxiang, Jiangxi, is a master's student at Shenzhen University's 数创意研究中心 (Digital Creativity Research Center); his research focuses on the metaverse and digital human reconstruction.

  • Abstract: [Objective] With the development of information technology and the rise of the metaverse, demand for highly realistic, interactive 3D digital humans is growing rapidly. This paper surveys the key technologies, representative results, and open problems in drivable digital human reconstruction and interaction, providing a reference for related research. [Methods] The paper first outlines the background and value of digital human technology and its shift toward deep-learning-driven approaches. It then focuses on mainstream 3D digital human reconstruction techniques, covering traditional methods as well as methods based on Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS). Next, it reviews driving techniques under different representations such as meshes, NeRF, and 3DGS, with particular attention to progress in speech-driven animation. Finally, it discusses the key components of, and difficulties in, achieving real-time, customizable interaction, summarizes the technical challenges, and looks ahead to future work. [Results] The analysis shows that applying new techniques such as NeRF and 3DGS has markedly improved the realism of digital human reconstruction and the naturalness of driving. However, many problems remain in monocular reconstruction accuracy, capture of dynamic details, emotional richness, flexibility of personalized driving, real-time interaction latency, effective fusion of multimodal data, and construction of high-quality datasets. [Conclusions] Drivable digital human reconstruction and interaction technology is evolving toward higher fidelity, stronger interactivity, richer expressiveness, and lower latency. Future research needs to keep tackling these technical problems, strengthen cross-modal understanding and generation, improve personalization and emotional expressiveness, and build more complete datasets and evaluation systems, so as to promote broad application in virtual reality, new media, and related fields.
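    For context on the two reconstruction paradigms named in the abstract, the following is a minimal sketch of their standard rendering formulations as given in the NeRF and 3DGS literature (notation is ours, not taken from this paper). NeRF composites color along a camera ray from N sampled densities \sigma_i, colors \mathbf{c}_i, and sample spacings \delta_i:

        \hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i, \qquad T_i = \exp\!\left(-\sum_{j=1}^{i-1} \sigma_j \delta_j\right)

    3DGS instead alpha-composites depth-sorted 3D Gaussians, with opacity \alpha_i and view-dependent color \mathbf{c}_i per projected Gaussian, at each pixel:

        C = \sum_{i \in \mathcal{N}} \mathbf{c}_i \, \alpha_i \prod_{j=1}^{i-1} \left(1 - \alpha_j\right)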

     

Publication history
  • Published online: 2025-07-18
  • Issue date: 2025-07-16
