Issue 5
Jun.  2025

doi: 10.19483/j.cnki.11-4653/n.2025.05.001
Publish Date: 2025-06-25
