About Me
I am currently a Ph.D. student in Computer Science and Technology at the School of Computer and Information Technology, Shanxi University, under the supervision of Prof. Ru Li and Prof. Víctor Gutiérrez Basulto.
My primary research interests include Natural Language Processing and Frame Semantics. My current work focuses on the knowledge memorization capabilities of Large Language Models, with particular emphasis on their ability to understand and model semantic scenarios.
📖 Education
- 2024.09 - Present, School of Computer and Information Technology, Shanxi University. Ph.D. Student.
- 2022.09 - 2024.06, School of Computer and Information Technology, Shanxi University. Master's Student.
- 2018.09 - 2022.06, School of Information Science and Technology, Taiyuan University of Science and Technology. Undergraduate Student.
📝 Selected Publications
Memorization ≠ Understanding: Do Large Language Models Have the Ability of Scenario Cognition?
Published in Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
Driven by vast and diverse textual data, large language models (LLMs) have demonstrated impressive performance across numerous natural language processing (NLP) tasks. Yet, a critical question persists: does their generalization arise from mere memorization of training data or from deep semantic understanding? To investigate this, we propose a bi-perspective evaluation framework to assess LLMs' scenario cognition — the ability to link semantic scenario elements with their arguments in context. Specifically, we introduce a novel scenario-based dataset comprising diverse textual descriptions of fictional facts, annotated with scenario elements. LLMs are evaluated through their capacity to answer scenario-related questions (model output perspective) and via probing their internal representations for encoded scenario element–argument associations (internal representation perspective). Our experiments reveal that current LLMs predominantly rely on superficial memorization, failing to achieve robust semantic scenario cognition, even in simple cases. These findings expose critical limitations in LLMs' semantic understanding and offer cognitive insights for advancing their capabilities.
Recommended citation: Ma B, Li R, Wang Y, et al. Memorization ≠ Understanding: Do Large Language Models Have the Ability of Scenario Cognition?[C]//Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025: 20758-20774.
Download Paper
💬 Talks
Oral - EMNLP 2025
Oral presentation of the accepted paper: Memorization ≠ Understanding: Do Large Language Models Have the Ability of Scenario Cognition?
