About me

I am currently a Ph.D. student at the Center for Cognitive Machines and Computational Health (CMaCH), School of Computer Science, Shanghai Jiao Tong University (SJTU), advised by Prof. Shikui Tu and Prof. Lei Xu. I am also fortunate to be co-supervised by Prof. Biwei Huang at the Halıcıoğlu Data Science Institute (HDSI), University of California San Diego (UCSD).

My research focuses on LLM/RL post-training and retrieval-augmented generation (RAG), with a particular interest in improving model alignment and reliability through causal reward modeling. More broadly, I study how causal structure can enhance the robustness, generalization, and decision reliability of reinforcement learning and foundation model systems. Before joining SJTU, I received my B.Sc. in Artificial Intelligence from Chien-Shiung Wu College, Southeast University (SEU), where I worked with Prof. Shaofu Yang from 2018 to 2022.


News

  • 2026/02   One paper submitted to KDD 2026.
  • 2026/01   One paper submitted to ICML 2026.
  • 2025/10   One paper submitted to TACL.
  • 2025/07   Presented on Real-World LLM Applications at the Air Force Research Institute (AFRI).
  • 2025/06   Joined Alibaba Group as a Research Intern working on LLM post-training and multimodal RAG systems.
  • 2024/10   One paper submitted to ICLR 2025 (accepted). 🚀🚀
  • 2024/08   Collaborated with Huawei Cloud on a GraphRAG project.
  • 2024/01   One paper submitted to IJCAI 2024 (accepted as an oral presentation). 🚀

Research Interests

My research lies at the intersection of LLM Post-Training, Reinforcement Learning, and Causal AI. I aim to build reliable, generalizable, and hallucination-resistant AI systems by integrating causal structures into learning paradigms.

My current interests include:

  • LLM/RL Post-Training & Alignment: Developing robust reward modeling techniques via causal decomposition to mitigate reward hacking and improve alignment stability in RLHF.
  • Causality-Guided Representation Learning: Leveraging causal discovery to enhance exploration efficiency, domain adaptation, and policy generalization in complex environments.
  • Retrieval-Augmented Generation (RAG): Building multimodal RAG and GraphRAG systems that use causal reasoning to reduce hallucinations in information extraction and cross-modal retrieval.