About Me
I am a Master's student in Computer Science and Technology at Tsinghua University, advised by Prof. Hai-Tao Zheng. My research lies in Natural Language Processing, with a particular focus on Large Language Models and Context Compression. I am currently a Research Intern at the Future Living Lab of Alibaba Group (Foundation Model Team), where I work on efficient large language models. I am seeking PhD positions for Fall 2027.
Research Interests: Natural Language Processing · Large Language Models · Context Compression
Education
M.Eng., Computer Science and Technology, Tsinghua University
🏆 Department-level Second-class Scholarship
B.Eng., Computer Science and Technology
🏆 National Scholarship · Outstanding Graduate · Outstanding Thesis · Excellent Student × 2 · First-class Scholarship × 2
Experience
Research Intern — Alibaba Group
- Working on context compression for large language models on the Foundation Model Team
- Published a first-author paper at ICLR 2026 (247K+ views)
- Published a first-author paper at ACL 2026 (217K+ views)
- One paper under review at ICML 2026 (Scores: 4, 4, 3)
Skills
Programming & Tools: Python, PyTorch, LaTeX, Linux, SSH
Languages: Chinese (Native), English (CET-6)
Main Publications (* denotes equal contribution)
GMSA: Enhancing Context Compression via Group Merging and Layer Semantic Alignment
ACL 2026 (CCF-A), SAC-Recommended Oral, Best Paper Nomination
Featured in the 400+ star GitHub repository Awesome-Collection-Token-Reduction.
COMI: Coarse-to-Fine Context Compression via Marginal Information Gain
ICLR 2026 (CCF-A; Scores: 8, 6, 6, 6; 247K+ views)
Featured in the 400+ star GitHub repository Awesome-Collection-Token-Reduction.
Read As Human: Compressing Context via Parallelizable Close Reading and Skimming
ACL 2026 (CCF-A, 217K+ views)
Featured in the 400+ star GitHub repository Awesome-Collection-Token-Reduction.
Perception Compressor: A Training-Free Prompt Compression Framework in Long Context Scenarios
NAACL 2025 Findings (CCF-B)
Featured in the 400+ star GitHub repository Awesome-Collection-Token-Reduction.
Data Distribution Matters: A Data-Centric Perspective on Context Compression for Large Language Models
Under review at ICML 2026 (CCF-A; Scores: 4, 4, 3)
When Hard Negatives Hurt: Bridging the Generative-Discriminative Gap in Hard Negative Synthesis
Under review at KDD 2026 (CCF-A; average scores: T: 3, N: 3)