I am a research scientist at Google DeepMind (NYC). I am a core contributor to the Gemini Thinking / Reasoning team, including the Gemini 2.5 Pro and 2.5 Flash models. I was also a key contributor to pushing Gemini's capabilities to win a gold medal at IMO 2025.
Before that, I was a Ph.D. student and Wallace Memorial Fellow at Princeton University, co-advised by Prof. Kai Li and Prof. Sanjeev Arora. I also worked closely with Prof. Danqi Chen.
✰ Awards
- 🏆 Best Paper Award at NeurIPS'24 SoLaR
- 🏆 Best Paper Award at ICLR'24 SeT LLM
- 🏅 Rising Star in EECS in 2023
- 🏅 Wallace Memorial Fellowship (2023-2024)
✎ Selected Publications
Please refer to my full publication list or my Google Scholar profile.
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, and others
📍 Technical report, 2025
Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards
Yangsibo Huang, Milad Nasr, Anastasios Angelopoulos, Nicholas Carlini, Wei-Lin Chiang, Christopher A Choquette-Choo, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Ken Ziyu Liu, and others
📍 ICML 2025 (Oral)
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
Weijia Shi*, Jaechan Lee*, Yangsibo Huang*, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah A Smith, and Chiyuan Zhang
📍 ICLR 2025
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
Tinghao Xie*, Xiangyu Qi*, Yi Zeng*, Yangsibo Huang*, Udari Madhushani Sehwag, Kaixuan Huang, Luxi He, Boyi Wei, Dacheng Li, Ying Sheng, Ruoxi Jia, Bo Li, Kai Li, Danqi Chen, Peter Henderson, and Prateek Mittal
📍 ICLR 2025
On Memorization of Large Language Models in Logical Reasoning
Chulin Xie, Yangsibo Huang, Chiyuan Zhang, Da Yu, Xinyun Chen, Bill Yuchen Lin, Bo Li, Badih Ghazi, and Ravi Kumar
📍 Preprint, 2024
An Adversarial Perspective on Machine Unlearning for AI Safety
Jakub Łucki, Boyi Wei, Yangsibo Huang, Peter Henderson, Florian Tramèr, and Javier Rando
📍 NeurIPS SoLaR 2024 (Best Paper)
A Safe Harbor for AI Evaluation and Red Teaming
Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, and Peter Henderson
📍 ICML 2024 (Oral)
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Boyi Wei*, Kaixuan Huang*, Yangsibo Huang*, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, and Peter Henderson
📍 ICML 2024 & ICLR 2024 Secure and Trustworthy LLMs workshop (Best Paper)
Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, and Danqi Chen
📍 ICLR 2024 (Spotlight)
Detecting Pretraining Data from Large Language Models
Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, and Luke Zettlemoyer
📍 ICLR 2024 (Oral Presentation at RegML@NeurIPS'23)
Evaluating Gradient Inversion Attacks and Defenses in Federated Learning
Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, and Sanjeev Arora
📍 NeurIPS 2021 (Oral)
㋡ Experiences
♥ Service
ꐕ MISC
- In my spare time, I mostly hang out with my four cats 😺😻😼😽.
- I enjoy reading books about psychology.