I’m a research scientist at Google NYC. Before that, I was a Ph.D. student and Wallace Memorial Fellow at Princeton University, co-advised by Prof. Kai Li and Prof. Sanjeev Arora.
I work at the intersection of machine learning, systems, and policy. My research explores how and why machine learning systems may go wrong, through the lenses of privacy and copyright violations as well as security and safety concerns. More recently, I have also become interested in model memorization and its implications for model capabilities.
✰ Awards
🎙 Recent Talks
- Open Technical Questions in GenAI Copyright
- Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy
  - 09/24/2024, Unlearning Society @ Google DeepMind
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning
- Identifying, Understanding, and Mitigating Failure Modes of Safety Alignment in Large Language Models
- Detecting Pretraining Data from Large Language Models
🗞️ News
- 📊 [09/2024] We released ConceptMix, a new benchmark that evaluates how well text-to-image models can generate images that accurately combine multiple visual concepts. Interestingly, we found that image generation models struggle to combine more than 3 visual concepts (e.g., “red,” “fluffy,” “squared,” “smartphone”), and we attribute this to their training data.
- 📃 [08/2024] Collecting, using, and sharing human feedback on models raises new privacy and copyright concerns. We discuss these issues, along with others, in our recent work: The Future of Open Human Feedback.
- 📊 [07/2024] We released a new LLM safety benchmark, SORRY-Bench, designed to systematically evaluate how well LLMs refuse unsafe requests across 45 fine-grained harmful categories.
- 🐱 [06/2024] How easy is it for current image/video generation systems to output copyrighted characters like Mario and Batman (which poses legal risks)? How can we prevent them from doing so? Check out our CopyCat evaluation suite!
- 📃 [03/2024] We released an open letter advocating for A Safe Harbor for AI Evaluation and Red Teaming. It has been signed by 300+ researchers and covered by The Washington Post, VentureBeat, AIPwn, and Computerworld.
- 📙 [01/2024] Our white paper on advancing the deployment of differential privacy in real-world applications was accepted by the Harvard Data Science Review.
✎ Selected Publications
Please refer to the full publications page or my Google Scholar profile for the complete list. “(α)” denotes alphabetical author order (“⁺” marks the lead author), and “*” denotes equal contribution.
On Memorization of Large Language Models in Logical Reasoning
Chulin Xie, Yangsibo Huang, Chiyuan Zhang, Da Yu, Xinyun Chen, Bill Yuchen Lin, Bo Li, Badih Ghazi, and Ravi Kumar
📍 Preprint, 2024
An Adversarial Perspective on Machine Unlearning for AI Safety
Jakub Łucki, Boyi Wei, Yangsibo Huang, Peter Henderson, Florian Tramèr, and Javier Rando
📍 Preprint, 2024 (Oral Presentation at SoLaR@NeurIPS’24)
(α) Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning
Lynn Chua, Badih Ghazi, Yangsibo Huang⁺, Pritish Kamath, Daogao Liu, Pasin Manurangsi, Amer Sinha, and Chiyuan Zhang
📍 COLM, 2024 (Talk at PPML’24)
ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty
Xindi Wu*, Dingli Yu*, Yangsibo Huang*, Olga Russakovsky, and Sanjeev Arora
📍 NeurIPS, 2024
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
Weijia Shi*, Jaechan Lee*, Yangsibo Huang*, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah A. Smith, and Chiyuan Zhang
📍 Preprint, 2024
Evaluating Copyright Takedown Methods for Language Models
Boyi Wei*, Weijia Shi*, Yangsibo Huang*, Noah A. Smith, Chiyuan Zhang, Luke Zettlemoyer, Kai Li, and Peter Henderson
📍 NeurIPS, 2024
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Luxi He*, Yangsibo Huang*, Weijia Shi*, Tinghao Xie, Haotian Liu, Yue Wang, Luke Zettlemoyer, Chiyuan Zhang, Danqi Chen, and Peter Henderson
📍 Preprint, 2024 (Oral Presentation at GenLaw@ICML’24)
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
Tinghao Xie*, Xiangyu Qi*, Yi Zeng*, Yangsibo Huang*, Udari Madhushani Sehwag, Kaixuan Huang, Luxi He, Boyi Wei, Dacheng Li, Ying Sheng, Ruoxi Jia, Bo Li, Kai Li, Danqi Chen, Peter Henderson, and Prateek Mittal
📍 Preprint, 2024
A Safe Harbor for AI Evaluation and Red Teaming
Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, and Peter Henderson
📍 ICML, 2024 (Oral)
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Boyi Wei*, Kaixuan Huang*, Yangsibo Huang*, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, and Peter Henderson
📍 ICML & ICLR Secure and Trustworthy LLMs Workshop, 2024 (Best Paper)
Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, and Danqi Chen
📍 ICLR, 2024 (Spotlight)
Detecting Pretraining Data from Large Language Models
Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, and Luke Zettlemoyer
📍 ICLR, 2024 (Oral Presentation at RegML@NeurIPS’23)
Privacy Implications of Retrieval-Based Language Models
Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, and Danqi Chen
📍 EMNLP, 2023
Evaluating Gradient Inversion Attacks and Defenses in Federated Learning
Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, and Sanjeev Arora
📍 NeurIPS, 2021 (Oral)
㋡ Experiences
♥ Service
- Area Chair for
- Program Committee member for
- Reviewer for ICML (2022, 2023, 2024), NeurIPS (2021, 2022, 2023), COLM (2024)
ꐕ MISC
- In my spare time, I mostly hang out with my four cats 😺😻😼😽.
- I enjoy reading books about psychology.