Yangsibo Huang

I am a research scientist at Google DeepMind (NYC) and a core contributor to the Gemini Thinking / Reasoning team, including the Gemini 2.5 Pro and 2.5 Flash models. I was also a key contributor to pushing Gemini's capabilities to a gold medal at IMO 2025.

Before that, I was a Ph.D. student and Wallace Memorial Fellow at Princeton University, co-advised by Prof. Kai Li and Prof. Sanjeev Arora. I also worked closely with Prof. Danqi Chen.


✰ Awards


✎ Selected Publications

Please refer to the full publications page or my Google Scholar profile for the complete list.


  1. Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
    Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, and others
  2. Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards
    Yangsibo Huang, Milad Nasr, Anastasios Angelopoulos, Nicholas Carlini, Wei-Lin Chiang, Christopher A Choquette-Choo, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Ken Ziyu Liu, and others
  3. MUSE: Machine Unlearning Six-Way Evaluation for Language Models
    Weijia Shi*, Jaechan Lee*, Yangsibo Huang*, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah A Smith, and Chiyuan Zhang
  4. SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
    Tinghao Xie*, Xiangyu Qi*, Yi Zeng*, Yangsibo Huang*, Udari Madhushani Sehwag, Kaixuan Huang, Luxi He, Boyi Wei, Dacheng Li, Ying Sheng, Ruoxi Jia, Bo Li, Kai Li, Danqi Chen, Peter Henderson, and Prateek Mittal
  5. On Memorization of Large Language Models in Logical Reasoning
    Chulin Xie, Yangsibo Huang, Chiyuan Zhang, Da Yu, Xinyun Chen, Bill Yuchen Lin, Bo Li, Badih Ghazi, and Ravi Kumar
  6. An Adversarial Perspective on Machine Unlearning for AI Safety
    Jakub Łucki, Boyi Wei, Yangsibo Huang, Peter Henderson, Florian Tramèr, and Javier Rando
  7. A Safe Harbor for AI Evaluation and Red Teaming
    Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, and Peter Henderson
  8. Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
    Boyi Wei*, Kaixuan Huang*, Yangsibo Huang*, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, and Peter Henderson
  9. Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
    Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, and Danqi Chen
  10. Detecting Pretraining Data from Large Language Models
    Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, and Luke Zettlemoyer
  11. Evaluating Gradient Inversion Attacks and Defenses in Federated Learning
    Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, and Sanjeev Arora


㋡ Experiences


♥ Service


ꐕ MISC
