Publications

Breaking the Reversal Curse in Autoregressive Language Models via Identity Bridge

Xutao Ma*, Yixiao Huang*, Hanlin Zhu, Somayeh Sojoudi

International Conference on Machine Learning (ICML), 2026 (Spotlight)

Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

Hanlin Zhu, Shibo Hao, Zhiting Hu, Jiantao Jiao, Stuart Russell, Yuandong Tian

International Conference on Learning Representations (ICLR), 2026

Auditing Black-Box LLM APIs with a Rank-Based Uniformity Test

Xiaoyuan Zhu, Yaowen Ye*, Tianyi Qiu*, Hanlin Zhu^†, Sijun Tan^†, Ajraf Mannan, Jonathan Michala, Raluca Ada Popa, Willie Neiswanger

International Conference on Learning Representations (ICLR), 2026

GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments

Hanlin Zhu*, Tianyu Guo*, Song Mei, Stuart Russell, Nikhil Ghosh, Alberto Bietti, Jiantao Jiao

preprint, 2025

How Do LLMs Perform Two-Hop Reasoning in Context?

Tianyu Guo*, Hanlin Zhu*, Ruiqi Zhang, Jiantao Jiao, Song Mei, Michael I. Jordan, Stuart Russell

preprint, 2025

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

Hanlin Zhu*, Shibo Hao*, Zhiting Hu, Jiantao Jiao, Stuart Russell, Yuandong Tian

Conference on Neural Information Processing Systems (NeurIPS), 2025

Methods and Opportunities at Small Scale (MOSS) Workshop, ICML 2025 (Oral)

Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers

Yixiao Huang*, Hanlin Zhu*, Tianyu Guo*, Jiantao Jiao, Somayeh Sojoudi, Michael I. Jordan, Stuart Russell, Song Mei

Conference on Neural Information Processing Systems (NeurIPS), 2025

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

DiJia Su, Hanlin Zhu*, Yingchen Xu*, Jiantao Jiao, Yuandong Tian^†, Qinqing Zheng^†

International Conference on Machine Learning (ICML), 2025

Avoiding Catastrophe in Online Learning by Asking for Help

Benjamin Plaut, Hanlin Zhu, Stuart Russell

International Conference on Machine Learning (ICML), 2025

Towards a Theoretical Understanding of the ‘Reversal Curse’ via Training Dynamics

Hanlin Zhu*, Baihe Huang*, Shaolun Zhang, Michael Jordan, Jiantao Jiao, Yuandong Tian, Stuart Russell

Conference on Neural Information Processing Systems (NeurIPS), 2024

Learning Personalized Alignment for Evaluating Open-ended Text Generation

Danqing Wang, Kevin Yang, Hanlin Zhu, Xiaomeng Yang, Andrew Cohen, Lei Li, Yuandong Tian

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Starling-7B: Improving Helpfulness and Harmlessness with RLAIF

Banghua Zhu*, Evan Frick*, Tianhao Wu*, Hanlin Zhu, Karthik Ganesan, Wei-Lin Chiang, Jian Zhang, Jiantao Jiao

Conference on Language Modeling (COLM), 2024 (Oral)

Towards Optimal Statistical Watermarking

Baihe Huang, Hanlin Zhu, Banghua Zhu, Kannan Ramchandran, Michael Jordan, Jason Lee, Jiantao Jiao

preprint, 2024

On Representation Complexity of Model-based and Model-free Reinforcement Learning

Hanlin Zhu*, Baihe Huang*, Stuart Russell

International Conference on Learning Representations (ICLR), 2024

Philosophical Transactions of the Royal Society A, special issue: World Models, A(G)I, and the Hard Problem(s) of Life–Mind Continuity (in press)

Efficient Prompt Caching via Embedding Similarity

Hanlin Zhu, Banghua Zhu, Jiantao Jiao

Machine Learning for Systems Workshop, NeurIPS 2023

End-to-end Story Plot Generator

Hanlin Zhu*, Andrew Cohen*, Danqing Wang, Kevin Yang, Xiaomeng Yang, Jiantao Jiao, Yuandong Tian

preprint, 2023

Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability

Hanlin Zhu, Amy Zhang

Conference on Neural Information Processing Systems (NeurIPS), 2023

Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning

Hanlin Zhu, Paria Rashidinejad, Jiantao Jiao

Conference on Neural Information Processing Systems (NeurIPS), 2023

Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian

Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart Russell, Jiantao Jiao

International Conference on Learning Representations (ICLR), 2023 (Spotlight)

Provably Efficient Reinforcement Learning via Surprise Bound

Hanlin Zhu, Ruosong Wang, Jason Lee

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

Average-Case Communication Complexity of Statistical Problems

Cyrus Rashtchian, David P. Woodruff, Peng Ye, Hanlin Zhu ($\alpha$-$\beta$ order)

Conference on Learning Theory (COLT), 2021

Vector-Matrix-Vector Queries for Solving Linear Algebra, Statistics, and Graph Problems

Cyrus Rashtchian, David P. Woodruff, Hanlin Zhu ($\alpha$-$\beta$ order)

International Conference on Randomization and Computation (RANDOM), 2020

Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog

Ryuichi Takanobu, Hanlin Zhu, Minlie Huang

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019 (Oral)