Zhiyuan Li (李志远)

Office: TTIC 508

I am a tenure-track assistant professor at the Toyota Technological Institute at Chicago (TTIC) and an affiliated faculty member in Computer Science at the University of Chicago. I was also a visiting faculty member at Google Research until April 2026. Before joining TTIC, I was a postdoctoral fellow in the Computer Science Department at Stanford University, working with Tengyu Ma. I received my PhD from the Computer Science Department at Princeton University in 2022, where I was advised by Sanjeev Arora. I completed my undergraduate studies in the Yao Class at Tsinghua University.

I am broadly interested in machine learning theory, including optimization in deep learning, the reasoning capabilities of Large Language Models (LLMs), and the modern paradigm of generalization in machine learning (overparameterization, out-of-domain generalization) and its connection to the implicit bias of optimization algorithms.

News

Jun 05, 2026 Gave a talk on recursive models at the Simons Institute Multi-Program AI Reunion.
May 19, 2026 Serving as Area Chair for NeurIPS 2026.
May 13, 2026 Received a $600K NSF CAREER Award for work on architecture-aware optimization theory for deep learning.
May 01, 2026 Gave a Research at TTIC talk on optimizer geometry in modern deep learning.
Apr 30, 2026 Recursive Models for Long-Horizon Reasoning accepted at ICML 2026!

Selected and Recent Publications

  1. Recursive Models for Long-Horizon Reasoning
    Chenxiao Yang, Nathan Srebro, and Zhiyuan Li
    In Proceedings of the 43rd International Conference on Machine Learning, ICML 2026
  2. On Powerful Ways to Generate: Autoregression, Diffusion, and Beyond
    Chenxiao Yang, Cai Zhou, David Wipf, and Zhiyuan Li
    In The Fourteenth International Conference on Learning Representations, ICLR 2026
  3. A Tale of Two Geometries: Adaptive Optimizers and Non-Euclidean Descent
    Shuo Xie, Tianhao Wang, Beining Wu, and Zhiyuan Li
    In The Fourteenth International Conference on Learning Representations, ICLR 2026
  4. Structured Preconditioners in Adaptive Optimization: A Unified Analysis
    Shuo Xie, Tianhao Wang, Sashank Reddi, Sanjiv Kumar, and Zhiyuan Li
    In Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
  5. PENCIL: Long Thoughts with Short Memory
    Chenxiao Yang, Nathan Srebro, David McAllester, and Zhiyuan Li
    In Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
  6. Non-Asymptotic Length Generalization
    Thomas Chen, Tengyu Ma, and Zhiyuan Li
    In Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
  7. Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity
    Shuo Xie, Mohamad Amin Mohamadi, and Zhiyuan Li
    In The Thirteenth International Conference on Learning Representations, ICLR 2025
  8. Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
    Zhiyuan Li, Hong Liu, Denny Zhou, and Tengyu Ma
    In The Twelfth International Conference on Learning Representations, ICLR 2024
  9. How Does Sharpness-Aware Minimization Minimize Sharpness?
    Kaiyue Wen, Tengyu Ma, and Zhiyuan Li
    In The Eleventh International Conference on Learning Representations, ICLR 2023
  10. What Happens after SGD Reaches Zero Loss?--A Mathematical Framework
    Zhiyuan Li, Tianhao Wang, and Sanjeev Arora
    In The Tenth International Conference on Learning Representations, ICLR 2022
  11. Why Are Convolutional Nets More Sample-Efficient Than Fully-Connected Nets?
    Zhiyuan Li, Yi Zhang, and Sanjeev Arora
    In The Ninth International Conference on Learning Representations, ICLR 2021