About me

Hi there! I’m Zhengyang Geng, a Ph.D. student advised by Zico Kolter. I am fortunate to work closely with Kaiming He. Previously, I was a research assistant advised by Zhouchen Lin. I also had the chance to work with Shaojie Bai.

I am an enthusiast of dynamics: recognizing, understanding, and developing the dynamics that give rise to non-trivial systems.

Research

I pursue a principled—and, yes, playful (乐子, roughly “for fun”)—understanding of intelligence. My interests are eclectic, but they converge on dynamics as the unifying language. I believe that structured decomposition (perception) and reconstruction (generation) are key to the emergence of general intelligence, and that dynamics provide an elegant mechanism for both.

  • Regarding the “forward” pass, I study dynamical systems as a construction method and learning principle for neural networks.
  • Regarding the “backward” pass, I investigate training dynamics: their geometry and landscape, and the couplings among data/environment, model, and optimization.

Beyond artificial systems, I’m interested in modeling and understanding nature through dynamics. The only constant is change; invariance under change is truth.

[Twitter]

Projects

  • Mean Flows for One-step Generative Modeling
    Zhengyang Geng, Mingyang Deng, Xingjian Bai, J. Zico Kolter, and Kaiming He
    In Neural Information Processing Systems (NeurIPS) 2025, Oral.
    TL;DR: Learning to solve generative dynamics at training time.
    Keywords: Identity, Fixed Points, Differentiation (verification)-Integration (generation) Gap
    [Paper] [JAX Code] [PyTorch Code]
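
    A minimal sketch of the core idea as I read it, not the released JAX/PyTorch code: the network predicts an average velocity u(z, r, t) over an interval [r, t], trained by regressing onto the MeanFlow identity u = v - (t - r) * du/dt, where du/dt is a total derivative along the path. The toy net and linear path below are illustrative.

    ```python
    import torch
    import torch.nn as nn
    from torch.func import jvp

    class AvgVelocityNet(nn.Module):
        """Toy u(z, r, t): the average velocity over the interval [r, t]."""
        def __init__(self, dim=2, hidden=64):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(dim + 2, hidden), nn.SiLU(),
                nn.Linear(hidden, dim))

        def forward(self, z, r, t):
            return self.mlp(torch.cat([z, r[:, None], t[:, None]], dim=-1))

    def meanflow_loss(u_net, x):
        e = torch.randn_like(x)                     # noise endpoint of the path
        t = torch.rand(x.shape[0])
        r = torch.rand(x.shape[0]) * t              # ensure r <= t
        z = (1 - t[:, None]) * x + t[:, None] * e   # linear path: data to noise
        v = e - x                                   # instantaneous velocity
        # Total derivative d/dt u(z_t, r, t) along the path, via a JVP with
        # tangents (dz/dt, dr/dt, dt/dt) = (v, 0, 1).
        u, dudt = jvp(u_net, (z, r, t), (v, torch.zeros_like(r), torch.ones_like(t)))
        target = v - (t - r)[:, None] * dudt        # MeanFlow identity as target
        return ((u - target.detach()) ** 2).mean()

    net = AvgVelocityNet()
    meanflow_loss(net, torch.randn(8, 2)).backward()
    # One-step generation after training: x ~ z1 - u(z1, r=0, t=1).
    ```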

  • Consistency Models Made Easy
    Zhengyang Geng, William Luo, Ashwini Pokle, and J. Zico Kolter
    In International Conference on Learning Representations (ICLR) 2025
    TL;DR: Easy Consistency Tuning through a self-teacher.
    [Paper] [Blog] [Code] [BibTex]
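
    A minimal sketch of the self-teacher idea, under a simplified linear noising path (the paper itself works in an EDM-style parameterization): the model evaluated at an earlier time r = t - dt, with gradients stopped, serves as its own teacher, and dt is annealed toward zero over training. `f` is a stand-in consistency function, not the released code.

    ```python
    import torch

    def ect_loss(f, x0, dt):
        """Consistency tuning with the model itself as the (stop-gradient)
        teacher; dt shrinks from coarse to fine during training."""
        e = torch.randn_like(x0)
        t = torch.rand(x0.shape[0]) * (1.0 - dt) + dt   # t in [dt, 1]
        r = t - dt                                      # earlier time, same path
        xt = (1 - t)[:, None] * x0 + t[:, None] * e     # shared noising trajectory
        xr = (1 - r)[:, None] * x0 + r[:, None] * e
        with torch.no_grad():                           # self-teacher target
            target = f(xr, r)
        return ((f(xt, t) - target) ** 2).mean()

    # e.g. with a stand-in consistency function:
    loss = ect_loss(lambda x, t: x, torch.randn(8, 2), dt=0.1)
    ```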

  • TorchDEQ: A Library for Deep Equilibrium Models
    Zhengyang Geng and J. Zico Kolter
    TL;DR: Modern fixed-point systems in PyTorch.
    [Report] [Code] [Colab Tutorial] [Doc] [DEQ Zoo]
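
    For readers new to DEQs, a minimal fixed-point layer in plain PyTorch rather than the library's actual API: solve z* = f(z*, x) without building a graph, then re-attach gradients with one extra differentiable step.

    ```python
    import torch
    import torch.nn as nn

    class TinyDEQ(nn.Module):
        """Solve z* = f(z*, x) by fixed-point iteration, then take one extra
        differentiable step so gradients flow (the cheap one-step gradient)."""
        def __init__(self, dim=32):
            super().__init__()
            self.lin_z = nn.Linear(dim, dim)
            self.lin_x = nn.Linear(dim, dim)

        def f(self, z, x):
            return torch.tanh(self.lin_z(z) + self.lin_x(x))

        def forward(self, x, iters=30):
            z = torch.zeros_like(x)
            with torch.no_grad():          # solver loop: no graph, constant memory
                for _ in range(iters):
                    z = self.f(z, x)
            return self.f(z, x)            # re-attach the graph at the fixed point

    y = TinyDEQ()(torch.randn(4, 32))
    ```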

  • One-Step Diffusion Distillation via Deep Equilibrium Models
    Zhengyang Geng*, Ashwini Pokle*, and J. Zico Kolter (*equal contribution)
    In Neural Information Processing Systems (NeurIPS) 2023
    TL;DR: The Generative Equilibrium Transformer (GET) as a strong one-step diffusion learner.
    [PDF] [Code] [BibTex]

  • Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
    Tianle Cai*, Yuhong Li*, Zhengyang Geng, Hongwu Peng, and Tri Dao (*equal contribution)
    In International Conference on Machine Learning (ICML) 2024
    TL;DR: Simple LLM acceleration with multiple decoding heads and self-verification.
    [Report] [Blog] [Code]
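
    A minimal sketch of the two ingredients in the TL;DR: extra heads read the same last hidden state and each guesses a token further ahead, and the base model then verifies the speculated prefix in one forward pass. Shapes and the greedy acceptance rule here are illustrative, not the released framework.

    ```python
    import torch
    import torch.nn as nn

    class MedusaHeads(nn.Module):
        """Head i predicts the token i + 1 positions ahead of the current one,
        all from the same last hidden state."""
        def __init__(self, hidden=512, vocab=32000, n_heads=4):
            super().__init__()
            self.heads = nn.ModuleList(nn.Linear(hidden, vocab) for _ in range(n_heads))

        def forward(self, h_last):                   # h_last: [B, hidden]
            return torch.stack([h(h_last).argmax(-1) for h in self.heads], dim=1)

    def accepted_length(proposed, verified):
        """Greedy acceptance: keep the longest prefix where the base model's
        own next-token predictions agree with the speculated tokens."""
        return (proposed == verified).long().cumprod(dim=1).sum(dim=1)

    draft = MedusaHeads()(torch.randn(2, 512))       # [2, 4] speculated tokens
    ```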

  • Equilibrium Image Denoising With Implicit Differentiation
    Qi Chen, Yifei Wang, Zhengyang Geng, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
    In IEEE Transactions on Image Processing
    TL;DR: An equilibrium (fixed-point) model for image denoising, trained with implicit differentiation.
    [Paper] [BibTex]

  • Deep Equilibrium Approaches to Diffusion Models
    Ashwini Pokle, Zhengyang Geng, and J. Zico Kolter
    In Neural Information Processing Systems (NeurIPS) 2022
    TL;DR: Parallel diffusion decoding via fixed point equations.
    [Paper] [Code] [BibTex]
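
    A minimal sketch of the fixed-point view behind the TL;DR: stack the whole sampling chain x_T, ..., x_0 into one joint state and apply every reverse step simultaneously, Jacobi-style, until the chain stops changing. `step` is a stand-in for a real reverse-diffusion update, not the paper's solver.

    ```python
    import torch

    def parallel_decode(step, x_T, T=50, iters=20):
        """Jacobi-style joint solve of the chain: each x_{t-1} is recomputed
        from the current estimate of x_t, for all t at once."""
        xs = x_T.unsqueeze(0).repeat(T + 1, *([1] * x_T.dim()))  # xs[i] ~ x_{T-i}
        ts = torch.arange(T, 0, -1)                              # times T, ..., 1
        for _ in range(iters):
            xs = torch.cat([xs[:1], step(xs[:-1], ts)], dim=0)   # all steps at once
        return xs[-1]                                            # x_0 after the solve

    # e.g. with a stand-in linear "denoiser":
    x0 = parallel_decode(lambda x, t: 0.98 * x, torch.randn(16))
    ```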

  • Eliminating Gradient Conflict in Reference-based Line-art Colorization
    Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, and Yibo Yang
    In Proceedings of the European Conference on Computer Vision (ECCV) 2022
    TL;DR: Investigating and alleviating gradient conflicts in attention training.
    [Paper] [Code] [BibTex]

  • Deep Equilibrium Optical Flow Estimation
    Shaojie Bai*, Zhengyang Geng*, Yash Savani, and J. Zico Kolter (*equal contribution)
    In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022
    TL;DR: Harder problems. More compute. Better convergence & performance.
    [PDF] [Code] [BibTex]

  • On Training Implicit Models
    Zhengyang Geng*, Xin-Yu Zhang*, Shaojie Bai, Yisen Wang, and Zhouchen Lin (*equal contribution)
    In Neural Information Processing Systems (NeurIPS) 2021
    TL;DR: Cheap, fast, and stable inexact gradient works as well as implicit differentiation.
    [Paper] [Slides] [Poster] [Code] [BibTex]
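
    A minimal sketch of an inexact gradient in this spirit: after a no-grad fixed-point solve, unroll a few damped steps with gradient tracking instead of solving the implicit-function linear system exactly. `f`, `k`, and the damping `lam` are illustrative choices, not the paper's exact recipe.

    ```python
    import torch

    def phantom_output(f, z_star, x, k=3, lam=0.5):
        """Damped, unrolled "phantom gradient": backprop touches only these
        k cheap steps, not the full solver or an exact implicit solve."""
        z = z_star.detach()                    # cut the graph at the solver output
        for _ in range(k):
            z = (1 - lam) * z + lam * f(z, x)  # k differentiable damped steps
        return z
    ```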

  • Residual Relaxation for Multi-view Representation Learning
    Yifei Wang, Zhengyang Geng, Feng Jiang, Chuming Li, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
    In Neural Information Processing Systems (NeurIPS) 2021
    TL;DR: Equivariant contrastive learning replaces invariant contrastive learning.
    [Paper] [Slides] [BibTex]

  • Is Attention Better Than Matrix Decomposition?
    Zhengyang Geng*, Meng-Hao Guo*, Hongxu Chen, Xia Li, Ke Wei, and Zhouchen Lin (*equal contribution)
    In International Conference on Learning Representations (ICLR) 2021, top 3%.
    TL;DR: Optimization (matrix decomposition) as attention.
    [Paper] [Code] [Blog Series 1 (zh), 2 (zh), 3 (zh)] [Poster] [BibTex]
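
    A minimal sketch of matrix decomposition standing in for attention: factor the flattened feature map X ~ D @ S with a few NMF multiplicative updates and return the low-rank reconstruction as global context. The rank and iteration counts are illustrative, not the paper's configuration.

    ```python
    import torch

    def nmf_context(x, rank=8, iters=6, eps=1e-6):
        """A few NMF multiplicative updates playing the role of attention:
        the low-rank reconstruction D @ S serves as global context."""
        b, c, n = x.shape                    # flattened [B, C, H*W] features
        x = x.relu() + eps                   # NMF needs non-negative input
        d = torch.rand(b, c, rank)
        s = torch.rand(b, rank, n)
        for _ in range(iters):               # standard multiplicative update rules
            s = s * (d.transpose(1, 2) @ x) / (d.transpose(1, 2) @ d @ s + eps)
            d = d * (x @ s.transpose(1, 2)) / (d @ s @ s.transpose(1, 2) + eps)
        return d @ s                         # low-rank "attention" output

    ctx = nmf_context(torch.randn(2, 64, 196))
    ```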