About me

Hi there, I’m Zhengyang Geng, a final-year Ph.D. student advised by Zico Kolter and working closely with Kaiming He. Previously, I worked with Zhouchen Lin and Shaojie Bai.

I am an enthusiast of dynamics: recognizing, understanding, and developing the dynamics that give rise to non-trivial systems.

Research

I pursue a principled and, yes, playful (乐子, "fun") understanding of intelligence. My interests are eclectic, but they converge on dynamics as the unifying language. I believe that structured decomposition (representation) and reconstruction (generation) are key to the emergence of general intelligence, and that dynamics provide an elegant mechanism for both. Beyond artificial systems, I'm interested in modeling and understanding nature through dynamics. The only constant is change; invariance under change is truth.

Forward/Backward Perspective

Forward Pass

I study dynamical systems as both a construction method and a learning principle for neural networks.

Backward Pass

I investigate training dynamics: geometry, loss landscapes, and the couplings among data/environment, model, and optimization.

Selected Works

Generative Modeling: towards 1-step
  • One-step Latent-free Image Generation with Pixel Mean Flows Tech Report 2026
    Yiyang Lu*, Susie Lu*, Qiao Sun*, Hanhong Zhao*, Zhicheng Jiang, Xianbang Wang, Tianhong Li, Zhengyang Geng, and Kaiming He
    TL;DR: One-step image generation directly in pixel space.
    [Paper] [Code]

  • Improved Mean Flows: On the Challenges of Fastforward Generative Models CVPR 2026
    Zhengyang Geng*, Yiyang Lu*, Zongze Wu, Eli Shechtman, J. Zico Kolter, and Kaiming He
    TL;DR: Stability, Flexibility, and Architecture for Mean Flows.
    [Paper] [Code]

  • Mean Flows for One-step Generative Modeling NeurIPS 2025 Oral
    Zhengyang Geng, Mingyang Deng, Xingjian Bai, J. Zico Kolter, and Kaiming He
    TL;DR: Learning to solve generative dynamics at training time.
    [Paper] [JAX Code] [PyTorch Code]

  • Consistency Models Made Easy ICLR 2025
    Zhengyang Geng, William Luo, Ashwini Pokle, and J. Zico Kolter
    TL;DR: Easy Consistency Tuning through a self-teacher.
    [Paper] [Blog] [Code] [BibTeX]

  • 1-Step Diffusion Distillation via Deep Equilibrium Models NeurIPS 2023
    Zhengyang Geng*, Ashwini Pokle*, and J. Zico Kolter
    TL;DR: Equilibrium Transformer + offline distillation for one-step diffusion.
    [Paper] [Code] [BibTeX]

  • Deep Equilibrium Approaches to Diffusion Models NeurIPS 2022
    Ashwini Pokle, Zhengyang Geng, and J. Zico Kolter
    TL;DR: Parallel diffusion decoding via fixed-point equations.
    [Paper] [Code] [BibTeX]

  • Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models ICML 2025
    Weijian Luo, Colin Zhang, Debing Zhang, and Zhengyang Geng
    TL;DR: Score-based preference alignment for one-step text-to-image models.
    [Paper] [Code]

  • One-Step Diffusion Distillation through Score Implicit Matching NeurIPS 2024
    Weijian Luo, Zemin Huang, Zhengyang Geng, J. Zico Kolter, and Guo-jun Qi
    TL;DR: Data-free one-step diffusion distillation via score implicit matching.
    [Paper] [Code]

  • Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads ICML 2024
    Tianle Cai*, Yuhong Li*, Zhengyang Geng, Hongwu Peng, and Tri Dao
    TL;DR: Simple LLM acceleration with multiple decoding heads and self-verification.
    [Paper] [Blog] [Code]

Neural Attractors & Deep Equilibrium Models: towards ∞-step
  • TorchDEQ: A Library for Deep Equilibrium Models Tech Report 2023
    Zhengyang Geng and J. Zico Kolter
    TL;DR: Modern fixed-point systems in PyTorch.
    [Report] [Code] [Colab Tutorial] [Doc] [DEQ Zoo]

  • Deep Equilibrium Optical Flow Estimation CVPR 2022
    Shaojie Bai*, Zhengyang Geng*, Yash Savani, and J. Zico Kolter
    TL;DR: Harder problems, more compute, better convergence and performance.
    [Paper] [Code] [BibTeX]

  • On Training Implicit Models NeurIPS 2021
    Zhengyang Geng*, Xin-Yu Zhang*, Shaojie Bai, Yisen Wang, and Zhouchen Lin
    TL;DR: Inexact gradient training can be cheap, fast, and stable.
    [Paper] [Slides] [Poster] [Code] [BibTeX]

  • Is Attention Better Than Matrix Decomposition? ICLR 2021 Top 3%
    Zhengyang Geng*, Meng-Hao Guo*, Hongxu Chen, Xia Li, Ke Wei, and Zhouchen Lin
    TL;DR: Optimization (matrix decomposition) as attention.
    [Paper] [Code] [Blog Series 1 (zh), 2 (zh), 3 (zh)] [Poster] [BibTeX]

Full List (Google Scholar)