About me
Hi there! I’m Zhengyang Geng, a Ph.D. student advised by Zico Kolter. I am fortunate to work closely with Kaiming He. Previously, I was a research assistant advised by Zhouchen Lin, and I also had the chance to work with Shaojie Bai.
I am an enthusiast of dynamics: recognizing, understanding, and developing the dynamics that lead to non-trivial systems.
Research
I pursue a principled and, yes, playful (乐子, roughly “fun”) understanding of intelligence. My interests are eclectic, but they converge on dynamics as the unifying language. I believe that structured decomposition (perception) and reconstruction (generation) are key to the emergence of general intelligence, and that dynamics provide an elegant mechanism for both.
- Regarding the “forward” pass, I study dynamical systems as the construction method and learning principle in neural networks.
- Regarding the “backward” pass, I investigate training dynamics: their geometry and landscape, and the couplings among data/environment, model, and optimization.
Beyond artificial systems, I’m interested in modeling and understanding nature through dynamics. The only constant is change; invariance under change is truth.
There is no need to anneal down life's "learning rate" too early. Even today, I often feel I should "restart" to extricate myself from too many AI papers📝or bubbles🫧.
— Zhengyang Geng (@ZhengyangGeng) April 12, 2024
AGI should offer people better childhood and teenage lives, not grasping people to serve and achieve itself.… https://t.co/JCrRfL3boU
Projects
Mean Flows for One-step Generative Modeling
Zhengyang Geng, Mingyang Deng, Xingjian Bai, J. Zico Kolter, and Kaiming He
In Neural Information Processing Systems (NeurIPS) 2025, Oral.
TL;DR: Learning to solve generative dynamics at training time.
Key Words: Identity, Fixed Points, Differentiation (verification)-Integration (generation) Gap
[Paper] [JAX Code] [PyTorch Code]
Consistency Models Made Easy
Zhengyang Geng, William Luo, Ashwini Pokle, and J. Zico Kolter
In International Conference on Learning Representations (ICLR) 2025
TL;DR: Easy Consistency Tuning through Self Teacher
[Paper] [Blog] [Code] [BibTex]
TorchDEQ: A Library for Deep Equilibrium Models
Zhengyang Geng and J. Zico Kolter
TL;DR: Modern Fixed Point Systems using PyTorch.
[Report] [Code] [Colab Tutorial] [Doc] [DEQ Zoo]
1-Step Diffusion Distillation via Deep Equilibrium Models
Zhengyang Geng*, Ashwini Pokle*, and J. Zico Kolter
In Neural Information Processing Systems (NeurIPS) 2023
TL;DR: Generative Equilibrium Transformer (GET) as a strong 1-step diffusion learner.
[PDF] [Code] [BibTex]
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Tianle Cai*, Yuhong Li*, Zhengyang Geng, Hongwu Peng, Tri Dao
In International Conference on Machine Learning (ICML) 2024
TL;DR: Simple LLM acceleration with multiple decoding heads and self-verification.
[Report] [Blog] [Code]
Equilibrium Image Denoising With Implicit Differentiation
Qi Chen, Yifei Wang, Zhengyang Geng, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
In IEEE Transactions on Image Processing
TL;DR: Equilibrium image denoising with implicit differentiation.
[Paper] [BibTex]
Deep Equilibrium Approaches To Diffusion Models
Ashwini Pokle, Zhengyang Geng, and J. Zico Kolter
In Neural Information Processing Systems (NeurIPS) 2022
TL;DR: Parallel diffusion decoding via fixed point equations.
[Paper] [Code] [BibTex]
Eliminating Gradient Conflict in Reference-based Line-art Colorization
Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang
In Proceedings of the European Conference on Computer Vision (ECCV) 2022
TL;DR: Investigating and alleviating gradient conflicts in attention training.
[Paper] [Code] [BibTex]
Deep Equilibrium Optical Flow Estimation
Shaojie Bai*, Zhengyang Geng*, Yash Savani, J. Zico Kolter (*equal contribution)
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022
TL;DR: Harder problems. More compute. Better convergence & performance.
[PDF] [Code] [BibTex]
On Training Implicit Models
Zhengyang Geng*, Xin-Yu Zhang*, Shaojie Bai, Yisen Wang, Zhouchen Lin (*equal contribution)
In Neural Information Processing Systems (NeurIPS) 2021
TL;DR: A cheap, fast, and stable inexact gradient works as well as implicit differentiation.
[Paper] [Slides] [Poster] [Code] [BibTex]
Residual Relaxation for Multi-view Representation Learning
Yifei Wang, Zhengyang Geng, Feng Jiang, Chuming Li, Yisen Wang, Jiansheng Yang, Zhouchen Lin
In Neural Information Processing Systems (NeurIPS) 2021
TL;DR: Equivariant contrastive learning replaces invariant contrastive learning.
[Paper] [Slides] [BibTex]
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng*, Meng-Hao Guo*, Hongxu Chen, Xia Li, Ke Wei, Zhouchen Lin (*equal contribution)
In International Conference on Learning Representations (ICLR) 2021, top 3%.
TL;DR: Optimization (matrix decomposition) as attention.
[Paper] [Code] [Blog Series 1 (zh), 2 (zh), 3 (zh)] [Poster] [BibTex]