About me
Hi! I’m a CS Ph.D. student at CMU advised by Zico Kolter. Previously, I was a research assistant at Peking University, advised by Zhouchen Lin. I also had a wonderful summer at Meta Reality Labs with Shaojie Bai working on generative avatar encoding.
I am an enthusiast of dynamics: I enjoy recognizing, understanding, and developing the dynamics that self-organize complex systems.
Research
I have eclectic interests in machine learning and deep learning, especially when combined with dynamics. I believe that structured decomposition (perception) and reconstruction (generation) are key to understanding the emergence of general intelligence, and that dynamics can achieve both elegantly.
- Regarding the “forward” pass, I study dynamical systems (fixed-point equations, optimization, differential equations, etc.) as both the construction method and the learning principle of neural networks (a minimal sketch follows this list).
- Regarding the “backward” pass, I pursue a better understanding of neural network training dynamics. I am fascinated by their loss landscapes, and driven to investigate the interaction between data/environment, model, and learning dynamics.
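To make the “forward” point concrete, here is a minimal toy sketch of a deep-equilibrium-style layer whose forward pass solves the fixed-point equation z* = f(z*, x) by naive iteration. This is an illustration under my own assumptions, not TorchDEQ's API: the map f, the helper naive_fixed_point, and all constants are hypothetical. In practice one would use a faster solver (e.g., Anderson acceleration) and differentiate implicitly or inexactly, as in the papers below.

```python
import torch

def naive_fixed_point(f, x, z0, max_iter=50, tol=1e-4):
    """Iterate z_{k+1} = f(z_k, x) until the relative residual is small."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if (z_next - z).norm() / (z.norm() + 1e-8) < tol:
            return z_next
        z = z_next
    return z

# Hypothetical layer f(z, x) = tanh(W z + x); the small scale on W keeps it a contraction.
torch.manual_seed(0)
W = 0.3 * torch.randn(8, 8) / 8 ** 0.5
f = lambda z, x: torch.tanh(z @ W.T + x)

x = torch.randn(8)
z_star = naive_fixed_point(f, x, torch.zeros(8))
print(torch.allclose(z_star, f(z_star, x), atol=1e-3))  # True: z* solves z = f(z, x)
```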
I am also interested in modeling and understanding nature through dynamics. The only constant is change. What stays invariant under change is truth. :D
There is no need to anneal down life's "learning rate" too early. Even today, I often feel I should "restart" to extricate myself from too many AI papers📝or bubbles🫧.
— Zhengyang Geng (@ZhengyangGeng) April 12, 2024
AGI should offer people better childhood and teenage lives, not grasping people to serve and achieve itself.
Projects
Consistency Models Made Easy
Zhengyang Geng, William Luo, Ashwini Pokle, and J. Zico Kolter
TL;DR: Easy Consistency Tuning via a self-teacher :D
[Blog] [Code] [BibTex]
TorchDEQ: A Library for Deep Equilibrium Models
Zhengyang Geng and J. Zico Kolter
TL;DR: Modern fixed-point systems using PyTorch.
[Report] [Code] [Colab Tutorial] [Doc] [DEQ Zoo]
1-Step Diffusion Distillation via Deep Equilibrium Models
Zhengyang Geng*, Ashwini Pokle*, and J. Zico Kolter (*equal contribution)
In Neural Information Processing Systems (NeurIPS) 2023
TL;DR: Generative Equilibrium Transformer (GET) as a strong 1-step diffusion learner.
[Paper] [Code] [BibTex]
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Tianle Cai*, Yuhong Li*, Zhengyang Geng, Hongwu Peng, Tri Dao (*equal contribution)
TL;DR: Simple LLM acceleration with multiple decoding heads and self-verification.
[Report] [Blog] [Code]
Equilibrium Image Denoising With Implicit Differentiation
Qi Chen, Yifei Wang, Zhengyang Geng, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
In IEEE Transactions on Image Processing
TL;DR: Image denoising via equilibrium models trained with implicit differentiation.
[Paper] [BibTex]
Deep Equilibrium Approaches To Diffusion Models
Ashwini Pokle, Zhengyang Geng, and J. Zico Kolter
In Neural Information Processing Systems (NeurIPS) 2022
TL;DR: Parallel diffusion decoding via fixed point equations.
[Paper] [Code] [BibTex]
Eliminating Gradient Conflict in Reference-based Line-art Colorization
Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang
In Proceedings of the European Conference on Computer Vision (ECCV) 2022
TL;DR: Investigating and alleviating gradient conflicts in attention training.
[Paper] [Code] [BibTex]
Deep Equilibrium Optical Flow Estimation
Shaojie Bai*, Zhengyang Geng*, Yash Savani, J. Zico Kolter (*equal contribution)
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022
TL;DR: Equilibrium solving as flow estimation. SoTA zero-shot generalization.
[Paper] [Code] [BibTex]
On Training Implicit Models
Zhengyang Geng*, Xin-Yu Zhang*, Shaojie Bai, Yisen Wang, Zhouchen Lin (*equal contribution)
In Neural Information Processing Systems (NeurIPS) 2021
TL;DR: Cheap, fast, and stable inexact gradients work as well as implicit differentiation.
[Paper] [Slides] [Poster] [Code] [BibTex]
Residual Relaxation for Multi-view Representation Learning
Yifei Wang, Zhengyang Geng, Feng Jiang, Chuming Li, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
In Neural Information Processing Systems (NeurIPS) 2021
TL;DR: Equivariant contrastive learning replaces invariant contrastive learning.
[Paper] [Slides] [BibTex]
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng*, Meng-Hao Guo*, Hongxu Chen, Xia Li, Ke Wei, and Zhouchen Lin (*equal contribution)
In International Conference on Learning Representations (ICLR) 2021, top 3%.
TL;DR: Optimization (matrix decomposition) as attention.
[Paper] [Code] [Blog Series 1 (zh), 2 (zh), 3 (zh)] [Poster] [BibTex]