About me
Hi! I’m a CS Ph.D. student at CMU advised by Zico Kolter. Previously, I was a research assistant at Peking University, advised by Zhouchen Lin. I also had a wonderful summer at Meta Reality Labs with Shaojie Bai working on generative avatar encoding.
I am an enthusiast of dynamics: I enjoy recognizing, understanding, and developing the dynamics that self-organize complex systems.
Research
I have eclectic interests in machine learning and deep learning, especially when combined with dynamics. I believe that structured decomposition (perception) and reconstruction (generation) are key to understanding the emergence of general intelligence, and that dynamics can achieve both elegantly.
- Regarding the “forward” pass, I study dynamical systems (fixed-point equations, optimization, differential equations, etc.) as both the construction method and the learning principle of neural networks (a minimal sketch follows this list).
- Regarding the “backward” pass, I pursue a better understanding of neural network training dynamics. I am fascinated by their loss landscapes, and driven to investigate the interaction between data/environment, model, and learning dynamics.
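To make the “forward” point concrete, here is a minimal toy sketch of a deep-equilibrium-style layer whose forward pass solves the fixed-point equation z* = f(z*, x) by naive iteration. This is an illustration under my own assumptions, not TorchDEQ's API: the map f, the helper naive_fixed_point, and all constants are hypothetical. In practice one would use a faster solver (e.g., Anderson acceleration) and differentiate implicitly or inexactly, as in the papers below.

```python
import torch

def naive_fixed_point(f, x, z0, max_iter=50, tol=1e-4):
    """Iterate z_{k+1} = f(z_k, x) until the relative residual is small."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if (z_next - z).norm() / (z.norm() + 1e-8) < tol:
            return z_next
        z = z_next
    return z

# Hypothetical layer f(z, x) = tanh(W z + x); the small scale on W keeps it a contraction.
torch.manual_seed(0)
W = 0.3 * torch.randn(8, 8) / 8 ** 0.5
f = lambda z, x: torch.tanh(z @ W.T + x)

x = torch.randn(8)
z_star = naive_fixed_point(f, x, torch.zeros(8))
print(torch.allclose(z_star, f(z_star, x), atol=1e-3))  # True: z* solves z = f(z, x)
```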
I am also interested in modeling and understanding nature through dynamics. The only constant is change. What stays invariant under change is truth. :D
There is no need to anneal down life's "learning rate" too early. Even today, I often feel I should "restart" to extricate myself from too many AI papers📝or bubbles🫧.
— Zhengyang Geng (@ZhengyangGeng) April 12, 2024
AGI should offer people better childhood and teenage lives, not grasping people to serve and achieve itself.
Projects
Consistency Models Made Easy
Zhengyang Geng, William Luo, Ashwini Pokle, and J. Zico Kolter
TL;DR: Easy Consistency Tuning via a self-teacher :D
[Blog] [Code] [BibTex]
TorchDEQ: A Library for Deep Equilibrium Models
Zhengyang Geng and J. Zico Kolter
TL;DR: Modern fixed-point systems using PyTorch.
[Report] [Code] [Colab Tutorial] [Doc] [DEQ Zoo]
1-Step Diffusion Distillation via Deep Equilibrium Models
Zhengyang Geng*, Ashwini Pokle*, and J. Zico Kolter (*equal contribution)
In Neural Information Processing Systems (NeurIPS) 2023
TL;DR: Generative Equilibrium Transformer (GET) as a strong 1-step diffusion learner.
[Paper] [Code] [BibTex]
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Tianle Cai*, Yuhong Li*, Zhengyang Geng, Hongwu Peng, Tri Dao (*equal contribution)
TL;DR: Simple LLM acceleration with multiple decoding heads and self-verification.
[Report] [Blog] [Code]
Equilibrium Image Denoising With Implicit Differentiation
Qi Chen, Yifei Wang, Zhengyang Geng, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
In IEEE Transactions on Image Processing
TL;DR: Image denoising via equilibrium models trained with implicit differentiation.
[Paper] [BibTex]
Deep Equilibrium Approaches To Diffusion Models
Ashwini Pokle, Zhengyang Geng, and J. Zico Kolter
In Neural Information Processing Systems (NeurIPS) 2022
TL;DR: Parallel diffusion decoding via fixed point equations.
[Paper] [Code] [BibTex]
Eliminating Gradient Conflict in Reference-based Line-art Colorization
Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang
In Proceedings of the European Conference on Computer Vision (ECCV) 2022
TL;DR: Investigating and alleviating gradient conflicts in attention training.
[Paper] [Code] [BibTex]
Deep Equilibrium Optical Flow Estimation
Shaojie Bai*, Zhengyang Geng*, Yash Savani, J. Zico Kolter (*equal contribution)
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022
TL;DR: Equilibrium solving as flow estimation. SoTA zero-shot generalization.
[Paper] [Code] [BibTex]
On Training Implicit Models
Zhengyang Geng*, Xin-Yu Zhang*, Shaojie Bai, Yisen Wang, Zhouchen Lin (*equal contribution)
In Neural Information Processing Systems (NeurIPS) 2021
TL;DR: Cheap, fast, and stable inexact gradients work as well as implicit differentiation.
[Paper] [Slides] [Poster] [Code] [BibTex]
Residual Relaxation for Multi-view Representation Learning
Yifei Wang, Zhengyang Geng, Feng Jiang, Chuming Li, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
In Neural Information Processing Systems (NeurIPS) 2021
TL;DR: Equivariant contrastive learning replaces invariant contrastive learning.
[Paper] [Slides] [BibTex]
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng*, Meng-Hao Guo*, Hongxu Chen, Xia Li, Ke Wei, and Zhouchen Lin (*equal contribution)
In International Conference on Learning Representations (ICLR) 2021, top 3%.
TL;DR: Optimization (matrix decomposition) as attention.
[Paper] [Code] [Blog Series 1 (zh), 2 (zh), 3 (zh)] [Poster] [BibTex]