About me
Hi! I’m a CS Ph.D. student in LocusLab at CMU, advised by Prof. Zico Kolter. Previously, I was a research assistant at ZERO Lab, School of AI, Peking University, working with Prof. Zhouchen Lin. Here is my CV. (Feel free to use this template!)
Research
I have eclectic interests in machine learning and deep learning, especially the dynamics in deep learning and the dynamics of deep learning. I believe that structured decomposition, i.e., disentanglement, is a key to understanding the emergence of intelligence, and that it can be elegantly formulated and achieved through dynamics.
- For the dynamics in deep learning, I study differentiable programming, nested optimization, and implicit models as construction principles for neural networks (see the sketch after this list).
- This is the “forward” pass.
- For the dynamics of deep learning, I try to understand the training dynamics of neural networks, especially the gradient issues that arise when the networks are constructed by dynamics. I am fascinated by their loss landscapes, and I strongly suspect that many problems in model design can be attributed to training.
- This is the “backward” pass.
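To make the two bullets concrete, here is a minimal sketch (mine, not any particular paper's implementation) of a deep equilibrium layer: the forward pass runs dynamics to a fixed point, and naively backpropagating through the unrolled solver is exactly where the gradient question in the second bullet comes from. `f`, `z0`, and `num_iters` are placeholders.

```python
import torch

def equilibrium_forward(f, x, z0, num_iters=50, tol=1e-4):
    """Forward pass as dynamics: iterate z <- f(z, x) until the state
    stops moving, i.e., until z is (approximately) a fixed point of f."""
    z = z0
    for _ in range(num_iters):
        z_next = f(z, x)
        # stop once the relative residual falls below the tolerance
        if torch.norm(z_next - z) < tol * (torch.norm(z) + 1e-8):
            return z_next
        z = z_next
    return z
```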
I am also interested in developing principled learning methods for scientific problems through dynamics.
Publications
Deep Equilibrium Approaches To Diffusion Models
Ashwini Pokle, Zhengyang Geng, J. Zico Kolter
In Neural Information Processing Systems (NeurIPS) 2022
TL;DR: Non-autoregressive diffusion model via a joint lower-triangular equilibrium process.
[BibTex] [PDF] [Code]
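A rough sketch of the joint lower-triangular equilibrium above (my reading, not the paper's implementation; `step_fn`, its signature, and the plain sweep iteration are assumptions): stack all sampling states and update every timestep from the current estimate of its predecessor, instead of sampling sequentially.

```python
import torch

def joint_sampling_equilibrium(step_fn, x_T, T, num_solver_iters=30):
    """states[0] holds the initial noise; states[t] estimates the result
    of t denoising steps. Each sweep recomputes every timestep from the
    current estimates; since the dependency is lower triangular, plain
    iteration converges in at most T sweeps."""
    states = x_T.unsqueeze(0).repeat(T + 1, *([1] * x_T.dim()))
    for _ in range(num_solver_iters):
        new_states = [states[0]]
        for t in range(T):  # in practice these updates are batched in parallel
            new_states.append(step_fn(states[t], t))
        states = torch.stack(new_states)
    return states[-1]  # the fully denoised sample
```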
Eliminating Gradient Conflict in Reference-based Line-art Colorization
Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang
In Proceedings of European Conference on Computer Vision (ECCV) 2022
TL;DR: Avoid gradient conflicts in attention training.
[BibTex] [PDF] [Code]
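Purely illustrative of the theme, not this paper's method: one generic way to keep conflicting gradients from flowing through an attention map is a stop-gradient on the similarity weights.

```python
import torch

def detached_attention(q, k, v):
    """Generic sketch: compute attention weights as usual, but detach
    them before aggregating values, so gradients from the output do not
    flow back through the similarity map."""
    scale = q.shape[-1] ** 0.5
    attn = torch.softmax(q @ k.transpose(-2, -1) / scale, dim=-1)
    return attn.detach() @ v
```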
Deep Equilibrium Optical Flow Estimation
Shaojie Bai*, Zhengyang Geng*, Yash Savani, J. Zico Kolter (*equal contribution)
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022
TL;DR: Flow estimation formulated as equilibrium solving, trained with an inexact gradient and fixed-point correction.
[BibTex] [PDF] [Code]
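My hedged reading of "fixed point correction" (illustrative only; `decode`, the L1 loss, and the schedule are assumptions): supervise periodic intermediate solver estimates in addition to the final equilibrium, so the loss also shapes the solving trajectory.

```python
import torch
import torch.nn.functional as F

def correction_loss(f, x, z0, decode, target, num_iters=24, correct_every=8):
    """Accumulate a loss on intermediate fixed-point estimates as well
    as the final one; `decode` maps a solver state z to a flow
    prediction comparable with `target`."""
    z, loss = z0, 0.0
    for i in range(1, num_iters + 1):
        z = f(z, x)
        if i % correct_every == 0 or i == num_iters:
            loss = loss + F.l1_loss(decode(z), target)
    return loss
```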
On Training Implicit Models
Zhengyang Geng*, Xin-Yu Zhang*, Shaojie Bai, Yisen Wang, Zhouchen Lin (*equal contribution)
In Neural Information Processing Systems (NeurIPS) 2021
TL;DR: Cheap, fast, and stable inexact gradient works as well as exact implicit differentiation.
[BibTex] [PDF] [Code] [Slides] [Poster]
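A hedged sketch of the inexact-gradient idea (my simplification; the paper analyzes a broader family of such gradients): solve for the equilibrium without tracking gradients, then re-apply the layer a few times with gradients on, so backpropagation only touches those last steps.

```python
import torch

def equilibrium_with_inexact_grad(f, x, z0, num_iters=50, grad_steps=1):
    """Gradient-free fixed-point solve followed by `grad_steps`
    differentiable applications of f: a cheap, stable surrogate for
    exact implicit differentiation."""
    with torch.no_grad():
        z = z0
        for _ in range(num_iters):
            z = f(z, x)
    for _ in range(grad_steps):  # only these steps enter the autograd graph
        z = f(z, x)
    return z
```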
Residual Relaxation for Multi-view Representation Learning
Yifei Wang, Zhengyang Geng, Feng Jiang, Chuming Li, Yisen Wang, Jiansheng Yang, Zhouchen Lin
In Neural Information Processing Systems (NeurIPS) 2021
TL;DR: Equivariant contrastive learning replaces invariant contrastive learning.
[BibTex] [PDF] [Code] [Slides]
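An illustrative sketch of the shift from invariance to equivariance (my simplification; `encoder` and `predictor` are placeholders, and the actual method conditions the relaxation on the augmentation): rather than forcing the features of a view and its augmentation to coincide, let a learned residual bridge them.

```python
import torch
import torch.nn.functional as F

def equivariant_loss(encoder, predictor, x, x_aug):
    """Invariant contrastive learning asks z == z_aug; the equivariant
    relaxation instead asks z plus a learned residual to match z_aug."""
    z, z_aug = encoder(x), encoder(x_aug)
    pred = z + predictor(z)  # residual relaxation of strict invariance
    return 1.0 - F.cosine_similarity(pred, z_aug.detach(), dim=-1).mean()
```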
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng*, Meng-Hao Guo*, Hongxu Chen, Xia Li, Ke Wei, Zhouchen Lin (*equal contribution)
In International Conference on Learning Representations (ICLR) 2021, top 3%.
TL;DR: Non-convex matrix recovery, trained with a 1-step gradient, as a global context layer.
[BibTex] [PDF] [Code] [Blog Series 1 (zh), 2 (zh), 3 (zh)] [Poster]
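A hedged sketch of the 1-step gradient through matrix decomposition (shapes, the multiplicative-update NMF, and `nmf_context` are my illustrative choices): run the decomposition gradient-free, differentiate only through the final update, and use the low-rank reconstruction as global context.

```python
import torch

def nmf_context(X, D0, C0, num_iters=6, eps=1e-6):
    """X: (batch, d, n) non-negative features; D0: (batch, d, r) initial
    dictionary; C0: (batch, r, n) initial codes. Multiplicative updates
    run gradient-free; only the final update is differentiated."""
    D, C = D0, C0
    with torch.no_grad():
        for _ in range(num_iters - 1):
            C = C * (D.transpose(-2, -1) @ X) / (D.transpose(-2, -1) @ D @ C + eps)
            D = D * (X @ C.transpose(-2, -1)) / (D @ C @ C.transpose(-2, -1) + eps)
    # The 1-step gradient: backprop only touches this last update.
    C = C * (D.transpose(-2, -1) @ X) / (D.transpose(-2, -1) @ D @ C + eps)
    D = D * (X @ C.transpose(-2, -1)) / (D @ C @ C.transpose(-2, -1) + eps)
    return D @ C  # low-rank reconstruction used as global context
```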