About me

Hi! I’m a CS Ph.D. student in LocusLab at CMU, advised by Prof. Zico Kolter. Previously, I was a research assistant at ZERO Lab, School of AI, Peking University, working with Prof. Zhouchen Lin. Here is my CV. (Feel free to use this template!)


I have eclectic interests in machine learning and deep learning, especially the dynamics in deep learning and the dynamics of deep learning. I believe that structured decomposition, i.e., disentanglement, is key to understanding the emergence of intelligence, and that it can be elegantly formulated and achieved through dynamics.

  • For the dynamics in deep learning, I study differentiable programming, nested optimization, and implicit models as construction principles for neural networks.
    • This is the “forward” pass.
  • For the dynamics of deep learning, I try to understand the training dynamics of neural networks, especially the gradient issues that arise when networks are defined by dynamics. I am fascinated by their loss landscapes. A strong intuition drives me to believe that many problems in model design can be attributed to training.
    • This is the “backward” pass.

I am also interested in developing principled learning methods for scientific problems through dynamics.

Publications
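As a concrete illustration of such a dynamics-defined (implicit) model, here is a minimal sketch of a deep equilibrium layer whose output is the fixed point of an iterated map. The weight matrix, activation, iteration scheme, and tolerance are all illustrative assumptions, not code from any of the papers below.

```python
import numpy as np

def deq_forward(W, x, tol=1e-6, max_iter=100):
    """Solve the equilibrium equation z* = tanh(W @ z* + x)
    by naive fixed-point iteration (assumes the map is a contraction)."""
    z = np.zeros_like(x)
    for _ in range(max_iter):
        z_next = np.tanh(W @ z + x)
        if np.linalg.norm(z_next - z) < tol:
            break
        z = z_next
    return z_next

# Illustrative setup: scale W down so the map is contractive and converges.
rng = np.random.default_rng(0)
W = 0.25 * rng.standard_normal((8, 8)) / np.sqrt(8)
x = rng.standard_normal(8)
z_star = deq_forward(W, x)
residual = np.linalg.norm(z_star - np.tanh(W @ z_star + x))  # ~0 at equilibrium
```

In practice, the fixed point is found with faster solvers (e.g., Anderson acceleration or Broyden's method), and gradients flow through the equilibrium via implicit differentiation or cheap inexact approximations, which is the theme of several papers below.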


  • Deep Equilibrium Approaches To Diffusion Models
    Ashwini Pokle, Zhengyang Geng, J. Zico Kolter
    In Neural Information Processing Systems (NeurIPS) 2022
    TL;DR: Non-autoregressive diffusion model via a joint lower-triangular equilibrium process.
    [BibTex] [PDF] [Code]

  • Eliminating Gradient Conflict in Reference-based Line-art Colorization
    Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang
    In Proceedings of European Conference on Computer Vision (ECCV) 2022
    TL;DR: Avoid gradient conflicts in attention training.
    [BibTex] [PDF] [Code]

  • Deep Equilibrium Optical Flow Estimation
    Shaojie Bai*, Zhengyang Geng*, Yash Savani, J. Zico Kolter (*equal contribution)
    In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022
    TL;DR: Flow estimation as equilibrium solving, trained with inexact gradients and fixed-point correction.
    [BibTex] [PDF] [Code]

  • On Training Implicit Models
    Zhengyang Geng*, Xin-Yu Zhang*, Shaojie Bai, Yisen Wang, Zhouchen Lin (*equal contribution)
    In Neural Information Processing Systems (NeurIPS) 2021
    TL;DR: Cheap, fast, and stable inexact gradient works as well as exact implicit differentiation.
    [BibTex] [PDF] [Code] [Slides] [Poster]

  • Residual Relaxation for Multi-view Representation Learning
    Yifei Wang, Zhengyang Geng, Feng Jiang, Chuming Li, Yisen Wang, Jiansheng Yang, Zhouchen Lin
    In Neural Information Processing Systems (NeurIPS) 2021
    TL;DR: Equivariant contrastive learning replaces invariant contrastive learning.
    [BibTex] [PDF] [Code] [Slides]

  • Is Attention Better Than Matrix Decomposition?
    Zhengyang Geng*, Meng-Hao Guo*, Hongxu Chen, Xia Li, Ke Wei, Zhouchen Lin (*equal contribution)
    In International Conference on Learning Representations (ICLR) 2021, top 3%.
    TL;DR: Non-convex matrix recovery, trained with a 1-step gradient, serves as a global context layer.
    [BibTex] [PDF] [Code] [Blog Series 1 (zh), 2 (zh), 3 (zh)] [Poster]