About me

Hi! I’m a CS Ph.D. student at CMU advised by Zico Kolter. Previously, I was a research assistant at Peking University, advised by Zhouchen Lin. I also had a wonderful summer at Meta Reality Labs with Shaojie Bai working on generative avatar encoding.

I am an enthusiast of dynamics: recognizing, understanding, and developing the dynamics that self-organize complex systems.

Research

I have eclectic interests in machine learning and deep learning, especially when combined with dynamics. I believe that structured decomposition (perception) and reconstruction (generation) are key to understanding the emergence of general intelligence, which dynamics can elegantly achieve.

  • Regarding the “forward” pass, I study dynamical systems (fixed point equations, optimization, differential equations, etc.) as the construction method and learning principle in neural networks; a minimal sketch follows this list.
  • Regarding the “backward” pass, I pursue a better understanding of neural network training dynamics. I am fascinated by loss landscapes, and I am driven by a strong belief that the interaction between data/environment, model, and learning dynamics deserves careful investigation.
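
To make the “forward-pass” view concrete, here is a minimal, illustrative sketch (not code from any of the projects below; the TinyDEQ name, sizes, and tanh cell are made up) of a deep-equilibrium-style layer in PyTorch: the layer's output is the fixed point z* = f(z*, x), found by naive fixed-point iteration.

```python
# Illustrative sketch only: a deep-equilibrium-style layer whose forward pass
# is the fixed point z* = f(z*, x) of a weight-tied cell, solved by iteration.
import torch
import torch.nn as nn

class TinyDEQ(nn.Module):
    def __init__(self, dim: int = 64, max_iter: int = 30, tol: float = 1e-4):
        super().__init__()
        self.f = nn.Linear(2 * dim, dim)   # one weight-tied cell f(z, x)
        self.max_iter, self.tol = max_iter, tol

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = torch.zeros_like(x)            # start the iteration at z = 0
        for _ in range(self.max_iter):     # iterate z <- f(z, x) until it stops moving
            z_next = torch.tanh(self.f(torch.cat([z, x], dim=-1)))
            if (z_next - z).norm() < self.tol * (z.norm() + 1e-8):
                break                      # (approximate) equilibrium reached
            z = z_next
        return z_next                      # z* defines the layer's output

x = torch.randn(8, 64)
print(TinyDEQ()(x).shape)                  # torch.Size([8, 64])
```

Gradients in this sketch simply backpropagate through the unrolled iterations; implicit differentiation or inexact gradients (as in the projects below) avoid storing that whole trajectory.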

I am also interested in modeling and understanding nature through dynamics. The only constant is change. The invariance under change is truth. :D

Twitter

Projects

  • Consistency Models Made Easy
    Zhengyang Geng, William Luo, Ashwini Pokle, and J. Zico Kolter
    TL;DR: Easy Consistency Tuning via a self-teacher. :D
    [Blog] [Code] [BibTex]

  • TorchDEQ: A Library for Deep Equilibrium Models
    Zhengyang Geng and J. Zico Kolter
    TL;DR: Modern fixed-point systems in PyTorch.
    [Report] [Code] [Colab Tutorial] [Doc] [DEQ Zoo]

  • 1-Step Diffusion Distillation via Deep Equilibrium Models
    In Neural Information Processing Systems (NeurIPS) 2023
    Zhengyang Geng*, Ashwini Pokle*, and J. Zico Kolter (*equal contribution)
    TL;DR: The Generative Equilibrium Transformer (GET) as a strong one-step diffusion learner.
    [Paper] [Code] [BibTex]

  • Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
    Tianle Cai*, Yuhong Li*, Zhengyang Geng, Hongwu Peng, and Tri Dao
    TL;DR: Simple LLM acceleration with multiple decoding heads and self-verification.
    [Report] [Blog] [Code]

  • Equilibrium Image Denoising With Implicit Differentiation
    In IEEE Transactions on Image Processing
    Qi Chen, Yifei Wang, Zhengyang Geng, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
    TL;DR: Equilibrium image denoising with implicit differentiation.
    [Paper] [BibTex]

  • Deep Equilibrium Approaches To Diffusion Models
    In Neural Information Processing Systems (NeurIPS) 2022
    Ashwini Pokle, Zhengyang Geng, and J. Zico Kolter
    TL;DR: Parallel diffusion decoding via fixed point equations.
    [Paper] [Code] [BibTex]

  • Eliminating Gradient Conflict in Reference-based Line-art Colorization
    Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, and Yibo Yang
    In Proceedings of European Conference on Computer Vision (ECCV) 2022
    TL;DR: Investigating and alleviating gradient conflicts in attention training.
    [Paper] [Code] [BibTex]

  • Deep Equilibrium Optical Flow Estimation
    Shaojie Bai*, Zhengyang Geng*, Yash Savani, and J. Zico Kolter (*equal contribution)
    In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022
    TL;DR: Equilibrium solving as flow estimation. SoTA zero-shot generalization.
    [Paper] [Code] [BibTex]

  • On Training Implicit Models
    Zhengyang Geng*, Xin-Yu Zhang*, Shaojie Bai, Yisen Wang, and Zhouchen Lin (*equal contribution)
    In Neural Information Processing Systems (NeurIPS) 2021
    TL;DR: Cheap, fast, and stable inexact gradients work as well as implicit differentiation.
    [Paper] [Slides] [Poster] [Code] [BibTex]

  • Residual Relaxation for Multi-view Representation Learning
    Yifei Wang, Zhengyang Geng, Feng Jiang, Chuming Li, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
    In Neural Information Processing Systems (NeurIPS) 2021
    TL;DR: Equivariant contrastive learning replaces invariant contrastive learning.
    [Paper] [Slides] [BibTex]

  • Is Attention Better Than Matrix Decomposition?
    Zhengyang Geng*, Meng-Hao Guo*, Hongxu Chen, Xia Li, Ke Wei, and Zhouchen Lin (*equal contribution)
    In International Conference on Learning Representations (ICLR) 2021, top 3%.
    TL;DR: Optimization (matrix decomposition) as attention.
    [Paper] [Code] [Blog Series 1 (zh), 2 (zh), 3 (zh)] [Poster] [BibTex]