Presenter

Details

  • Date: Monday, March 4, 2024
  • Time: 12:00 PM
  • Location: EECS, room 2311

Abstract

Scaling up deep neural networks (DNNs) has revolutionized image and language tasks, but at the cost of substantial power consumption, memory footprint, and runtime. To reduce model size, structured matrices such as low-rank or block-sparse matrices can replace the dense weight matrices of DNNs. However, finding the optimal layer-wise structure and matrix-vector product (MVP) cost remains challenging due to the non-differentiable, combinatorial nature of the problem. This talk introduces a framework for differentiably learning a compactly structured matrix for each weight of a DNN. First, we propose the Generalized Block Low-Rank (GBLR) format, a broad family of structured matrices that includes many important layouts. In addition, the parameters that define the structure are made fully differentiable through a Gaussian-Dirichlet (Gaudi) function, so the combinatorial optimization problem is converted directly into a continuous one. We demonstrate that Proximal Gradient Descent finds previously unexplored structures and MVP costs for each Gaudi-GBLR weight matrix in a DNN. The proposed layer-wise differentiable structure learning yields a better accuracy-complexity trade-off than fixed, hand-designed schemes.
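
To give a sense of why block low-rank structure reduces MVP cost, below is a minimal PyTorch sketch, not the speaker's implementation: it builds a weight from a few low-rank blocks, each confined to a row/column sub-range, and multiplies it by a vector using only the small factors. The function name gblr_mvp and the block tuple layout are illustrative assumptions, not the GBLR format's actual parameterization.

    import torch

    def gblr_mvp(blocks, x, out_dim):
        """Multiply a block low-rank structured weight by a vector x.

        blocks: list of (row_start, col_start, U, V), where the block equals
                U @ V.T and occupies rows [row_start, row_start + U.shape[0])
                and columns [col_start, col_start + V.shape[0]).
        """
        y = torch.zeros(out_dim, dtype=x.dtype)
        for row_start, col_start, U, V in blocks:
            rows, cols = U.shape[0], V.shape[0]
            # Cheap MVP: project the input slice onto the rank-r subspace,
            # then expand back into the corresponding output slice.
            y[row_start:row_start + rows] += U @ (V.T @ x[col_start:col_start + cols])
        return y

    # Toy example: a 16x16 weight assembled from two rank-2 blocks.
    torch.manual_seed(0)
    blocks = [
        (0, 0, torch.randn(8, 2), torch.randn(8, 2)),   # top-left block
        (8, 4, torch.randn(8, 2), torch.randn(12, 2)),  # block with overlapping columns
    ]
    x = torch.randn(16)
    print(gblr_mvp(blocks, x, out_dim=16))

Each block costs O(r * (rows + cols)) per MVP instead of O(rows * cols) for a dense block; the talk's contribution is learning the block locations, sizes, and ranks differentiably rather than fixing them by hand.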

Registration

Please fill out this survey to RSVP for pizza!


For any additional questions or inquiries, please contact us at speecs.seminar-requests@umich.edu.