Presenter

Details

  • Date: Monday, September 25, 2023
  • Time: 12:00 PM
  • Location: EECS, room 2311

Abstract

Over the past decade, deep learning has proven to be a highly effective method for extracting meaningful features from high-dimensional data. This work attempts to unveil the mystery of feature learning in deep networks. Specifically, for a multi-class classification problem, we explore how the features of training data evolve across the intermediate layers of a trained neural network. We investigate this problem using simple deep linear networks trained on nearly orthogonal data, and we analyze how the output features in each layer concentrate around the means of their respective classes. Remarkably, when the deep linear network is trained using gradient descent from a small orthogonal initialization, we theoretically prove that within-class feature variability decays linearly as we move from shallow to deep layers. Moreover, our extensive experiments not only validate our theoretical findings numerically but also reveal a similar pattern in deep nonlinear networks, which aligns well with recent empirical studies.
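To make the quantity being tracked concrete, below is a minimal NumPy sketch of the setup the abstract describes; it is not the speaker's code. It trains a small deep linear network by gradient descent from a scaled orthogonal initialization on nearly orthogonal data, then reports, at every layer, the ratio of the within-class scatter to the between-class scatter, one common measure of within-class feature variability. The widths, depth, initialization scale, learning rate, and the MSE loss with one-hot targets are all illustrative assumptions, and the exact variability measure in the talk may differ.

```python
# Minimal sketch (illustrative assumptions throughout, not the speaker's code):
# deep linear network, gradient descent from a small orthogonal init,
# nearly orthogonal training data, and a per-layer within/between
# class-scatter ratio as the measure of within-class feature variability.
import numpy as np

rng = np.random.default_rng(0)

K, n, d, depth = 4, 16, 32, 6          # classes, samples/class, width, hidden layers
N = K * n

# Nearly orthogonal data: orthonormal class means plus small per-sample noise.
means = np.linalg.qr(rng.standard_normal((d, K)))[0]                  # d x K
X = np.repeat(means, n, axis=1) + 0.01 * rng.standard_normal((d, N))  # d x N
labels = np.repeat(np.arange(K), n)
Y = np.eye(K)[labels].T                                               # K x N one-hot

# Small orthogonal initialization: eps times a random orthogonal matrix.
eps = 0.5
Ws = [eps * np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(depth)]
Ws.append(eps * np.linalg.qr(rng.standard_normal((d, d)))[0][:K])     # K x d head

def forward(X):
    """Return the feature of every layer: Hs[l] is the input to layer l."""
    Hs = [X]
    for W in Ws:
        Hs.append(W @ Hs[-1])
    return Hs

# Plain gradient descent on the MSE loss 0.5 * ||H_last - Y||^2 / N.
lr = 0.3
for step in range(5000):
    Hs = forward(X)
    G = (Hs[-1] - Y) / N                   # dLoss / dH_last
    for l in range(len(Ws) - 1, -1, -1):
        grad_W = G @ Hs[l].T               # gradient for layer l's weights
        G = Ws[l].T @ G                    # backpropagate before updating
        Ws[l] -= lr * grad_W

def variability(H, labels):
    """trace(within-class scatter) / trace(between-class scatter)."""
    mu = H.mean(axis=1, keepdims=True)
    Sw = Sb = 0.0
    for k in range(K):
        Hk = H[:, labels == k]
        mk = Hk.mean(axis=1, keepdims=True)
        Sw += ((Hk - mk) ** 2).sum()
        Sb += Hk.shape[1] * ((mk - mu) ** 2).sum()
    return Sw / Sb

# After training, the ratio should shrink as we move toward deeper layers.
Hs = forward(X)
for l, H in enumerate(Hs[1:], start=1):
    print(f"layer {l}: within/between variability = {variability(H, labels):.3e}")
```

Under these assumptions, printing the ratio layer by layer makes the abstract's claim observable: the deeper the layer, the more tightly its features concentrate around their class means.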

Registration

Please fill out this survey to RSVP for pizza!


For any additional questions or inquiries, please contact us at speecs.seminar-requests@umich.edu