YINS Seminar Archives: Boaz Barak (Nov. 11, 2020)

Talk Summary: 

 “Understanding generalization requires rethinking deep learning?”

Speaker: Boaz Barak, Harvard University
In classical statistical learning theory, we can place bounds on the generalization gap: the difference between a learned classifier's empirical performance on its training set and its population performance on unseen test examples. Such bounds are not only hard to prove for deep learning; there is empirical evidence that they simply do not hold, and deep-learning algorithms can in fact have non-vanishing generalization gaps.
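For concreteness, one way to write the quantity being bounded (our notation, not taken from the talk): for a classifier f trained on n labeled samples drawn i.i.d. from a distribution D with loss ell,

\mathrm{gen\mbox{-}gap}(f) \;=\; \mathbb{E}_{(x,y)\sim \mathcal{D}}\big[\ell(f(x),y)\big] \;-\; \frac{1}{n}\sum_{i=1}^{n} \ell\big(f(x_i),y_i\big),

i.e., the population risk minus the empirical risk on the training set.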
 
In this talk we will see that there is a variant of supervised deep learning that does have small generalization gaps, both in practice and in theory. This variant comprises "Self-Supervised + Simple fit" (SSS) algorithms, which are obtained by first using self-supervision to learn a complex representation of the (label-free) training data, and then fitting a simple (e.g., linear) classifier to the labels. Such classifiers have become increasingly popular in recent years, as they offer several practical advantages and have been shown to approach state-of-the-art results.
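As an illustration of the two-stage SSS recipe, here is a minimal sketch in Python; the random-projection "encoder", the synthetic data, and the hyperparameters are stand-ins for illustration only, not the models or datasets from the talk.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stage 1 (self-supervised representation): in practice this would be a pretrained
# encoder such as SimCLR; here a fixed nonlinear random projection stands in for
# the learned, frozen representation.
def encode(x, proj):
    return np.tanh(x @ proj)

# Toy labeled data standing in for, e.g., CIFAR-10 images flattened to vectors.
n_train, n_test, dim, rep_dim, n_classes = 1000, 200, 3072, 256, 10
proj = rng.normal(size=(dim, rep_dim)) / np.sqrt(dim)
X_train, X_test = rng.normal(size=(n_train, dim)), rng.normal(size=(n_test, dim))
y_train, y_test = rng.integers(0, n_classes, n_train), rng.integers(0, n_classes, n_test)

# Stage 2 (simple fit): a linear classifier trained on top of the frozen representation.
clf = LogisticRegression(max_iter=1000).fit(encode(X_train, proj), y_train)

# The quantity studied in the talk: train accuracy minus test accuracy.
train_acc = clf.score(encode(X_train, proj), y_train)
test_acc = clf.score(encode(X_test, proj), y_test)
print(f"generalization gap = {train_acc - test_acc:.3f}")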
 
We show that (under assumptions described below) the generalization gap of such classifiers tends to zero as long as the complexity of the simple classifier is asymptotically smaller than the number of training samples. Our bound is independent of the complexity of the representation, which can use an arbitrarily large number of parameters. Our bound holds assuming that the learning algorithm satisfies certain noise-robustness (adding a small amount of label noise causes only a small degradation in performance) and rationality (getting the wrong label is not better than getting no label at all) properties. These conditions hold widely across many standard architectures. We complement this result with an empirical study, demonstrating that the generalization gap is in fact small in practice and that our bound is non-vacuous for many popular representation-learning-based classifiers on CIFAR-10 and ImageNet, including SimCLR, AMDIM, and BigBiGAN.
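Schematically, and only as our paraphrase of the abstract rather than the exact theorem statement, the bound controls the generalization gap of an SSS classifier by the two assumption terms plus a term that depends only on the complexity C of the simple classifier and the number of labeled samples n:

\mathrm{gen\mbox{-}gap} \;\lesssim\; \varepsilon_{\mathrm{robustness}} \;+\; \varepsilon_{\mathrm{rationality}} \;+\; O\!\left(\sqrt{C/n}\right),

so the gap vanishes whenever C = o(n), independently of the size of the representation network.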
 
The talk will not assume any specific background in machine learning, and should be accessible to a general mathematical audience. Joint work with Yamini Bansal and Gal Kaplun.
 
This presentation was part of the YINS Weekly Seminar Series and was presented on Wednesday, November 11, 2020. 
Speaker: 
Boaz Barak (Harvard University)
Bio: 
Gordon McKay Professor of Computer Science
Harvard John A. Paulson School of Engineering and Applied Sciences