

Spotlight Poster

Theoretical Limitations of Ensembles in the Age of Overparameterization

Niclas Dern · John Cunningham · Geoff Pleiss

Thu 17 Jul 11 a.m. PDT — 1:30 p.m. PDT
 
Oral presentation: Oral 5E Learning Theory
Thu 17 Jul 10 a.m. PDT — 11 a.m. PDT

Abstract:

Classic ensembles generalize better than any single component model. In contrast, recent empirical studies find that modern ensembles of (overparameterized) neural networks may not provide any inherent generalization advantage over single (but larger) neural networks. This paper clarifies how modern overparameterized ensembles differ from their classic underparameterized counterparts, using ensembles of random feature (RF) regressors as a basis for developing theory. In contrast to the underparameterized regime, where ensembling typically induces regularization and improves generalization, we prove with minimal assumptions that infinite ensembles of overparameterized RF regressors become pointwise equivalent to (single) infinite-width RF regressors, and that finite-width ensembles rapidly converge to single models with the same parameter budget. These results, which are exact for ridgeless models and approximate for small ridge penalties, imply that overparameterized ensembles and single large models exhibit nearly identical generalization. We further characterize the predictive variance among ensemble members, demonstrating that it quantifies the expected effect of increasing capacity rather than capturing any conventional notion of uncertainty. Our results challenge common assumptions about the advantages of ensembles in overparameterized settings, prompting a reconsideration of how well intuitions from underparameterized ensembles transfer to deep ensembles and the overparameterized regime.
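
The central claim, that averaging an ensemble of overparameterized ridgeless RF regressors yields roughly the same predictor as a single RF regressor with the same total feature budget, can be checked numerically. The sketch below is a minimal illustration under assumed settings (ReLU random features, minimum-norm least squares, toy synthetic data); it is not the paper's experimental setup, and the dimensions, widths, and seeds are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 30, 5                      # training points and input dimension
X = rng.normal(size=(n, d))
y = np.sin(X @ rng.normal(size=d)) + 0.1 * rng.normal(size=n)
X_test = rng.normal(size=(200, d))

def rf_predict(X_tr, y_tr, X_te, n_features, rng):
    """Ridgeless (minimum-norm) least squares on random ReLU features."""
    W = rng.normal(size=(X_tr.shape[1], n_features)) / np.sqrt(X_tr.shape[1])
    Phi_tr = np.maximum(X_tr @ W, 0.0)
    Phi_te = np.maximum(X_te @ W, 0.0)
    beta = np.linalg.pinv(Phi_tr) @ y_tr   # min-norm interpolant (overparameterized when n_features > n)
    return Phi_te @ beta

K, p = 10, 200                    # K ensemble members, each with p random features (p > n)
ensemble_pred = np.mean(
    [rf_predict(X, y, X_test, p, rng) for _ in range(K)], axis=0
)
single_pred = rf_predict(X, y, X_test, K * p, rng)   # one model, same total parameter budget

# The theory predicts the averaged ensemble and the single wide model are
# nearly pointwise equivalent; this gap should be small and shrink as K, p grow.
print("mean |ensemble - single|:", np.abs(ensemble_pred - single_pred).mean())
```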
