Abstract
Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model’s architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-10 and CIFAR-100, STL-10, ModelNet10, and Imagenet32x32, achieving competitive performance with similarly-sized hand-designed networks.
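The mechanism the abstract describes, an auxiliary HyperNet that emits the main model's weights conditioned on an encoding of its architecture, so that many candidate architectures can be ranked by validation loss after a single training run, can be sketched in a few lines of PyTorch. Everything below (`ToyHyperNet`, the one-hot architecture code, the toy data) is a hypothetical, simplified illustration of the idea, not the authors' implementation.

```python
# Minimal sketch of the SMASH idea under toy assumptions: a one-hot
# architecture code and a single generated linear layer. All names here
# are illustrative, not from the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyHyperNet(nn.Module):
    """Maps an architecture encoding to the weights of a one-layer main model."""
    def __init__(self, num_archs, in_dim, out_dim):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        # One linear map from architecture code to a flat weight vector.
        self.generator = nn.Linear(num_archs, in_dim * out_dim)

    def forward(self, arch_code, x):
        # Generate the main model's weight matrix from the architecture code;
        # the main model itself holds no trainable parameters.
        w = self.generator(arch_code).view(self.out_dim, self.in_dim)
        return F.linear(x, w)

num_archs, in_dim, out_dim = 4, 32, 10
hyper = ToyHyperNet(num_archs, in_dim, out_dim)
opt = torch.optim.SGD(hyper.parameters(), lr=0.1)

# Training: sample a random architecture each step and backpropagate
# through the HyperNet via the generated weights.
for step in range(100):
    arch = torch.zeros(num_archs)
    arch[torch.randint(num_archs, (1,))] = 1.0  # random one-hot architecture
    x, y = torch.randn(8, in_dim), torch.randint(out_dim, (8,))
    loss = F.cross_entropy(hyper(arch, x), y)
    opt.zero_grad(); loss.backward(); opt.step()

# Search: rank candidate architectures by validation loss under generated
# weights, then retrain the winner from scratch with ordinary weights.
with torch.no_grad():
    x_val, y_val = torch.randn(64, in_dim), torch.randint(out_dim, (64,))
    scores = []
    for i in range(num_archs):
        arch = torch.zeros(num_archs); arch[i] = 1.0
        scores.append(F.cross_entropy(hyper(arch, x_val), y_val).item())
best = min(range(num_archs), key=lambda i: scores[i])
```

The key design point this toy version preserves is that ranking the `num_archs` candidates costs only forward passes, since the HyperNet supplies each candidate's weights; the expensive step, training weights, happens once.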
| Original language | English |
|---|---|
| Publication status | Published - May 2018 |
| Event | 6th International Conference on Learning Representations 2018, Vancouver Convention Center, Vancouver, Canada |
| Duration | 30 Apr 2018 → 3 May 2018 |
| Conference number | 6 |
| Internet address | https://iclr.cc/ |
Conference
| Conference | 6th International Conference on Learning Representations 2018 |
|---|---|
| Abbreviated title | ICLR 2018 |
| Country/Territory | Canada |
| City | Vancouver |
| Period | 30/04/18 → 3/05/18 |
| Internet address | https://iclr.cc/ |
Keywords
- Deep Learning
- Hypernetworks
- Transfer learning
- Benchmarking
ASJC Scopus subject areas
- Language and Linguistics
- Education
- Computer Science Applications
- Linguistics and Language