Online Learning of Sparse Network Architectures

Publication date

DOI

Document Type

Master Thesis

Collections

License

CC-BY-NC-ND

Abstract

Modern neural network architectures can have hundreds of millions of parameters. This makes them power- and memory-hungry and impedes running them on resource-constrained devices such as phones. Sparse networks can achieve performance similar to that of dense networks with a fraction of the parameters. However, sparsification is usually applied as an afterthought, so it brings no benefits during the learning phase. In this thesis, we propose to simultaneously optimise network architecture and parameters. Apart from the benefits of sparsity, this eliminates the need to choose a particular network architecture in advance.

Keywords

machine learning, graphs, architecture learning, metalearning, sparsification
