It is known that any target function is realized in a sufficiently small neighborhood of any randomly connected deep network, provided the width (the number of neurons in a layer) is sufficiently large. There are sophisticated analytical theories and discussions of this striking fact, but the rigorous theories are very complicated. We give an elementary geometrical proof using a simple model in order to elucidate its structure. We show that high-dimensional geometry plays a magical role: when we project a high-dimensional sphere of radius 1 onto a low-dimensional subspace, the uniform distribution over the sphere shrinks to a Gaussian distribution with negligibly small variances and covariances.
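The projection claim can be checked numerically. The following is a minimal sketch, not the paper's construction: it assumes the low-dimensional subspace is spanned by the first k coordinate axes (by rotational symmetry any k-dimensional subspace should behave the same way), and the function names, dimensions, and sample counts are hypothetical choices made for illustration. For a point uniform on the unit sphere in R^n, each coordinate has variance exactly 1/n and distinct coordinates are uncorrelated, so both quantities become negligible as n grows.

```python
import math
import random


def sample_unit_sphere(n, rng):
    """Draw one point uniformly from the unit sphere in R^n.

    Normalizing a vector of independent standard Gaussians gives a
    direction that is uniform on the sphere.
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def projected_moments(n, k=2, samples=2000, seed=0):
    """Estimate second moments of the projection onto the first k axes.

    Returns (variances, cov01): the per-coordinate second moments and the
    cross moment of coordinates 0 and 1. The mean is zero by symmetry, so
    these second moments estimate the variances and the covariance.
    Requires k >= 2.
    """
    rng = random.Random(seed)
    sq = [0.0] * k
    cross = 0.0
    for _ in range(samples):
        p = sample_unit_sphere(n, rng)[:k]  # orthogonal projection (assumed subspace)
        for i in range(k):
            sq[i] += p[i] * p[i]
        cross += p[0] * p[1]
    return [s / samples for s in sq], cross / samples


if __name__ == "__main__":
    for n in (10, 100, 400):
        variances, cov01 = projected_moments(n)
        print(n, variances, cov01, "theory: variance =", 1.0 / n)
```

Running the script prints the estimated variances next to the theoretical value 1/n for a few dimensions; the estimates should shrink in proportion to 1/n while the covariance stays near zero.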
