Let g be a slowly increasing function of locally bounded variation defined on R^c, 1 ≤ c ≤ d. We investigate when g can serve as the activation function of the hidden-layer units of three-layer neural networks that approximate continuous functions on compact sets. If the support of the Fourier transform of g includes a converging sequence of points with distinct distances from the origin, g can be an activation function without scaling. With scaling, g can be an activation function if and only if the support of its Fourier transform includes a point other than the origin. We also look for a condition under which an activation function can be used for approximation without rotation. Any nonpolynomial function can be an activation function with scaling, and many familiar functions, such as sigmoid functions and radial basis functions, can be activation functions without scaling. With or without scaling, some activation functions defined on R^d can be used without rotation even if they are not spherically symmetric.
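
To make the terms "scaling" and "rotation" concrete, the following is a sketch of the kind of approximant the abstract appears to describe. The notation (a_i, t_i, ω_i, λ_i, n, K) is assumed for illustration and is not taken from the paper; the exact parametrization used there may differ.

```latex
% Case c = 1: g is defined on R and acts on a one-dimensional projection of x.
% K is a compact subset of R^d, \omega_i is a unit vector, \lambda_i > 0 is a scale.
\[
  f(x) \;\approx\; \sum_{i=1}^{n} a_i \, g\bigl(\lambda_i \, \omega_i \cdot x + t_i\bigr),
  \qquad x \in K \subset \mathbb{R}^d,\quad
  \omega_i \in S^{d-1},\quad a_i, t_i \in \mathbb{R}.
\]
% "Without scaling" is read here as fixing \lambda_i = 1 for every unit.

% Case c = d: g is defined on R^d; "without rotation" is read here as
% allowing only dilations and shifts of g, with shifts t_i in R^d.
\[
  f(x) \;\approx\; \sum_{i=1}^{n} a_i \, g\bigl(\lambda_i x + t_i\bigr),
  \qquad t_i \in \mathbb{R}^d .
\]
% Without scaling and without rotation, only the shifts remain:
% f(x) \approx \sum_i a_i \, g(x + t_i).
```

Under this reading, the conditions on the support of the Fourier transform of g determine which of these restricted families is still dense in the continuous functions on K.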
