On the Universally Optimal Activation Function for a Class of Residual Neural Networks

AppliedMath (2022)

Abstract
While non-linear activation functions play vital roles in artificial neural networks, it is generally unclear how the non-linearity improves the quality of function approximations. In this paper, we present a theoretical framework to rigorously analyze the performance gain from using non-linear activation functions in a class of residual neural networks (ResNets). In particular, we show that when the input features of the ResNet are uniformly chosen and mutually orthogonal, generating the ResNet output with non-linear activation functions outperforms using linear activation functions on average, and the performance gain can be computed explicitly. Moreover, we show that when the activation functions are polynomials whose degree is much smaller than the dimension of the input features, the optimal activation functions can be expressed precisely in the form of Hermite polynomials. This demonstrates the role of Hermite polynomials in function approximation with ResNets.
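To make the abstract's setup concrete, below is a minimal NumPy sketch (not from the paper) of a residual block whose activation is a probabilists' Hermite polynomial. The block structure `x + activation(W x)`, the polynomial degree, and the weight scaling are all illustrative assumptions, chosen only to mirror the abstract's setting of orthogonal input features and a low-degree polynomial activation.

```python
import numpy as np

def hermite_activation(x, degree=3):
    """Apply the probabilists' Hermite polynomial He_degree elementwise.

    The paper states the optimal polynomial activations take the form of
    Hermite polynomials; the particular degree and normalization used
    here are illustrative choices, not the paper's prescription.
    """
    # Coefficient vector selecting only He_degree in the Hermite basis.
    coeffs = np.zeros(degree + 1)
    coeffs[degree] = 1.0
    return np.polynomial.hermite_e.hermeval(x, coeffs)

def residual_block(x, W, activation=hermite_activation):
    """One generic ResNet-style skip connection: x + activation(W x).

    The exact block structure analyzed in the paper may differ; this is
    a standard residual form used purely for illustration.
    """
    return x + activation(W @ x)

# Toy usage: feed in a unit basis vector, mirroring the abstract's
# assumption of mutually orthogonal input features.
rng = np.random.default_rng(0)
d = 8
W = rng.normal(scale=1.0 / np.sqrt(d), size=(d, d))
x = np.eye(d)[0]          # orthogonal to the other basis vectors
y = residual_block(x, W)
print(y.shape)            # (8,)
```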