Rethinking the Relationship between Recurrent and Non-Recurrent Neural Networks: A Study in Sparsity
CoRR (2024)
Abstract
Neural networks (NN) can be divided into two broad categories, recurrent and
non-recurrent. Both types of neural networks are popular and extensively
studied, but they are often treated as distinct families of machine learning
algorithms. In this position paper, we argue that there is a closer
relationship between these two types of neural networks than is normally
appreciated. We show that many common neural network models, such as Recurrent
Neural Networks (RNN), Multi-Layer Perceptrons (MLP), and even deep multi-layer
transformers, can all be represented as iterative maps.
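To make the iterative-map view concrete, here is a minimal sketch (ours, not taken from the paper; the step functions, dimensions, and weight layout are illustrative assumptions): an MLP iterates a step whose weights change with the layer index, while an RNN iterates the same kind of step with tied weights and an injected input at each step.

```python
import numpy as np

# Minimal sketch: both an MLP and an RNN can be written as repeated
# application of a step map h_{t+1} = f(h_t, t). For the MLP the "time"
# index selects per-layer (untied) weights; for the RNN the same weights
# are reused and an external input x_t is injected each step.
# All names and sizes here are illustrative assumptions.

rng = np.random.default_rng(0)
dim = 4
num_steps = 3

mlp_weights = [rng.standard_normal((dim, dim)) for _ in range(num_steps)]
rnn_W = rng.standard_normal((dim, dim))
rnn_U = rng.standard_normal((dim, dim))
inputs = [rng.standard_normal(dim) for _ in range(num_steps)]

def step_mlp(h, t):
    # Layer t of the MLP: untied weights indexed by "time".
    return np.tanh(mlp_weights[t] @ h)

def step_rnn(h, t):
    # RNN step: tied weights, plus an input at each step.
    return np.tanh(rnn_W @ h + rnn_U @ inputs[t])

def iterate(step, h0, num_steps):
    # Iterative-map view: repeatedly apply the step function.
    h = h0
    for t in range(num_steps):
        h = step(h, t)
    return h

h0 = rng.standard_normal(dim)
print("MLP as iterated map:", iterate(step_mlp, h0, num_steps))
print("RNN as iterated map:", iterate(step_rnn, h0, num_steps))
```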
The close relationship between RNNs and other types of NNs should not be
surprising. In particular, RNNs are known to be Turing complete, and therefore
capable of representing any computable function (including any other type of
NN), but herein we argue that the relationship runs deeper and is more
practical than this. For example, RNNs are often thought to be more difficult
to train than other types of NNs, since they are plagued by issues such as
vanishing or exploding gradients. However, as we demonstrate in this paper,
MLPs, RNNs, and many other NNs lie on a continuum, and this perspective leads
to several insights that illuminate both theoretical and practical aspects of
NNs.
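The vanishing/exploding-gradient issue can be phrased in the same iterative-map language: the gradient through an iterated map is a product of per-step Jacobians, so its norm tends to shrink or grow geometrically with depth. The short sketch below (our illustration under assumed scaled-orthogonal linear maps, not a construction from the paper) shows this directly.

```python
import numpy as np

# Illustration: through an iterated linear map h_{t+1} = W h_t, the
# gradient of h_T w.r.t. h_0 is the product of T identical Jacobians,
# i.e. W^T, so its spectral norm scales like s^T for a scaled
# orthogonal W with scale s. s < 1 vanishes, s > 1 explodes.
rng = np.random.default_rng(1)
dim, depth = 4, 50

for scale in (0.5, 1.5):
    Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))  # orthogonal matrix
    W = scale * Q
    J = np.linalg.matrix_power(W, depth)  # product of identical Jacobians
    print(f"scale={scale}: spectral norm of dh_T/dh_0 = {np.linalg.norm(J, 2):.3e}")
```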