Supervised and unsupervised co-training of adaptive activation functions in neural nets

PSL'11: Proceedings of the First IAPR TC3 conference on Partially Supervised Learning (2011)

Cited by 6
Abstract
In spite of the nice theoretical properties of mixtures of logistic activation functions, standard feedforward neural networks with limited resources and gradient-descent optimization of the connection weights may fail in practice on several difficult learning tasks. Such tasks would be better faced by relying on a more appropriate, problem-specific basis of activation functions. The paper introduces a connectionist model which features adaptive activation functions. Each hidden unit in the network is associated with a specific pair (f(·), p(·)), where f(·) (the activation proper) is modeled via a specialized neural network, and p(·) is a probabilistic measure of the likelihood that the unit itself is relevant to the computation of the output over the current input. While f(·) is optimized in a supervised manner (through a novel backpropagation scheme for the target outputs which does not suffer from the traditional "vanishing gradient" phenomenon that occurs in standard backpropagation), p(·) is realized via a statistical parametric model learned through unsupervised estimation. The overall machine is implicitly a co-trained coupled model, where the topology chosen for learning each f(·) may vary on a unit-by-unit basis, resulting in a highly non-standard neural architecture.
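The abstract does not give the model in code, but a minimal sketch may help fix ideas. The sketch below assumes f(·) is a tiny one-hidden-layer MLP of a scalar pre-activation, p(·) is a univariate Gaussian density over that pre-activation fitted by maximum likelihood, and the unit emits the product p(z)·f(z); all three choices are illustrative assumptions (the paper lets the topology of f vary per unit and does not pin p or the combination rule to these forms), and the supervised scheme for training f is omitted here because the abstract only names it.

```python
import numpy as np

class AdaptiveUnit:
    """One hidden unit holding a pair (f, p), as described in the abstract.

    Hypothetical choices for illustration only:
      f: a one-hidden-layer tanh MLP acting as a learnable scalar activation.
      p: a univariate Gaussian over the unit's pre-activation, used as a
         relevance measure and fitted by unsupervised maximum likelihood.
    """

    def __init__(self, in_dim, f_hidden=8, rng=None):
        rng = rng or np.random.default_rng(0)
        self.w = rng.normal(scale=0.1, size=in_dim)    # input weights of the unit
        # Parameters of the inner activation network f: R -> R
        self.a = rng.normal(scale=0.5, size=f_hidden)  # first-layer weights
        self.b = np.zeros(f_hidden)                    # first-layer biases
        self.c = rng.normal(scale=0.5, size=f_hidden)  # output weights
        # Parameters of p: a univariate Gaussian over the pre-activation
        self.mu, self.sigma = 0.0, 1.0

    def f(self, z):
        """Adaptive activation: small MLP evaluated at the pre-activation z."""
        return float(np.tanh(self.a * z + self.b) @ self.c)

    def p(self, z):
        """Gaussian likelihood that this unit is relevant for pre-activation z."""
        return float(np.exp(-0.5 * ((z - self.mu) / self.sigma) ** 2)
                     / (self.sigma * np.sqrt(2.0 * np.pi)))

    def fit_p(self, zs):
        """Unsupervised estimation of p: ML fit of the Gaussian parameters."""
        self.mu, self.sigma = float(zs.mean()), float(zs.std()) + 1e-8

    def forward(self, x):
        z = float(self.w @ x)
        return self.p(z) * self.f(z)   # relevance-weighted activation
```

Under these assumptions a layer's output is simply the sum of the units' relevance-weighted activations, e.g. `y = sum(u.forward(x) for u in units)`; the co-training alternates supervised updates of each unit's f-parameters with unsupervised refits of p via `fit_p`.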
Keywords
difficult learning task, logistic activation function, non-standard neural architecture, unsupervised co-training, specialized neural network, adaptive activation function, statistical parametric model, activation function, standard feedforward neural network, hidden unit, connectionist model