FitNets: Hints for Thin Deep Nets

International Conference on Learning Representations (ICLR), 2015.

Abstract:

While depth tends to improve network performance, it also makes gradient-based training more difficult, since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network can imitate the soft output of a larger teacher network or ensemble of networks. This work extends that idea to train students that are deeper and thinner than the teacher, using not only the teacher's outputs but also its intermediate representations as hints to guide the student's training.
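
The two ingredients mentioned in the abstract, an intermediate-layer "hint" and output-level distillation, can be sketched in a few lines. This is a minimal illustration under assumed shapes and names (feature widths, number of classes, temperature T), not the authors' released code; the regressor mapping the student's guided layer into the teacher's hint space follows the general recipe described in the paper.

```python
# Minimal sketch of FitNets-style training (assumed shapes/names, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical activations: the teacher's hint layer is wider than the student's guided layer.
teacher_hint = torch.randn(32, 256)                           # teacher's intermediate "hint" features
student_guided = torch.randn(32, 128, requires_grad=True)     # student's intermediate "guided" features

# Regressor maps the thinner student features into the teacher's feature space.
regressor = nn.Linear(128, 256)

# Stage 1: hint-based pre-training -- L2 loss between regressed student features and the teacher hint.
hint_loss = F.mse_loss(regressor(student_guided), teacher_hint)

# Stage 2: knowledge distillation on the outputs -- student mimics the teacher's softened logits.
teacher_logits = torch.randn(32, 10)
student_logits = torch.randn(32, 10, requires_grad=True)
T = 4.0  # assumed temperature
kd_loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=1),
    F.softmax(teacher_logits / T, dim=1),
    reduction="batchmean",
) * (T * T)

print(float(hint_loss), float(kd_loss))
```

In a full training loop, the hint loss would first be minimized with respect to the student's lower layers and the regressor, after which the whole student is trained with the distillation (and task) loss.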
