Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding

International Conference on Learning Representations (ICLR), 2015.

Cited by: 3531

Abstract:

Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, we introduce "deep compression", a three stage pipeline: pruning, trained quantization and Huffman coding, that work together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy.
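To illustrate the three stages named in the abstract, here is a minimal sketch (not the authors' code) on a toy weight matrix: magnitude pruning, k-means weight sharing (the clustering half of "trained quantization", without the retraining step), and Huffman coding of the cluster indices. All function names and parameters are illustrative.

```python
import heapq
from collections import Counter

import numpy as np


def prune(weights, threshold):
    """Stage 1: zero out weights whose magnitude falls below threshold."""
    mask = np.abs(weights) >= threshold
    return weights * mask, mask


def quantize(weights, mask, n_clusters, n_iters=20):
    """Stage 2: cluster the surviving weights into n_clusters shared values
    with 1-D k-means (linear initialization over the weight range)."""
    vals = weights[mask]
    centroids = np.linspace(vals.min(), vals.max(), n_clusters)
    for _ in range(n_iters):  # a few Lloyd iterations
        idx = np.argmin(np.abs(vals[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(idx == k):
                centroids[k] = vals[idx == k].mean()
    return idx, centroids


def huffman_code(symbols):
    """Stage 3: build a prefix-free Huffman code (symbol -> bitstring)."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single symbol
        return {next(iter(freq)): "0"}
    # Heap entries are [frequency, tiebreaker, partial code table].
    heap = [[f, i, {s: ""}] for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, [f1 + f2, counter, merged])
        counter += 1
    return heap[0][2]


# Toy demo: an 8x8 float32 layer compressed end to end.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)).astype(np.float32)
W_pruned, mask = prune(W, threshold=0.5)
idx, centroids = quantize(W_pruned, mask, n_clusters=4)
codes = huffman_code(idx.tolist())
coded_bits = sum(len(codes[s]) for s in idx.tolist())
```

In a real run the cluster centroids would then be fine-tuned by retraining (gradients of weights sharing a centroid are summed), which this sketch omits; the per-weight cost drops from 32 bits to the variable-length Huffman codes plus the small centroid table.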
