Smart Sampling and Optimal Dimensionality Reduction of Big Data Using Compressed Sensing

Anastasios Maronidis, Elisavet Chatzilari, Spiros Nikolopoulos, Ioannis Kompatsiaris

Studies in Big Data (2016)

Abstract

Handling big data poses a huge challenge to the computer science community. Some of the most appealing research domains, such as machine learning, computational biology and social networks, are now overwhelmed with large-scale databases that need computationally demanding manipulation. Several techniques have been proposed for dealing with big data processing challenges, including computationally efficient implementations such as parallel and distributed architectures, but most approaches benefit from a dimensionality reduction and smart sampling step applied to the data. In this context, through a series of groundbreaking works, Compressed Sensing (CS) has emerged as a powerful mathematical framework providing a suite of conditions and methods that allow for almost lossless and efficient data compression. The most surprising outcome of CS is the proof that random projections qualify as a close-to-optimal choice for transforming high-dimensional data into a low-dimensional space in a way that allows for their almost perfect reconstruction. Its compression power, along with its simplicity of use, renders CS an appealing method for optimal dimensionality reduction of big data. Although CS is renowned for its capability of providing succinct representations of the data, in this chapter we investigate its potential as a dimensionality reduction technique in the domain of image annotation. More specifically, our aim is first to present the challenges stemming from the nature of big data problems, then to explain the basic principles, advantages and disadvantages of CS, and finally to identify potential ways of exploiting this theory in the domain of large-scale image annotation. To this end, a novel Hierarchical Compressed Sensing (HCS) method is proposed. The new method dramatically decreases the computational complexity while displaying robustness equal to that of the typical CS method. In addition, the connection between the sparsity level of the original dataset and the effectiveness of HCS is established through a series of artificial experiments. Finally, the proposed method is compared with the state-of-the-art dimensionality reduction technique of Principal Component Analysis. The performance results are encouraging, indicating a promising potential of the new method in large-scale image annotation.
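
The random-projection idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's HCS method: it only shows, under assumed settings, that a Gaussian random matrix maps sparse high-dimensional vectors to a much lower dimension while roughly preserving their pairwise distances. The function names, the dimensions (n = 1000, m = 400, k = 20) and the seed are all hypothetical choices made for illustration, and no reconstruction step is attempted.

```python
import math
import random


def gaussian_projection_matrix(m, n, rng):
    """Return an m x n matrix with i.i.d. N(0, 1/m) entries (assumed scaling)."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(m)]


def project(matrix, x):
    """Compute the matrix-vector product matrix @ x with plain Python lists."""
    return [sum(a * v for a, v in zip(row, x)) for row in matrix]


def sparse_vector(n, k, rng):
    """Return a length-n vector with exactly k nonzero Gaussian entries."""
    x = [0.0] * n
    for idx in rng.sample(range(n), k):
        x[idx] = rng.gauss(0.0, 1.0)
    return x


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical sizes: ambient dimension n, reduced dimension m, sparsity k.
rng = random.Random(0)
n, m, k = 1000, 400, 20
num_points = 8

A = gaussian_projection_matrix(m, n, rng)
points = [sparse_vector(n, k, rng) for _ in range(num_points)]
projected = [project(A, p) for p in points]

# Ratio of projected distance to original distance for every pair of points.
ratios = [
    distance(projected[i], projected[j]) / distance(points[i], points[j])
    for i in range(num_points)
    for j in range(i + 1, num_points)
]

print("reduced dimension:", len(projected[0]))
print("min ratio: %.3f, max ratio: %.3f" % (min(ratios), max(ratios)))
```

Ratios close to 1 indicate that the geometry of the sparse data survives the projection, which is the property that makes random projections attractive as a data-independent alternative to a learned transform such as Principal Component Analysis.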

Keywords

Smart sampling, Optimal dimensionality reduction, Compressed Sensing, Sparse representation, Scalable image annotation
