Chapter XVIII Indexing Regional Objects in High-Dimensional Spaces

semanticscholar(2016)

Abstract
Many spatial access methods, such as the R-tree, have been designed to support spatial search operators (e.g., overlap, containment, and enclosure) over both points and regional objects in multi-dimensional spaces. Unfortunately, contemporary spatial access methods are limited by many problems that significantly degrade query performance in high-dimensional spaces. This chapter reviews the problems of contemporary spatial access methods in spaces with many dimensions and presents an efficient approach to building advanced spatial access methods that effectively attack these problems. It also discusses the importance of high-dimensional spatial access methods for emerging database applications, such as location-based services.

This paper appears in the publication, Advanced Topics in Database Research, vol. 5, edited by Keng Siau. © 2006, Idea Group Inc.

INTRODUCTION

There is a large body of literature on accessing data in high-dimensional spaces: Berchtold, Bohm, and Kriegel (1998); Berchtold, Keim, and Kriegel (1996); Lin, Jagadish, and Faloutsos (1995); Orlandic and Yu (2002); Sakurai, Yoshikawa, Uemura, and Kojima (2000); Weber, Schek, and Blott (1998); and White and Jain (1996). However, the proposed techniques almost always assume data sets representing points in the space. In many applications, effective representation of extended (regional) data is also important.

Regional data are usually associated with the low-dimensional spaces of geographic applications. However, due to approximation, aggregation, or clustering, such data may naturally appear in high-dimensional spaces as well.
For example, when the massive high-dimensional data of advanced scientific applications are clustered in files on tertiary storage, storage considerations often prevent the corresponding access structure from keeping the descriptors of all items in the repository. Instead, the content of each file can be approximated in the access structure by the minimal bounding rectangle (MBR) enclosing all data points in the given file (Orlandic, 2003). Similarly, in order to reduce the cost of dynamic updates, multi-dimensional databases of location-based services frequently approximate the position of a moving object by the bounded rectangle of a larger area in which the object currently resides. Since the position is usually only one of many relevant parameters describing a moving object, the index (access) structure appropriate for these environments must deal with regional data in spaces with possibly many dimensions. The storage and retrieval of regional data representing moving objects are discussed later in this chapter. Other applications in which regional objects naturally appear in high-dimensional spaces include multimedia and image-recognition systems. In these applications, objects are usually mapped onto long d-dimensional feature vectors. For the purposes of recognition, the feature vectors are projected onto a “reduced space” defined by c ≤ d principal components of the data (Swets & Weng, 1996). After populating the reduced space, images are grouped into classes, each of which can be represented by its approximate region and stored in a spatial access method. In order to identify the most likely class for the given object, the image recognition system must employ a form of spatial retrieval with a probabilistic ranking of the retrieved objects. 
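The MBR approximation mentioned above, in which the content of a file is represented by the smallest axis-aligned rectangle enclosing all of its data points, can be illustrated with a minimal sketch. The function name, the `(low, high)` corner representation, and the sample points are assumptions made for illustration only; they are not taken from the chapter.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: a rectangle is a pair (low corner, high corner).
Rect = Tuple[Tuple[float, ...], Tuple[float, ...]]


def minimal_bounding_rectangle(points: List[Sequence[float]]) -> Rect:
    """Return the smallest axis-aligned rectangle enclosing all points."""
    if not points:
        raise ValueError("at least one point is required")
    dims = len(points[0])
    if any(len(p) != dims for p in points):
        raise ValueError("all points must have the same dimensionality")
    # Per dimension, the MBR spans the minimum to the maximum coordinate.
    low = tuple(min(p[i] for p in points) for i in range(dims))
    high = tuple(max(p[i] for p in points) for i in range(dims))
    return low, high


# Illustrative 4-dimensional points standing in for the items of one file.
file_points = [
    (0.2, 0.9, 0.4, 0.1),
    (0.5, 0.3, 0.8, 0.6),
    (0.1, 0.7, 0.5, 0.2),
]
low, high = minimal_bounding_rectangle(file_points)
print(low, high)
```

An access structure that stores only `(low, high)` per file keeps two d-dimensional corners regardless of how many points the file holds, which is the storage saving the paragraph describes.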
Unlike point access methods (PAMs), spatial access methods (SAMs) are designed to support different search operators (e.g., overlap, containment, and enclosure) over both points and regional objects in multi-dimensional spaces (Gaede & Gunther, 1998). Unfortunately, contemporary SAMs are limited by many problems, including some conceptual flaws whose effects tend to grow more severe as dimensionality increases. These problems significantly degrade query performance in high-dimensional spaces. This chapter reviews the problems of contemporary SAMs and presents an efficient approach to building advanced SAM techniques that effectively attack the limitations of traditional spatial access methods in spaces with many dimensions. The approach is based on three complementary measures. Through a special kind of object transformation, the first measure addresses the conceptual flaws of previous SAMs. The second measure reduces the number of false drops into index pages that contain no object satisfying the query. The third measure addresses a structural degradation of the underlying index. The resulting technique, called the cQSF-tree, is not the ultimate achievement in the area of indexing regional data in high-dimensional spaces, but it effectively attacks these limitations. The results of an extensive experimental study (presented later in this chapter) show that the performance improvements also increase as data distributions become more skewed. In the experiments, the sQSF-tree (Yu, Orlandic, & Evens, 1999) and an optimized version of the R*-tree (Beckmann, Kriegel, Schneider, & Seeger, 1990; Papadias, Theodoridis, Sellis, & Egenhofer, 1995) are used as benchmarks for comparison.
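To make the three search operators named above concrete, the following sketch evaluates overlap, containment, and enclosure between a stored object and a query region, both modeled as axis-aligned hyper-rectangles. This excerpt does not define the operators precisely, so the sketch assumes a common reading: containment selects objects lying inside the query region, and enclosure selects objects that surround it. All function names and the rectangle representation are hypothetical, and the sketch shows only the predicates, not the cQSF-tree or any other index.

```python
from typing import Sequence, Tuple

# Hypothetical representation: a rectangle is a pair (low corner, high corner).
Rect = Tuple[Sequence[float], Sequence[float]]


def overlaps(obj: Rect, query: Rect) -> bool:
    """True if the two rectangles intersect in every dimension."""
    (o_lo, o_hi), (q_lo, q_hi) = obj, query
    return all(o_lo[i] <= q_hi[i] and q_lo[i] <= o_hi[i]
               for i in range(len(o_lo)))


def is_contained_in(obj: Rect, query: Rect) -> bool:
    """True if obj lies entirely inside query (assumed 'containment')."""
    (o_lo, o_hi), (q_lo, q_hi) = obj, query
    return all(q_lo[i] <= o_lo[i] and o_hi[i] <= q_hi[i]
               for i in range(len(o_lo)))


def encloses(obj: Rect, query: Rect) -> bool:
    """True if obj entirely surrounds query (assumed 'enclosure')."""
    return is_contained_in(query, obj)


def as_region(point: Sequence[float]) -> Rect:
    """Treat a point as a degenerate rectangle so one code path serves both."""
    return tuple(point), tuple(point)


query = ((0.2, 0.2, 0.2), (0.6, 0.6, 0.6))
small = ((0.3, 0.3, 0.3), (0.5, 0.5, 0.5))
large = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
apart = ((0.7, 0.7, 0.7), (0.9, 0.9, 0.9))

print(overlaps(small, query), is_contained_in(small, query))
print(encloses(large, query), overlaps(apart, query))
```

Representing a point as a rectangle whose corners coincide is one simple way to let the same predicates serve both points and regional objects, as the paragraph requires of a SAM.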