Automated database design for large-scale scientific applications

Automated database design for large-scale scientific applications (2007)

Abstract
The need for large-scale scientific data management is today more pressing than ever, as modern sciences need to store and process terabyte-scale data volumes. Traditional systems, which rely on filesystems and custom data access and processing code, do not scale to multi-terabyte datasets. Therefore, supporting today's data-driven sciences requires the development of new data management capabilities. This Ph.D. dissertation develops techniques that allow modern Database Management Systems (DBMSs) to efficiently handle large scientific datasets. Several recent successful DBMS deployments target applications, such as astronomy, that manage collections of objects or observations (e.g., galaxies, spectra) and can easily store their data in a commercial relational DBMS. Query performance for such systems critically depends on the database physical design, that is, the organization of database structures such as indexes and tables. This dissertation develops algorithms and tools for automating the physical design process. Our tools allow databases to tune themselves, providing efficient query execution in the presence of large data volumes and complex query workloads. For more complex applications dealing with multidimensional and time-varying data, standard relational DBMSs are inadequate. Efficiently supporting such applications requires the development of novel indexing and query processing techniques. This dissertation develops an indexing technique for unstructured tetrahedral meshes, a multidimensional data organization used in finite element analysis applications. Our technique outperforms existing multidimensional indexing techniques and has the advantage that it can easily be integrated with standard DBMSs, giving existing systems the ability to handle spatial data with only minor modifications.
Keywords
recent successful DBMS deployment, multidimensional data organization, large data volume, standard DBMS, large-scale scientific data management, large-scale scientific application, process terabyte-scale data volume, spatial data, new data management capability, custom data access, time-varying data, Automated database design