Type of Document: Dissertation
Author: Wang, Xinqi
URN: etd-03082012-073140
Title: Semantically-Aware Data Discovery and Placement in Collaborative Computing Environments
Degree: Doctor of Philosophy (Ph.D.)
Department: Computer Science
Advisory Committee:
- Chen, Jianhua (Committee Chair)
- Kosar, Tevfik (Committee Co-Chair)
- Adkins, William A. (Committee Member)
- Allen, Gabrielle (Committee Member)
- Busch, Konstantin (Committee Member)
Keywords:
- Data Placement
- Distributed Computing
- Data Management
- Metadata Management
- Data Replication
- Cloud Computing
Date of Defense: 2012-01-27
Availability: unrestricted
Abstract:
As the size of scientific datasets and the demand for interdisciplinary collaboration grow in modern science, it becomes imperative that better ways of discovering and placing datasets generated across multiple disciplines be developed to facilitate interdisciplinary scientific research.
For discovering relevant data in large-scale interdisciplinary datasets, the development and integration of cross-domain metadata is critical, as metadata serves as the key guideline for organizing data. To develop and integrate cross-domain metadata management systems in an interdisciplinary collaborative computing environment, three key issues need to be addressed: the development of a cross-domain metadata schema; the implementation of a metadata management system based on this schema; and the integration of the metadata system into existing distributed computing infrastructure.
Current research on metadata management in distributed computing environments largely focuses on relatively simple schemas that lack the descriptive power to adequately address the semantic heterogeneity often found in interdisciplinary science. Moreover, current work does not adequately consider the issue of scalability in large-scale data management.
Another key issue in data management is data placement. Due to the increasing size of scientific datasets, the overhead incurred in transferring data among different nodes also grows into a significant factor inhibiting overall performance. Currently, few data placement strategies take into consideration semantic information concerning data content.
In this dissertation, we propose a cross-domain metadata system for a collaborative distributed computing environment, and we identify and evaluate the key factors and processes involved in a successful cross-domain metadata system, with the goal of facilitating data discovery in collaborative environments. This will allow researchers and users to conduct interdisciplinary science on large-scale datasets by making interdisciplinary datasets easier to access, lowering barriers to collaboration, and reducing the cost of future development of similar systems.
We also investigate data placement strategies that involve semantic information about the hardware and network environment, as well as domain information in the form of semantic metadata, so that semantic locality can be utilized in data placement, which could potentially reduce the overhead of accessing large-scale interdisciplinary datasets.
Filename: wang_diss.pdf
Size: 2.48 Mb
Approximate Download Time (Hours:Minutes:Seconds):
- 28.8 Modem: 00:11:28
- 56K Modem: 00:05:54
- ISDN (64 Kb): 00:05:09
- ISDN (128 Kb): 00:02:34
- Higher-speed Access: 00:00:13