Title page for ETD etd-03082012-073140

Type of Document Dissertation
Author Wang, Xinqi
URN etd-03082012-073140
Title Semantically-Aware Data Discovery and Placement in Collaborative Computing Environments
Degree Doctor of Philosophy (Ph.D.)
Department Computer Science
Advisory Committee
Advisor Name Title
Chen, Jianhua Committee Chair
Kosar, Tevfik Committee Co-Chair
Adkins, William A. Committee Member
Allen, Gabrielle Committee Member
Busch, Konstantin Committee Member
  • Ontology
  • Data Placement
  • Distributed Computing
  • Data Management
  • Metadata Management
  • Data Replication
  • Cloud Computing
Date of Defense 2012-01-27
Availability unrestricted

As the size of scientific datasets and the demand for interdisciplinary collaboration grow in modern science, it becomes imperative that better ways of discovering and placing datasets generated across multiple disciplines be developed to facilitate interdisciplinary scientific research.

For discovering relevant data out of large-scale interdisciplinary datasets. The development and integration of cross-domain metadata is critical as metadata serves as the key guideline for organizing data. To develop and integrate cross-domain metadata management systems in interdisciplinary collaborative computing environment, three key issues need to be addressed: the development of a cross-domain metadata schema; the implementation of a metadata management system based on this schema; the integration of the metadata system into existing distributed computing infrastructure.

Current research in metadata management in distributed computing environment largely focuses on relatively simple schema that lacks the underlying descriptive power to adequately address semantic heterogeneity often found in interdisciplinary science. And current work does not take adequate consideration the issue of scalability in large-scale data management.

Another key issue in data management is data placement, due to the increasing size of scientific datasets, the overhead incurred as a result of transferring data among different nodes also grow into a significant inhibiting factor affecting overall performance. Currently, few data placement strategies take into consideration semantic information concerning data content.

In this dissertation, we propose a cross-domain metadata system in a collaborative distributed computing environment and identify and evaluate key factors and processes involved in a successful cross-domain metadata system with the goal of facilitating data discovery in collaborative environments. This will allow researchers/users to conduct interdisciplinary science in the context of large-scale datasets that will make it easier to access interdisciplinary datasets, reduce barrier to collaboration, reduce cost of future development of similar systems.

We also investigate data placement strategies that involve semantic information about the hardware and network environment as well as domain information in

the form of semantic metadata so that semantic locality could be utilized in data placement, that could potentially reduce overhead for accessing large-scale interdisciplinary datasets.

  Filename       Size       Approximate Download Time (Hours:Minutes:Seconds) 
 28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access 
  wang_diss.pdf 2.48 Mb 00:11:28 00:05:54 00:05:09 00:02:34 00:00:13

Browse All Available ETDs by ( Author | Department )

If you have questions or technical problems, please Contact LSU-ETD Support.