Title page for ETD etd-1109103-231721


Type of Document Dissertation
Author Grant, Kevin Paul
URN etd-1109103-231721
Title Machine Learning Techniques for Efficient Query Processing in Knowledge Base Systems
Degree Doctor of Philosophy (Ph.D.)
Department Computer Science
Advisory Committee
Advisor Name Title
Jianhua Chen Committee Chair
Donald Kraft Committee Member
Robert Mathews Committee Member
Sukhamay Kundu Committee Member
Evangelos Triantaphyllou Dean's Representative
Keywords
  • machine learning
  • knowledge base systems
  • probabilistic heuristic estimates
  • query processing
Date of Defense 2003-10-16
Availability unrestricted
Abstract
In this dissertation we propose a new technique for efficient query processing in knowledge base systems. Query processing in such systems is computationally challenging because of combinatorial explosion: at any point during query processing there may be too many subqueries available for further exploration. Overcoming this difficulty requires effective mechanisms for choosing good subqueries from among them for further processing.

Inspired by existing work on stochastic logic programs, compositional modeling, and probabilistic heuristic estimates, we create a new, nondeterministic method for subquery selection during query processing. Specifically, we use probabilistic heuristic estimates to make the necessary decisions. This approach combines subquery and knowledge base properties and previous query processing experience with conditional probability theory to derive a probability of success for each subquery. These probabilities of success are used to select the next subquery for further processing. The underlying, property-specific probabilities of success are learned via a machine learning process from a set of training sample queries.
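
The selection idea described above can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the class name SuccessEstimator, the function select_subquery, the property labels, the Laplace smoothing, and the product rule for combining per-property estimates (which treats properties as independent) are all assumptions made for illustration. Per-property success rates are learned from training outcomes, combined into a heuristic score per subquery, and the next subquery is drawn at random in proportion to that score.

```python
import random
from collections import defaultdict


class SuccessEstimator:
    """Hypothetical helper: learns per-property success rates from training outcomes."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing            # Laplace smoothing constant (assumption)
        self.successes = defaultdict(float)   # property -> number of successful uses
        self.trials = defaultdict(float)      # property -> number of observed uses

    def observe(self, properties, succeeded):
        """Record one training outcome for a subquery with the given properties."""
        for prop in properties:
            self.trials[prop] += 1
            if succeeded:
                self.successes[prop] += 1

    def property_probability(self, prop):
        """Smoothed estimate of success given one property; 0.5 if never observed."""
        return (self.successes[prop] + self.smoothing) / (
            self.trials[prop] + 2 * self.smoothing
        )

    def subquery_score(self, properties):
        """Heuristic score: product of per-property estimates (independence assumed)."""
        score = 1.0
        for prop in properties:
            score *= self.property_probability(prop)
        return score


def select_subquery(estimator, candidates, rng=random):
    """Pick the next subquery at random, weighted by its estimated score.

    `candidates` maps a subquery name to its list of property labels.
    """
    names = list(candidates)
    weights = [estimator.subquery_score(candidates[name]) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    estimator = SuccessEstimator()
    # Training phase: outcomes of previously processed sample queries (made-up data).
    for ok in (True, True, True, False):
        estimator.observe(["ground_argument"], ok)
    for ok in (False, False, True):
        estimator.observe(["recursive_rule"], ok)

    # Query processing phase: choose among the currently available subqueries.
    candidates = {
        "q1": ["ground_argument"],
        "q2": ["recursive_rule"],
        "q3": ["ground_argument", "recursive_rule"],
    }
    print(select_subquery(estimator, candidates))
```

Drawing the subquery at random rather than always taking the highest score mirrors the nondeterministic character mentioned in the abstract; a greedy variant would simply take the candidate with the maximum score.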

In this dissertation we present our new methodology and the algorithms used to accomplish both the training and query processing phases of the system. We also present a method for determining the minimum training set size needed to achieve probability estimates whose maximum error stays within any desired limit.
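
The abstract does not state how the minimum training set size is derived. As a hedged illustration of what such a bound can look like, the sketch below uses Hoeffding's inequality, a standard result that is swapped in here and is not claimed to be the dissertation's method. The function name min_training_set_size and its parameters are hypothetical. For a single probability estimated as a success frequency, it returns the smallest n such that the estimate differs from the true value by more than max_error with probability at most 1 - confidence.

```python
import math


def min_training_set_size(max_error, confidence):
    """Smallest n with 2 * exp(-2 * n * max_error**2) <= 1 - confidence.

    Hoeffding bound for one frequency estimate (an assumption, see lead-in).
    """
    if not 0 < max_error < 1:
        raise ValueError("max_error must lie strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    delta = 1 - confidence  # allowed probability of exceeding max_error
    return math.ceil(math.log(2 / delta) / (2 * max_error ** 2))


if __name__ == "__main__":
    # Error of at most 0.05 with 95% confidence.
    print(min_training_set_size(0.05, 0.95))  # 738
```

The bound applies to one estimate at a time. When several property-specific probabilities are estimated from the same training set, a tighter per-estimate confidence is needed for the guarantee to hold for all of them simultaneously.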

Files
  Filename        Size        Approximate Download Time (Hours:Minutes:Seconds)
                              28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  Grant_dis.pdf   425.83 Kb   00:01:58     00:01:00    00:00:53       00:00:26        00:00:02
