Title page for ETD etd-05292008-103454


Type of Document Master's Thesis
Author Chen, Hongyi
URN etd-05292008-103454
Title Data Exploration by Using the Monotonicity Property
Degree Master of Science in Systems Science (M.S.S.S.)
Department Computer Science
Advisory Committee
  Advisor Name               Title
  Evangelos Triantaphyllou   Committee Chair
  Jianhua Chen               Committee Member
  Warren Liao                Committee Member
Keywords
  • Boolean function
  • binary system
  • weighted voting system
  • monotonicity property
  • probability
  • misclassification cost
  • classifier
  • data mining
Date of Defense 2008-05-12
Availability unrestricted
Abstract
Handling unequal misclassification costs is a long-standing problem in classification. Some algorithms, such as most rule induction methods, predict quite accurately when the misclassification costs are assumed to be the same for every class. However, when the misclassification costs change, as they commonly do in practice, these algorithms cannot adjust their results. Other algorithms, such as Bayesian methods, yield the probabilities that an unclassified example belongs to each of the given classes, which makes it possible to adjust the results to different misclassification costs. The shortcoming of these algorithms is that, when the misclassification costs are the same for every class, they do not generate the most accurate results.

This thesis attempts to combine the merits of both kinds of algorithms in one: a new algorithm that predicts relatively accurately and can also adapt to changes in misclassification costs.

The strategy of the new algorithm is to create a weighted voting system. Such a system evaluates the evidence that a new example belongs to each class, estimates the corresponding probabilities, and assigns the example to a class according to both the probabilities and the misclassification costs.
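
As a rough illustration of the idea in the preceding paragraph, the following Python sketch estimates a class probability from weighted binary votes and then picks the class with the lower expected misclassification cost. It is a generic minimum-expected-cost rule under stated assumptions, not the algorithm developed in the thesis: the function names `vote_probability` and `classify`, the cost parameters `cost_fp` and `cost_fn`, and the use of the weighted vote share as a probability estimate are all hypothetical choices made for this example.

```python
from typing import Sequence


def vote_probability(votes: Sequence[int], weights: Sequence[float]) -> float:
    """Estimate P(class 1) as the weighted share of voters that vote 1.

    Assumption for this sketch: each vote is 0 or 1 and weights are positive.
    """
    if len(votes) != len(weights):
        raise ValueError("votes and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for v, w in zip(votes, weights) if v == 1) / total


def classify(p1: float, cost_fp: float, cost_fn: float) -> int:
    """Return the class (0 or 1) with the lower expected misclassification cost.

    cost_fp: cost of predicting 1 when the true class is 0 (hypothetical name).
    cost_fn: cost of predicting 0 when the true class is 1 (hypothetical name).
    """
    expected_cost_if_1 = (1.0 - p1) * cost_fp  # wrong only if the true class is 0
    expected_cost_if_0 = p1 * cost_fn          # wrong only if the true class is 1
    return 1 if expected_cost_if_1 < expected_cost_if_0 else 0


if __name__ == "__main__":
    p = vote_probability([1, 0, 1], [2.0, 1.0, 1.0])
    print(p)                        # weighted share of votes for class 1
    print(classify(p, 1.0, 1.0))    # equal costs
    print(classify(p, 10.0, 1.0))   # false positives ten times as costly
```

With equal costs the rule follows the majority of the weighted votes; raising the cost of a false positive can flip the decision even though the estimated probability is unchanged, which is the kind of adjustment the abstract describes.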

The main problem in creating a weighted voting system is deciding the optimal weights of the individual votes. To solve this problem, we rely mainly on the monotonicity property. The monotonicity property has been found to exist not only in pure monotone systems but also in non-monotone systems. Since the study of the monotonicity property has been very successful for monotone systems, it is natural to apply it to non-monotone systems as well.
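
For readers unfamiliar with the term, a Boolean function f on binary vectors is commonly called monotone (non-decreasing) when x <= y componentwise implies f(x) <= f(y). The brute-force check below is a minimal sketch of that standard definition only. The name `is_monotone` is hypothetical, and the code does not reproduce how the thesis uses the property to choose vote weights.

```python
from itertools import product
from typing import Callable, Tuple

BinaryVector = Tuple[int, ...]


def is_monotone(f: Callable[[BinaryVector], int], n: int) -> bool:
    """Check by exhaustive comparison whether f is non-decreasing on {0,1}^n.

    Compares every ordered pair of points, so it is only practical for small n.
    """
    points = list(product((0, 1), repeat=n))
    for x in points:
        for y in points:
            # x <= y componentwise must imply f(x) <= f(y).
            if all(a <= b for a, b in zip(x, y)) and f(x) > f(y):
                return False
    return True


if __name__ == "__main__":
    print(is_monotone(lambda x: int(x[0] or x[1]), 2))  # logical OR
    print(is_monotone(lambda x: x[0] ^ x[1], 2))        # exclusive OR
```

Logical OR passes the check, while exclusive OR fails it because moving from (0, 1) up to (1, 1) lowers the output from 1 to 0.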

This thesis deals only with binary systems. Although such systems rarely exist in practice, this treatment provides concrete ideas for the development of general solution algorithms.

Once the final algorithm was formulated, it was tested on a wide range of randomly generated synthetic datasets and compared with other existing classifiers. The results indicate that the algorithm performs both effectively and efficiently.

Files
  Approximate download times are given as Hours:Minutes:Seconds.

  Filename          Size      28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  Chen_thesis.pdf   3.13 MB   00:14:29     00:07:26    00:06:31       00:03:15        00:00:16
