| Type of Document | Master's Thesis |
| Author | Guidry, Jamie Allison |
| URN | etd-11142012-040550 |
| Title | Improving Discourse Structure Identification |
| Degree | Master of Science in Engineering Science (M.S.E.S.) |
| Department | Engineering Science (Interdepartmental Program) |
| Advisory Committee |
| Advisor Name | Title |
| Knapp, Gerald | Committee Chair |
| Harvey, Craig | Committee Member |
| Ikuma, Laura | Committee Member |
| Keywords | discourse; rhetorical structure theory; natural language processing; semantic analysis; parser |
| Date of Defense | 2012-11-09 |
| Availability | restricted |
Abstract
Rhetorical Structure Theory (Mann et al. 1988), a popular approach for analyzing discourse coherence, suggests that coherent text can be placed into a hierarchical organization of clauses. Identification of a text’s rhetorical structure through automatic discourse analysis is a crucial element for many of today’s Natural Language Processing tasks, but no adequate tool is available. The current state-of-the-art discourse parser, SPADE (Soricut et al. 2003), is limited to parsing discourse within a single sentence. HILDA (Hernault et al. 2010) extends the parsing abilities of SPADE to the document level, but with a decrease in performance.
This study achieved document-level discourse parsing without sacrificing performance. Given text already segmented into elementary discourse units, the task of discourse parsing was separated into three steps: structuring, nuclearity labeling, and relation labeling. An algorithm was developed for classifying relation existence, nuclearity, and relation label that improved upon previous methods. New features were explored for all three steps to maintain state-of-the-art performance when parsing at the document level.
|
| Files (approximate download times in Hours:Minutes:Seconds) |
| Filename | Size | 28.8 Modem | 56K Modem | ISDN (64 Kb) | ISDN (128 Kb) | Higher-speed Access |
| ![[LSU]](http://etd.lsu.edu/images/restricted.gif) Guidry_thesis.pdf | 3.25 Mb | 00:15:03 | 00:07:44 | 00:06:46 | 00:03:23 | 00:00:17 |
![[LSU]](http://etd.lsu.edu/images/restricted.gif) indicates that a file or directory is accessible from the LSU campus network only.