Title page for ETD etd-07052006-234928

Type of Document Master's Thesis
Author Tang, Qing
Author's Email Address qtang1@lsu.edu
URN etd-07052006-234928
Title Two-Dimensional Penalized Signal Regression for Hand Written Digit Recognition
Degree Master of Applied Statistics (M.Ap.Stat.)
Department Experimental Statistics
Advisory Committee
  Advisor Name         Title
  Brian D. Marx        Committee Chair
  James P. Geaghan     Committee Member
  Kevin S. McCarter    Committee Member
Keywords
  • handwritten digit recognition
  • USPS zip code
  • B-spline
  • P-spline signal regression (PSR)
Date of Defense 2006-05-08
Availability unrestricted
Abstract
Many attempts have been made to achieve successful recognition of handwritten digits. We report our results from applying a statistical method to handwritten digit recognition. A digitized handwritten numeral can be represented as a grayscale image. The image contains features that are mapped into a two-dimensional space with row and column coordinates. Based on this structure, two-dimensional penalized signal logistic regression (PSR) is applied to the recognition of handwritten digits.

The data set is taken from the USPS zip code database, which contains 7219 training images and 2007 test images. All images have been deslanted and normalized to 16 x 16 pixels with varying gray levels. The PSR method constructs a coefficient surface using a rich two-dimensional tensor-product B-spline basis, so that the surface is more flexible than needed. We then penalize the roughness of the coefficient surface with difference penalties on the coefficients associated with the rows and columns of the tensor-product B-splines. The optimal penalty weight is found within several minutes of iterative computation. A competitive overall recognition error rate of 8.97% on the test data set was achieved.
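
The construction described above can be illustrated with a short Python/NumPy sketch. It is not the thesis code: the basis size (5 segments, cubic), the second-order differences, the fixed penalty weights, the binary (one digit versus the rest) logistic setup, the synthetic images, and all function names are assumptions made for the example. The search for the optimal penalty weight is not shown.

```python
import numpy as np
from math import factorial

def bspline_basis(x, xl, xr, nseg=5, deg=3):
    """Equally spaced B-spline basis on [xl, xr], built from differences
    of truncated power functions; returns shape (len(x), nseg + deg)."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    diff = x[:, None] - knots[None, :]
    P = np.where(diff > 0, diff ** deg, 0.0)
    D = np.diff(np.eye(len(knots)), n=deg + 1, axis=0) / (factorial(deg) * dx ** deg)
    return (-1) ** (deg + 1) * (P @ D.T)

def psr_penalty(J, K, lam_r, lam_c, order=2):
    """Difference penalties along the rows and columns of the J x K
    coefficient array (flattened row-major)."""
    Dr = np.diff(np.eye(J), n=order, axis=0)
    Dc = np.diff(np.eye(K), n=order, axis=0)
    return (lam_r * np.kron(Dr.T @ Dr, np.eye(K))
            + lam_c * np.kron(np.eye(J), Dc.T @ Dc))

def fit_psr_logistic(X, y, Br, Bc, lam_r, lam_c, n_iter=25):
    """Penalized logistic signal regression fitted by Newton/IRLS steps.
    X: (n, rows, cols) images; y: 0/1 labels. Returns the intercept and
    the fitted coefficient surface of shape (rows, cols)."""
    n = X.shape[0]
    J, K = Br.shape[1], Bc.shape[1]
    T = np.kron(Br, Bc)                      # maps coefficients to pixels
    U = np.hstack([np.ones((n, 1)), X.reshape(n, -1) @ T])
    P = np.zeros((1 + J * K, 1 + J * K))     # intercept is left unpenalized
    P[1:, 1:] = psr_penalty(J, K, lam_r, lam_c)
    P += 1e-8 * np.eye(1 + J * K)            # tiny ridge for numerical safety
    theta = np.zeros(1 + J * K)
    for _ in range(n_iter):
        eta = U @ theta
        mu = 1.0 / (1.0 + np.exp(-np.clip(eta, -30, 30)))
        w = mu * (1.0 - mu)
        lhs = U.T @ (w[:, None] * U) + P
        rhs = U.T @ (w * eta + y - mu)
        theta = np.linalg.solve(lhs, rhs)
    alpha = theta[1:].reshape(J, K)
    return theta[0], Br @ alpha @ Bc.T

# Synthetic stand-in for 16 x 16 grayscale digits (NOT the USPS data).
rng = np.random.default_rng(0)
pix = np.arange(16.0)
Br = bspline_basis(pix, 0.0, 15.0)
Bc = bspline_basis(pix, 0.0, 15.0)
r, c = np.meshgrid(pix, pix, indexing="ij")
beta_true = 0.5 * np.exp(-((r - 8) ** 2 + (c - 8) ** 2) / 18.0)
X = rng.uniform(-1.0, 1.0, size=(500, 16, 16))
eta_true = np.tensordot(X, beta_true, axes=([1, 2], [0, 1]))
y = (rng.uniform(size=500) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

intercept, surface = fit_psr_logistic(X, y, Br, Bc, lam_r=1.0, lam_c=1.0)
score = intercept + np.tensordot(X, surface, axes=([1, 2], [0, 1]))
train_acc = float(((score > 0).astype(float) == y).mean())
```

In this sketch the 256 pixel weights are replaced by 64 basis coefficients, and the two penalty terms pull neighboring coefficients toward a smooth surface in the row and column directions separately. Larger values of `lam_r` and `lam_c` give a smoother surface.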

We also review an artificial neural network approach for comparison. PSR requires neither long learning time nor large memory resources. Another advantage of the PSR method is that our results are obtained on the original USPS data set without any further image preprocessing. We also found that the PSR algorithm coped well with the high diversity and variation that are two major features of handwritten digits.

  Filename          Size        Approximate Download Time (Hours:Minutes:Seconds)
                                28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  Tang_thesis.pdf   800.17 Kb   00:03:42     00:01:54    00:01:40       00:00:50        00:00:04
