Type of Document Dissertation Author Sims, Gillian David URN etd-11102009-150925 Title Parallel Cloth Simulation Using OpenMP and CUDA Degree Doctor of Philosophy (Ph.D.) Department Human Ecology Advisory Committee
Advisor Name Title Kuttruff, Jenna T Committee Chair Belleau, Bonnie D Committee Member Book, Michael E Committee Member Chen, Jonathan Yan Committee Member O'Neil, Carol Elliot Committee Member Gillespie, Jeffrey M Dean's Representative Keywords
- Cloth Simulation
Date of Defense 2009-11-03 Availability unrestricted AbstractThe widespread availability of parallel computing architectures has lead to research regarding algorithms and techniques that best exploit available parallelism. In addition to the CPU parallelism available; the GPU has emerged as a parallel computational device. The goal of this study was to explore the combined use of CPU and GPU parallelism by developing a hybrid parallel CPU/GPU cloth simulation application. In order to evaluate the benefits of the hybrid approach, the application was first developed in sequential CPU form, followed by a parallel CPU form.
The application uses Backward Euler implicit time integration to solve the differential equations of motion associated with the physical system. The Conjugate Gradient (CG) algorithm is used to determine the solution vector for the system of equations formed by the Backward Euler approach. The matrix/vector, vector/vector, and vector/scalar operations required by CG are handled by calls to BLAS level 1 and level 2 functions. In the sequential CPU and parallel CPU versions, the Intel Math Kernel Library implementation of BLAS is used. In the hybrid parallel CPU/GPU version, the Nvidia CUDA based BLAS implementation (CUBLAS) is used. In the parallel CPU and hybrid implementations, OpenMP directives are used to parallelize the force application loop that traverses the list of forces acting on the system. Runtimes were collected for each version of the application while simulating cloth meshes with particle resolutions of 20x20, 40x40, and 60x60.
The performance of each version was compared at each mesh resolution. The level of performance degradation experienced when transitioning to the larger mesh sizes was also determined. The hybrid parallel CPU/GPU implementation yielded the highest frame rate for the 40x40 and 60x60 meshes. The parallel CPU implementation yielded the highest frame rate for the 20x20 mesh. The performance of the hybrid parallel CPU/GPU implementation degraded the least as it transitioned to the two larger mesh sizes. The results of this study will potentially lead to further research regarding the use of GPUs to perform the matrix/vector operations associated with the CG algorithm under more complex cloth simulation scenarios.
Filename Size Approximate Download Time (Hours:Minutes:Seconds)
28.8 Modem 56K Modem ISDN (64 Kb) ISDN (128 Kb) Higher-speed Access sims_diss.pdf 3.36 Mb 00:15:34 00:08:00 00:07:00 00:03:30 00:00:17
If you have questions or technical problems, please Contact LSU-ETD Support.