User:Mbeine
From REU@MU
About Me
(As of Summer 2011) I am a student at Marquette University majoring in Computer Science. I have completed two years at MU and plan to graduate at the end of the coming academic year. I am participating in the 2011 MSCS Summer REU program, working with Dr. Rong Ge on GPGPU research.
Project Page: Analysis of GPGPU Algorithms for Multiplying Large Matrices
Research Progress
At a minimum, progress updates will be posted here every Friday during the program.
Progress can be checked against the schedule on the wiki page for the project.
Week 1
5/31
- 2011 Summer REU program began
- Introductions and paperwork
6/1 and 6/2
- Took time to become familiar with the Slayer
- Read CUDA tutorials and sample programs
- Practiced coding, compiling, and running CUDA programs
6/3
- Read the papers on SUMMA and CUSUMMA: a quick first pass through each for familiarity, followed by a close reading of each.
- Began reading paper on Benchmarking GPUs
Week 2
6/6
- Reread SUMMA and CUSUMMA papers
- Discussed above papers with Dr. Rong Ge
- Read paper on Benchmarking GPUs and paper on Optimization Principles of a Multithreaded GPU
6/7
- Implemented a program to test the CUBLAS library function for matrix multiplication
- Debugged the program extensively and ran into linking issues
6/8
- Got the linker to work properly
- Worked on timing the execution
6/9
- Finished implementing timing system
- Collected data
- Lunch with other REU students
- Collected more data
6/10
- Began implementing the CUSUMMA algorithm. Spent most of the time on the getPartitionSize function, then started on the CUSUMMA routine itself.
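For readers unfamiliar with the algorithm, here is a minimal CPU-only sketch in C of the panel-by-panel accumulation that SUMMA is built on: the product is formed as a sequence of rank-b updates, one panel of A columns and B rows at a time. It assumes CUSUMMA keeps the same panel structure on the GPU, which this log does not spell out. The names panel_width and summa_like_multiply are hypothetical; in particular, panel_width is only a guess at the kind of job a function like getPartitionSize might do (choosing a panel width that fits a memory budget) and is not the actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (NOT the project's getPartitionSize): largest panel
 * width b such that one n-by-b panel of A plus one b-by-n panel of B fit
 * in budget_elems floats, clamped to the range [1, n]. */
size_t panel_width(size_t n, size_t budget_elems)
{
    assert(n > 0);
    size_t b = budget_elems / (2 * n);
    if (b < 1) b = 1;
    if (b > n) b = n;
    return b;
}

/* C = A * B for n-by-n row-major matrices, accumulated one panel at a time. */
void summa_like_multiply(const float *A, const float *B, float *C,
                         size_t n, size_t b)
{
    assert(b > 0 && b <= n);
    for (size_t i = 0; i < n * n; ++i)
        C[i] = 0.0f;

    for (size_t k0 = 0; k0 < n; k0 += b) {
        /* The last panel may be narrower than b. */
        size_t k1 = (k0 + b < n) ? k0 + b : n;

        /* Rank-(k1 - k0) update: C += A[:, k0:k1] * B[k0:k1, :] */
        for (size_t i = 0; i < n; ++i) {
            for (size_t k = k0; k < k1; ++k) {
                float a = A[i * n + k];
                for (size_t j = 0; j < n; ++j)
                    C[i * n + j] += a * B[k * n + j];
            }
        }
    }
}
```

In a GPU version, each pass of the outer loop would presumably correspond to copying one pair of panels to the device and running a multiply-accumulate there, which is why the panel width matters for both memory use and transfer cost.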
Week 3
6/13 and 6/14
- Programming CUSUMMA
6/15
- Collected run-time data for implementations of CUSUMMA
6/16
- Collected more data on CUSUMMA
- REU Lunch
- Attended required NSF presentation
6/17
- Debugged my implementation of CUSUMMA further
- Collected more data with the updated code
- Paper discussion with Brylow and his students
Week 4
6/20
- Modified CUSUMMA implementation
- Collected data
6/21
- Another modified CUSUMMA implementation
- Collected data
- Read PUMMA paper
6/22
- Collected more data
- Data analysis
- Began working with lmbench
6/23
- Continued trying to figure out lmbench
- REU Lunch
- Paper discussion with Brylow's lab group
6/24
- Continued work with lmbench
- Reread paper on Benchmarking GPUs
- Compared CUSUMMA data with CUBLAS baseline
Week 5
6/27
- Started working on lat_mem_rd for GPU
- Began preparing presentation for Thursday
6/28
- Worked on presentation
- Reread PUMMA paper
- Began implementing CUPUMMA
6/29
- Finished simple implementation of CUPUMMA
- Collected data
6/30
- Finalized presentation
- Presentations and lunch
- Worked on improving CUPUMMA
7/1
- Improved CUPUMMA implementation
- Collected data
- Paper discussion with Brylow's lab group
Week 6
7/5
- Data collection
- Implemented CUPUMMA with data packing, and collected more data
7/6
- Continued data collection
7/7 and 7/8
- Benchmarking memory transfers between CPU and GPU
- Implemented another CUPUMMA version and collected data
Week 7
- Collected more CUPUMMA data and noticed some unexpected behavior
- Worked to understand that behavior, and in the process found other unexpected behavior in CUSUMMA
- Worked to understand the CUSUMMA behavior as well
Week 8
7/18
- Continued investigating the unexpected CUPUMMA and CUSUMMA behavior
- Concluded that the CUSUMMA issue was only a matter of precision
- Still no explanation for the CUPUMMA behavior
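As an illustration of how a precision issue can appear in a partitioned algorithm, the following C sketch shows that single-precision addition is not associative: summing the same values in blocks can give a different result than summing them left to right. This is a generic example, not a reconstruction of the actual CUSUMMA issue, whose details are not recorded here. The function names sum_forward and sum_blocked are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum x[0..n) left to right in single precision. The volatile accumulator
 * forces every partial sum to be rounded to float. */
float sum_forward(const float *x, size_t n)
{
    assert(x != NULL);
    volatile float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc = acc + x[i];
    return acc;
}

/* Sum x[0..n) in blocks of width b, then add the block sums together,
 * the way a partitioned algorithm combines partial results. */
float sum_blocked(const float *x, size_t n, size_t b)
{
    assert(x != NULL && b > 0);
    volatile float total = 0.0f;
    for (size_t i0 = 0; i0 < n; i0 += b) {
        size_t i1 = (i0 + b < n) ? i0 + b : n;
        volatile float part = 0.0f;
        for (size_t i = i0; i < i1; ++i)
            part = part + x[i];
        total = total + part;
    }
    return total;
}
```

For the input {1e8f, 1.0f, -1e8f, 1.0f}, whose exact sum is 2, sum_forward returns 1 and sum_blocked with b = 2 returns 0, because adding 1 to a value of magnitude 1e8 is lost to rounding in single precision. Differences of this kind between two implementations do not by themselves indicate a bug, so results are usually compared with a tolerance rather than for exact equality.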
7/19
- Looked for BLAS functions for the CPU and found none on the Slayer
- Tested CUPUMMA some more
7/20-7/22
- Worked on overlapping communication and computation in CUSUMMA
- Installed ATLAS and used it for a CPU baseline
- Collected data on each
Week 9
- Poster
- More data collection
- Experimented with partition sizes
- Tried to install MAGMA
- Paper discussion