Revision as of 06:25, 12 June 2015 by Brylow (Talk | contribs)


About Me

(As of Summer 2011) I am a student at Marquette University majoring in Computer Science. I have completed two years here at MU, and I plan to graduate at the end of the coming academic year. I am participating in the 2011 MSCS Summer REU program, working with Dr. Rong Ge doing research on GPGPU.

Project Page: Analysis of GPGPU Algorithms for Multiplying Large Matrices

Research Progress

At a minimum, progress updates will be posted here every Friday during the program.

Progress can be checked against the schedule on the wiki page for the project.

Week 1


  • 2011 Summer REU program began
  • Introductions and paperwork

6/1 and 6/2

  • Took time to become familiar with the Slayer
  • Read CUDA tutorials and sample programs
  • Practiced coding, compiling, and running CUDA programs


  • Read through the papers on SUMMA and CUSUMMA: first a quick read of each for familiarity, then a close reading of each.
  • Began reading paper on Benchmarking GPUs

Week 2


  • Reread SUMMA and CUSUMMA papers
  • Discussed above papers with Dr. Rong Ge
  • Read paper on Benchmarking GPUs and paper on Optimization Principles of a Multithreaded GPU


  • Implemented a program to test the CUBLAS library function for matrix multiplication
  • Did a lot of debugging, and ran into linking issues as well


  • Got the linker to work properly
  • Worked on timing the execution


  • Finished implementing timing system
  • Collected data
  • Lunch with other REU students
  • Collected more data
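The timing code itself is not shown on this page. As a rough, host-side sketch of how such a timing harness might look, the Python below repeats a run several times, keeps the best wall-clock time, and converts it to GFLOP/s. The names `matmul`, `time_best`, and `gflops` are hypothetical, and the naive `matmul` is only a stand-in for the CUBLAS call that was actually timed. On a GPU, the device would also need to be synchronized before the timer is stopped, since kernel launches return asynchronously.

```python
import time


def matmul(a, b):
    """Naive triple-loop product of two square lists of lists (stand-in workload)."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]
    return c


def time_best(fn, *args, repeats=3):
    """Run fn(*args) `repeats` times; return (best_seconds, last_result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)  # the minimum damps noise from other processes
    return best, result


def gflops(n, seconds):
    """An n x n by n x n multiply performs about 2 * n**3 floating-point operations."""
    return 2.0 * n ** 3 / seconds / 1e9
```

Taking the minimum over a few repeats is one common choice; reporting the mean or median instead is equally reasonable when the variance itself is of interest.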


  • Began implementing the CUSUMMA algorithm. Did a lot of work on the getPartitionSize function and started on the CUSUMMA implementation itself.
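This page does not say how getPartitionSize works. Purely as an assumption, one plausible job for such a function is to pick the largest square tile edge such that a fixed number of tiles (for example one each of A, B, and C) fits in the free device memory. The sketch below illustrates that idea only; the name `partition_size` and all of its parameters are hypothetical, not the actual implementation.

```python
import math


def partition_size(free_bytes, elem_bytes=4, tiles=3, multiple=16):
    """Largest tile edge p, rounded down to a multiple of `multiple`,
    such that tiles * p * p * elem_bytes <= free_bytes.

    All parameters are illustrative assumptions: 4-byte (single-precision)
    elements, three resident tiles, and alignment to 16 elements.
    """
    if free_bytes <= 0:
        return 0
    # p*p is an integer, so p*p <= free_bytes / (tiles*elem_bytes)
    # is equivalent to p*p <= the floor of that quotient.
    p = math.isqrt(free_bytes // (tiles * elem_bytes))
    return (p // multiple) * multiple
```

For instance, with 1 GiB free and the defaults above, `partition_size(1 << 30)` gives a tile edge of 9456 elements.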

Week 3

6/13 and 6/14

  • Programming CUSUMMA
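For context, the core idea of SUMMA is to build C = A·B as a sum of panel updates: for each block of k indices, a column panel of A is multiplied by the matching row panel of B and accumulated into C. The CPU-only sketch below shows just that accumulation pattern. The function name `summa_like_multiply` and the `panel` parameter are hypothetical, and nothing here reflects how the CUSUMMA code on this project actually moves data to or from the GPU.

```python
def summa_like_multiply(a, b, panel):
    """Compute C = A * B by accumulating one k-panel update at a time.

    a is m x k, b is k x n (lists of lists); panel is the panel width (>= 1).
    """
    m, k_total, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for k0 in range(0, k_total, panel):
        k1 = min(k0 + panel, k_total)
        # A GPU version would transfer A[:, k0:k1] and B[k0:k1, :] here
        # and run the update on the device (an assumption, not shown).
        for i in range(m):
            row_a = a[i]
            row_c = c[i]
            for k in range(k0, k1):
                aik = row_a[k]
                row_b = b[k]
                for j in range(n):
                    row_c[j] += aik * row_b[j]
    return c
```

The result does not depend on the panel width in exact arithmetic, so the width can be tuned for memory or transfer reasons alone.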


  • Collected run-time data for implementations of CUSUMMA


  • Collected more data on CUSUMMA
  • REU Lunch
  • Attended required NSF presentation


  • Debugged my implementation of CUSUMMA some more
  • Collected more data with the updated code
  • Paper discussion with Brylow and his students

Week 4


  • Modified CUSUMMA implementation
  • Collected data


  • Another modified CUSUMMA implementation
  • Collected data
  • Read PUMMA paper


  • Collected more data
  • Data analysis
  • Began working with lmbench


  • Continued trying to figure out lmbench
  • REU Lunch
  • Paper discussion with Brylow's lab group


  • Continued work with lmbench
  • Reread paper on Benchmarking GPUs
  • Compared CUSUMMA data with CUBLAS baseline

Week 5


  • Started working on lat_mem_rd for GPU
  • Began preparing presentation for Thursday


  • Worked on presentation
  • Reread PUMMA paper
  • Began implementing CUPUMMA


  • Finished simple implementation of CUPUMMA
  • Collected data


  • Finalized presentation
  • Presentations and lunch
  • Worked on improving CUPUMMA


  • Improved CUPUMMA implementation
  • Collected data
  • Paper discussion with Brylow's lab group

Week 6


  • Data collection
  • Implemented CUPUMMA with data packing, and collected more data


  • Continued data collection

7/7 and 7/8

  • Benchmarking memory transfers between CPU and GPU
  • Implemented another CUPUMMA version and collected data

Week 7

  • Collected more CUPUMMA data and noticed some strange behavior
  • Worked to understand the strange behavior and, in the process, found other unexpected behavior in CUSUMMA
  • Worked to understand that, too

Week 8


  • Still trying to figure out the issues
  • Ultimately concluded that the CUSUMMA issue was just a precision issue
  • Still no explanation for the CUPUMMA behavior
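The page gives no detail on the precision issue. One common way such an issue shows up, offered here only as an assumption, is that single-precision sums depend on the order in which terms are accumulated, so two correct implementations can disagree in the last bits. The sketch below emulates single precision with the standard `struct` module; the helper names `f32`, `dot_f32`, and `close` are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_f32(xs, ys, order):
    """Single-precision dot product, accumulating terms in the given index order."""
    acc = 0.0
    for i in order:
        term = f32(f32(xs[i]) * f32(ys[i]))
        acc = f32(acc + term)  # round after every addition, as float hardware would
    return acc


def close(a, b, rel_tol=1e-5, abs_tol=1e-6):
    """Tolerance-based comparison; the tolerances are illustrative choices."""
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


xs = [1.0, 2.0 ** -24, 2.0 ** -24]
ys = [1.0, 1.0, 1.0]
forward = dot_f32(xs, ys, [0, 1, 2])   # small terms are absorbed by 1.0
backward = dot_f32(xs, ys, [2, 1, 0])  # small terms combine first, then survive
```

Here `forward` and `backward` differ in the last bit even though both are valid single-precision results, which is why comparing against a reference with a tolerance such as `close` is usually preferable to testing for exact equality.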


  • Looking for BLAS functions for CPU - found none on the Slayer
  • Trying to test CUPUMMA some more


  • Overlapping communication and computation in CUSUMMA
  • Installing ATLAS and using it for a CPU baseline
  • Collecting data on each

Week 9

  • Poster
  • More data collection
  • Experimenting with partition sizes
  • Trying to install MAGMA
  • Paper Discussion