Parallel Programming (CS 4230)
Fall 2012
This course is a comprehensive exploration of parallel programming paradigms, examining core concepts, focusing on a subset of widely used contemporary parallel programming models, and providing context with a small set of parallel algorithms. In the last few years, this area has been the subject of significant interest due to a number of factors. Most significantly, the advent of multi-core microprocessors has made parallel computing available to the masses. At the high end, major vendors of large-scale parallel systems, including IBM and Cray, have recently introduced new parallel programming languages designed for applications that exploit tens of thousands of processors. Embedded devices can also be thought of as small multiprocessors. The convergence of these distinct markets offers an opportunity to finally provide application programmers with a productive way to express parallel computation.
The course will be structured as lectures, homeworks, programming assignments and a final project. Students will complete four programming projects that express algorithms in selected parallel programming models and measure their performance. For the final project, teams of 2-3 students will implement codes that combine multiple programming models.
Prerequisites: CS 4400, or concurrent enrollment
35% | Programming projects (P1, P2, P3, P4) | |
20% | Written homeworks | |
5% | Participation | |
25% | Quiz and Final | |
15% | Final project |
An Introduction to Parallel Programming by Peter Pacheco (ISBN: 978-0-12-374260-5). |
Date | Topics | Read | Assign | Notes |
21 Aug |
Introduction (ppt) (pdf)
Importance of parallel programming |
Chapter 1 | - | - |
23 Aug |
Introduction to parallel algorithms and correctness (ppt) (pdf)
Concerns for parallelism correctness and performance |
Chapter 1 | HW01 | Pthread sum code versions |
28 Aug |
Parallel Computing Platforms, Memory Systems and Models of Execution (ppt) (pdf)
A diversity of parallel architectures, taxonomy, and examples |
Chapter 2, 2.1-2.3, pgs. 15-46 | - | - |
30 Aug |
Memory Systems and Introduction to Shared Memory Programming (ppt) (pdf)
Deeper understanding of memory systems and getting ready for programming |
Ch. 2.4-2.4.3 (pgs. 47-52), 4.1-4.2 (pgs. 151-159), 5.1 (pgs. 209-215) | HW02 | - |
04 Sep |
Data Parallelism in OpenMP (ppt) (pdf)
Introduction to OpenMP and Parallel Loops |
Chapter 5.2-5.7, 5.10 (pgs. 216-241, 256-258) | - | Sun Ultrasparc T2 http://www.youtube.com/watch?v=2pFOivcJ74g&feature=relmfu |
06 Sep |
Data Dependences (ppt) (pdf)
Code restructuring techniques: permutation and tiling |
- | - | Dep Notes OpenMP distributions |
11 Sep | Data parallel algorithms | - | - | Watch videos: Implementing Domain Decompositions in OpenMP; Dense Linear Algebra I |
13 Sep |
Data Locality (ppt) (pdf)
Code restructuring techniques: tiling, unroll-and-jam and scalar replacement |
- | - | Locality Notes |
18 Sep |
Singular Value Decomposition (ppt) (pdf)
Algorithm Description for SVD |
- | - | - |
20 Sep |
More Locality and Data Parallelism (ppt) (pdf)
Tiling example and Red-Blue example |
- | P02 | Locality versions |
25 Sep |
Data Parallelism (ppt) (pdf)
Finish Red-Blue example, Stencils |
- | - | Stencil notes |
27 Sep |
Lab Day
Go over programming assignment |
- | - | - |
02 Oct |
Breaking Dependences, and Introduction to Task Parallelism (ppt) (pdf)
Parallel sections, Producer-consumer parallelism |
- | - | - |
04 Oct |
More Task Parallelism (ppt) (pdf)
OpenMP sections and tasks |
Chapter 5.8 (pgs. 241-450) | - | - |
16 Oct |
Midterm Review
- |
- | - | Task-parallel versions |
18 Oct |
-
Midterm |
- | - | - |
23 Oct |
Introduction to Message Passing (ppt) (pdf)
What is MPI? Complexities of a distributed address space |
Chapter 3.1-3.2, 3.4, pgs. 83-96, 101-106 | - | - |
25 Oct |
MPI Communication (ppt) (pdf)
Non-blocking communication, One-sided communication |
- | P03 | - |
30 Oct |
Putting it Together: N-Body (ppt) (pdf)
N-body |
Chapter 6.1 | - | - |
01 Nov |
Introduction to GPUs and CUDA (ppt) (pdf)
Architecture and programming constructs |
(Optional) CUDA Programming Guide | - | DDJ article |
06 Nov |
CUDA, cont. (ppt) (pdf)
SIMT execution model, divergent branches, memory hierarchy |
- | P04 | - |
08 Nov |
CUDA, cont. (ppt) (pdf)
More memory hierarchy and examples |
- | - | - |
13 Nov |
SIMD multimedia extensions (ppt) (pdf) (video)
Multimedia extension architectures, performance issues |
- | - | - |
15 Nov |
Sparse Algorithms (ppt) (pdf)
Sparse Graphs and Matrices |
Chapter 3.3, 3.5-3.7 (pgs. 106-136) | - | - |
20 Nov |
Parallel Graph Algorithms (ppt) (pdf)
Tree Search, Traveling Salesperson Problem |
Chapter 6.2 | - | - |
27 Nov |
SSE SIMD review
- |
- | - | Examples, compile with "icc -O3 -msse3 -vec-report=3" |
29 Nov |
Course Retrospective and Future Directions for Parallel Computing (ppt) (pdf)
Where the field is going |
Chapter 7 | - | - |
04 Dec |
Project Presentations, Dry Run
- |
- | - | - |
06 Dec |
Project Presentations Poster Session
- |
- | - | - |
11 Dec |
-
- |
- | - | - |
13 Dec |
Final exam
- |
- | - | - |