Models of Computation for Massive Data
Instructor : Jeff Phillips
Fall 2013 | Wednesday, (sometimes Fridays) 4:35 pm - 5:45 pm
MEB 3147
Catalog number: CS 7931 001
Description:
This course will explore advanced models of computation pertinent for processing massive data sets.
As data sets grow to terabyte and petabyte scales, traditional models and paradigms of sequential computation become obsolete.
Different efficiency trade-offs analyzing memory usage, I/O calls, or inter-node communication become the dominant bottlenecks.
These paradigms are formalized as I/O-Efficient, Parallel, Streaming, GPU-based, Map-Reduce, and other distributed algorithmic models of computation.
This course will study the history and specifics of these models.
Students in the class will learn the proper settings in which to use these paradigms, the advantages and disadvantages of each model, and how to analyze algorithms with these settings.
Students in the class are lucky to use resources from EmuLab and Amazon Web Services.
Project Sechedule:
Here is the schedule for the project.
Schedule:
1 credit or 3 credit: The class can be taken for 1 credit or 3 credits.
If you take the class as 1 credit, then I expect you to attend all (or almost all) of the lectures. If I feel attendence is lagging, I may give pop quizes. Alternatively, I may require that 1 credit students either present a lecture or scribe lecture notes.
If you take the class for 3 credits, you must also do a project. The project will be quite independent, but I will try to provide feedback when asked for. It will also require a short presentation at the end of the semester.
To take the class for 3 credits, the project must be approved by the instructor!
Grading: As this will be taught as an advanced seminar, grading will be fairly forgiving. Those making an effort to do well, will recieve high marks.
Scribe Assignments and Video Assignments.
Project: (Only required for 3-credit students.)
A major component of this class will be a project where you will investigate in-depth a focused topic in a particular model or the relation between several models. Details.
FAQ: Q: Will there be a book?
A: No. Most information is available online and I will post links on the (under-construction) webpage.
Q: I took undergraduate algorithms, but not graduate algorithms. Will I be lost in the analysis?
A: My intention is "no." I expect you to understand big-O notation and how to analyze basic algorithms in it. But this class will not be as intense or extensive in its analysis as CS 6150 (Advanced Algorithms). We will apply big-O analysis to analyze aspects other than just run-time, and the goal of this course will be to teach you how and when to do that.