CS6450 Distributed Systems

Final Project

Purpose

We have focused on many key elements of distributed systems in class: replication, consistency, fault-tolerance, transactions, etc. Each of these topics is an area of continuous and active research. The goal of the project is for you to discover some of this work for yourself and to demonstrate proficiency in reading and explaining recent research and its context.

Even so, my hope is that the process isn't too heavyweight.

Below is a list of several recent papers from OS and distributed systems venues. For this project, you will pair up with one other student and choose one paper that corresponds to a "sub-thread" below (e.g. 1A or 2B). You will both read the paper thoroughly. Think of yourself as an advocate for the paper; put yourself in the shoes of the authors and try to put the paper into context and understand what its contributions are. In short, try to get your peers excited about the paper (unless you weren't excited about it; in that case, convince us why we shouldn't be excited about it). In this way, the project differs somewhat from a seminar: you are responsible for explaining your paper well, and your project grade partly depends on it.

This means your reading needs to be a deep look at the paper. You will likely have to read the paper a few times and spend a few hours discussing it with your partner.

By the time you present the paper to the class you should:

  1. Understand the content of the paper to the extent you are capable.
    • If a paper includes complex material like extended proofs in an appendix, you needn't understand all the mechanics of that content. Regardless, if the paper includes a proof or a proof sketch in the main body text, take the time to understand it and its intuition. Consider whether, with the paper in hand, you could explain each part to someone.
    • For some papers this will be easier than others.
    • If you are really struggling with the paper you picked, come see me and we can discuss.
    • As a result, please don't delay in taking a first-pass read of your selected paper in case it doesn't suit your interests.
  2. Put the paper in a larger context in order to help explain the paper's contribution.
    • For example, if you read NOPaxos you may have to take a brief look at the other Paxos variants (Fast Paxos, Spec Paxos) they discuss to get an intuition for where NOPaxos pushes the state of the art and where it doesn't.
    • For some papers this will be easier than others.
  3. Explain the core contribution of the paper. Be sure you can answer these questions:
    • What is new here?
    • What are the benefits of the approach?
  4. Explain the mechanics of the paper and the interesting implementation details (if applicable).
  5. Be able to answer questions about the paper.
    • Each presentation will include time for questions.
    • However, you aren't expected to have all the answers. It's always okay to say "I don't know".

Paired Paper Discussions

Each of the papers below is "paired" with another. Not all the pairings make perfect sense, but I roughly tried to group together papers that are related. This means that another pair of students is reading something that is related to what you are reading.

At least a week in advance of the first presentation, set up a meeting with the group that is reading the other paper in your pair. Describe your papers to one another and compare the key ideas. Coordinate on presentation: for example, the ideas in Thread 4A/B build on one another. Paper 4A sets up an algorithm and a bound; Paper 4B describes applying the approach in a programmable network switch and why switches are the right place for this approach. If there are similar relationships between the two papers in your thread, try to determine: Should one paper be presented before the other? Are some of the ideas or background better explained in one presentation or the other? For example, in Thread 4 the first presentation might describe the cache load balancing algorithm, give its intuition, and show why it holds. The second presentation would then be free to discuss its application in a new context and could focus, for example, on experimental results.

There are no specific requirements for this cross-group discussion, but please record what you learned from the discussion in a small portion of the final report and presentation.

Presentation Requirements

15 minutes with 5 minutes for Questions and Answers

Any presentation format is okay, but creating a slide deck is probably the most straightforward approach. Think carefully about what you want to say; 15 minutes is extremely short. A good rule of thumb is that a slide takes about 2 minutes to explain on average, so there is only time for roughly 7 slides.

Many of these papers have recorded lectures, presentations, and associated slide decks. I highly encourage seeking out this material. It is fine to use what you find online to help you put together your presentation. However, you are responsible for your presentation, so don't simply recycle someone else's slide deck. Conference talks often jump into the middle of a context; consider your audience and try to convey the material well to the students in the class.

Make sure both members of the group get a chance to present.

Final Report Requirements

3 pages total, 11 pt font Times, 2 columns, 1 inch margins, one report per pair of students

The report must demonstrate both group members have read and understood the paper well.

In about 1.5 pages, the report should in your own words: explain the problem the work solves; explain the context of the problem; situate the work with other related work; explain the key aspects of the work. In about another 1.5 pages, the report should explain what makes the work interesting/exciting/promising; explain what limits the work; highlight any future directions that the work seems to put forward.

Reading

Several people have advice on how to read papers; How to Read a Paper is a good example if you are struggling with reading. No particular approach to reading is required/enforced, but each student must read the paper their group chooses. Each student is also expected to contribute to the final report and presentation work.

Expectations, Requirements, and Deadlines

  1. Find a partner for the assignment
    • Deadline: 9/30.
  2. Choose a paper from below and skim it to ensure it is interesting to you.
    • Deadline: 9/30.
  3. One member of the group: email stutsman@cs.utah.edu.
    • Indicate your group members and which paper you are interested in reading.
    • I will record which pairs will read which papers, and send an acknowledgement that you have the paper.
    • Papers will be given out on a first-come, first-served basis.
    • If you want to read something that you don't see on the list, discuss it with me.
    • Deadline: 9/30.
  4. Read and study the paper.
  5. A week before presentations start, spend an hour or two discussing your paper with the group reading the other paper paired with yours.
    • It may help to have some of your presentation materials sketched out to facilitate discussion.
    • Discuss what the overlap is between the papers.
    • Determine if they should be presented in some particular order.
    • Discuss whether any related ideas can be presented in one presentation or the other.
    • Deadline: 10/30.
  6. Prepare a presentation about the paper.
    • Deadline: 11/6.
  7. Present your paper. The presentation must
    • demonstrate both group members have read and understood the paper well;
    • explain the problem the work solves;
    • explain the context of the problem;
    • situate the work with other related work;
    • explain the key aspects of the work;
    • explain what makes the work interesting/exciting/promising;
    • explain what limits the work.
    • Submission: Email presentation materials to stutsman@cs.utah.edu
  8. Write a short 3-page report about your paper.
    • Overall, the report must demonstrate both group members have read and understood the paper well.
    • In about 1.5 pages, the report should
      • explain the problem the work solves;
      • explain the context of the problem;
      • situate the work with other related work;
      • explain the key aspects of the work.
    • In about 1.5 pages, the report should
      • explain what makes the work interesting/exciting/promising;
      • explain what limits the work;
      • highlight any future directions that the work seems to put forward.
    • Deadline: 11/27.
    • Submission: Email the report to stutsman@cs.utah.edu

Topic List

Spreadsheet of Who Has Which Papers - requires UoU login

Thread 1
  Topics: Distributed Shared Memory Transactions Remote Direct Memory Access Kernel Bypass OS Networking Performance
    • 1A: FaRM: Fast Remote Memory, NSDI'14
    • 1B: No compromises: distributed transactions with consistency, availability, and performance, SOSP'15

Thread 2
  Topics: BFT Cryptocurrency
    • 2A: Algorand: Scaling Byzantine Agreements for Cryptocurrencies, SOSP'17
    • 2B (Taken): Zyzzyva: Speculative Byzantine Fault Tolerance, SOSP'07

Thread 3
  Topics: In-network Programming Transactions Consensus
    • 3A: Just Say NO to Paxos Overhead: Replacing Consensus with Network Ordering, OSDI'16
    • 3B: Eris: Coordination-Free Consistent Transactions Using In-Network Concurrency Control, SOSP'17

Thread 4
  Topics: In-network Programming Load Balancing
    • 4A: Small cache, big effect: provable load balancing for randomly partitioned cluster services, SoCC'11
    • 4B (Taken): NetCache: Balancing Key-Value Stores with Fast In-Network Caching, SOSP'17

Thread 5
  Topics: In-network Programming Chain Replication Fault-Tolerance
    • 5A: Object Storage on CRAQ: High-Throughput Chain Replication for Read-Mostly Workloads, ATC'09
    • 5B: NetChain: Scale-Free Sub-RTT Coordination, NSDI'18 (Best Paper)

Thread 6
  Topics: VM Replication SMR Consensus Fault-Tolerance Databases
    • 6A: RemusDB: Transparent High Availability for Database Systems, VLDB'11
    • 6B: PLOVER: Fast, Multi-core Scalable Virtual Machine Fault-tolerance, NSDI'18

Thread 7
  Topics: Kernel Bypass Dataplane Programming Load Balancing Tail Latency
    • 7A: IX: A Protected Dataplane Operating System for High Throughput and Low Latency, OSDI'14 (Best Paper)
    • 7B: ZygOS: Achieving Low Tail Latency for Microsecond-scale Networked Tasks, SOSP'17

Thread 8
  Topics: Replication Fault-Tolerance Consensus SMR
    • 8A: vCorfu: A Cloud-Scale Object Store on a Shared Log, NSDI'17
    • 8B: There is more consensus in Egalitarian parliaments, SOSP'13

Thread 9
  Topics: Kernel Bypass Dataplane Programming Network Function Virtualization
    • 9A (Taken): Stateless Network Functions: Breaking the Tight Coupling of State and Processing, NSDI'17
    • 9B: Elastic Scaling of Stateful Network Functions, NSDI'18

Thread 10
  Topics: Linearizability Transactions Replication Fault-Tolerance
    • 10A (Taken): Building consistent transactions with inconsistent replication, SOSP'15
    • 10B (Taken): Implementing linearizability at large scale and low latency, SOSP'15

Thread 11
  Topics: Kernel Bypass RDMA Concurrency Control Transactions Databases
    • 11A: The end of a myth: distributed transactions can scale, VLDB'17
    • 11B (Taken): On the Design and Scalability of Distributed Shared-Data Databases, SIGMOD'15