Project #3: Logging and Recovery

Project Overview

The third programming project will teach you how to implement write-ahead logging, checkpointing, and recovery in a disk-based key-value store. The primary goal of this assignment is to become familiar with the low-level implementation details of write-optimized key-value stores and to learn how to implement write-ahead logging (WAL) and recovery to bring the index back to a consistent state after a crash.

All the code in this programming assignment must be written in C++. If you have not used C++ before, here's a short tutorial on the language. Even if you are familiar with C++, go over this guide for additional information on writing code in the system. Here are some resources to learn about write-ahead logging and key-value stores.

This is a group project that will be completed in groups of two or three students. The student groups will be based on your responses for the Project 2 groups. If you plan to diverge from the already specified group, please first seek permission from the instructor.

This project will have three deadlines. The first milestone requires you to submit a design document.
Implementation Details

In this assignment, you will need to add the following functionality to the key-value store:
In this assignment, you will need to modify the following existing files:
You can locally modify the
You will also need to write a report and submit it with the final source code. There are four steps to completing this project:
We will deal with a single-threaded version of the key-value store. The index involved in this project does not currently support transactions, so each operation acts as an individual entity. However, you will need to think about dependencies among nodes during a split/merge operation in the tree and ensure that the tree can be recovered to a consistent state after a crash.

Step #1 - Design the logging/recovery functionality

The first step is to understand how the key-value store works and then write a design document on how you would add the logging and recovery functionality. You should first build the

    make
    mkdir tmpdir
    ./test -m benchmark-upserts -d tmpdir

The test benchmark has a help mode that explains the various arguments required to run it. To properly understand the key-value store, please read the README file and the comments at the top of the source files. You can also read this paper to understand the internals of the Bε-tree. The Bε-tree is at the heart of this index. The design document should include the following items:
Step #2 - Implement write-ahead logging (WAL)

This part requires you to implement write-ahead logging. As part of the logging functionality, you will need to create a file-backed logger to which you append update operations before inserting them into the key-value store. Changes are first recorded in the log, which must be written to stable storage before the changes are written to the database. Every operation that modifies the key-value store state has to be logged on disk before the contents of the associated nodes in the tree can be modified. After the changes are appended, the log file needs to be persisted to disk. The system can only acknowledge the user after the changes are persisted to disk. The logger can persist the file after every update operation or after a fixed number of changes. The log persist granularity can affect the performance of the index. To implement the log file, you can write your own file-handling code or use the backing_store API already provided in the source code. Here's a short tutorial on file handling in C++.

Step #3 - Implement checkpointing

This part requires you to implement the checkpoint operation using the log file. A "checkpoint" operation transfers the changes in the write-ahead log file into the key-value store. Once the changes from the log file are inserted into the key-value store and the corresponding nodes are written back to disk, you need to purge those log entries. The checkpointing operation needs to be performed at regular intervals. Depending on the granularity of the checkpointing operation, the user might get stale results for some duration when they query the key-value store. Similar to the log persist granularity, the checkpointing granularity can affect the staleness guarantees of the index.

Step #4 - Implement recovery

This part requires you to implement recovery after a crash.
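Recovery can only replay what the logger from Step #2 durably wrote, so it may help to pin down the append-then-persist path first. The following is a minimal sketch under stated assumptions, not the project's logger: the class name `WalLogger`, the text record layout (`lsn op key length value`), the `uint64_t` key and string value types, and the use of POSIX `open`/`write`/`fsync` instead of the provided backing_store API are all hypothetical choices made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <fcntl.h>   // POSIX open
#include <unistd.h>  // POSIX write, fsync, close

// Hypothetical file-backed logger. The class name, record layout, and
// key/value types are assumptions for illustration, not the project's API.
class WalLogger {
public:
  // persist_every: number of appended records between fsync calls.
  WalLogger(const std::string& path, unsigned persist_every)
      : persist_every_(persist_every) {
    assert(persist_every > 0);
    fd_ = ::open(path.c_str(), O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd_ < 0) throw std::runtime_error("cannot open log file");
  }
  ~WalLogger() { ::fsync(fd_); ::close(fd_); }
  WalLogger(const WalLogger&) = delete;
  WalLogger& operator=(const WalLogger&) = delete;

  // Appends one record and returns its log sequence number (LSN).
  // The caller may modify tree nodes or acknowledge the user only once
  // persisted_lsn() has reached the returned LSN.
  std::uint64_t append(char op, std::uint64_t key, const std::string& value) {
    const std::uint64_t lsn = next_lsn_++;
    // The length prefix lets a reader detect a record cut short by a crash.
    const std::string rec = std::to_string(lsn) + ' ' + op + ' ' +
                            std::to_string(key) + ' ' +
                            std::to_string(value.size()) + ' ' + value + '\n';
    const char* p = rec.data();
    std::size_t left = rec.size();
    while (left > 0) {  // write() may write fewer bytes than requested
      const ssize_t n = ::write(fd_, p, left);
      if (n < 0) throw std::runtime_error("log write failed");
      p += n;
      left -= static_cast<std::size_t>(n);
    }
    if (++unpersisted_ >= persist_every_) persist();
    return lsn;
  }

  // Forces every record appended so far to stable storage.
  void persist() {
    if (::fsync(fd_) != 0) throw std::runtime_error("fsync failed");
    unpersisted_ = 0;
    persisted_lsn_ = next_lsn_ - 1;
  }

  std::uint64_t persisted_lsn() const { return persisted_lsn_; }

private:
  int fd_ = -1;
  unsigned persist_every_;
  unsigned unpersisted_ = 0;
  std::uint64_t next_lsn_ = 1;  // a real logger would resume from the existing log
  std::uint64_t persisted_lsn_ = 0;
};
```

With `persist_every` set to 1 every update pays for an `fsync`; larger values batch that cost, which is the persist-granularity trade-off described in Step #2. In this sketch the LSN counter restarts at 1 for each new logger object, so a real implementation would need to restore it from the log during recovery.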
Recovery is the first thing that is called when the system comes back up after a crash. The recovery logic checks the log file to determine which changes have not yet been committed to the key-value store. This can be done by looking at the size of the log file on disk and the last checkpoint index. The recovery function must replay the remaining changes from the log and update the checkpoint index. Recovery must also include a function in the key-value store that reconstructs the tree by reading nodes from disk after a crash. This can be done using the serialization/deserialization methods already implemented in the data structure.

Tips

If you're not sure how to start, try splitting the work into smaller objectives:
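One objective that can be prototyped in isolation is the replay loop from Step #4. The sketch below is an illustration under assumptions rather than the required design: `replay_log`, the `Store` alias (a `std::map` standing in for the real tree), the op codes `U` (upsert) and `D` (delete), and the text record layout `lsn op key length value` are all hypothetical. It skips records at or below the last checkpoint index and stops at the first incomplete record, which is how a write cut short by a crash would appear.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <map>
#include <string>

// Stand-in for the real key-value store; the project's tree has its own interface.
using Store = std::map<std::uint64_t, std::string>;

// Replays every complete log record whose LSN is greater than checkpoint_lsn.
// Returns the LSN of the last record applied, or checkpoint_lsn if none was.
std::uint64_t replay_log(const std::string& log_path,
                         std::uint64_t checkpoint_lsn, Store& store) {
  assert(!log_path.empty());
  std::ifstream in(log_path, std::ios::binary);
  std::uint64_t last = checkpoint_lsn;
  while (true) {
    std::uint64_t lsn = 0, key = 0;
    char op = 0;
    std::size_t len = 0;
    if (!(in >> lsn >> op >> key >> len)) break;  // end of log or torn header
    if (in.get() != ' ') break;                   // separator before the value
    std::string value(len, '\0');
    in.read(&value[0], static_cast<std::streamsize>(len));
    if (static_cast<std::size_t>(in.gcount()) != len) break;  // torn value
    if (in.get() != '\n') break;                  // missing record terminator
    if (lsn <= checkpoint_lsn) continue;          // already in the on-disk nodes
    if (op == 'U') {
      store[key] = value;
    } else if (op == 'D') {
      store.erase(key);
    }
    last = lsn;
  }
  return last;
}
```

The returned LSN could serve as the new checkpoint index once the affected nodes have been written back to disk. This sketch trusts the length field in each record; a sturdier format might add a per-record checksum so that a corrupted length is detected rather than followed.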
Instructions

You can download the Project #3 source code (as an archive file) from Canvas. It is uploaded under Files. You can extract the source code using the following command:

    unzip project3.zip

To debug any correctness issues, you can compile the -O3 flag when doing benchmarking.
You will use the CADE cluster to complete this project. CADE manages clusters that you can use to do your development and testing for all of the class projects. You are free to use other machines and environments, but all grading will be done on these machines. Please test your solutions on these machines.
Check with CADE if you need to set up an account. CADE machines all share your home directory, so you needn't log in to the same machine each time to continue working.
After you have an account, choose a machine at random from the lab1- set of machines on the lab status page, for example:

    ssh lab1-10.eng.utah.edu

CADE user accounts have tcsh set as their default shell. Each time you log in, first run bash before anything else. All instructions, examples, and scripts from this class assume you are using bash as your shell. You'll need to do this each time unless you reset your default shell (which I'd recommend). Perhaps savvy users can provide slick setups. This step is important. If you don't reset your shell, other things will mysteriously break as you try to work through the labs. Essential software is installed on all CADE lab1 machines.

Submission

You need to submit a You should also include a report.pdf is included separately and not a part of the .zip file.
We will evaluate the correctness and the performance of your implementation offline after the project due date.

Collaboration Policy