CS 6530 – Advanced Database Systems
Lectures: TuTh / 10:45AM-12:05PM MT at LNCO 1110
Instructor: Prashant Pandey
Teaching Assistants:
Course Overview
This course is a comprehensive study of the internals of modern database systems and the challenges of indexing and querying large-scale data in the context of continuously evolving hardware. It will cover the core concepts and fundamentals of indexing and hashing data structures, concurrency control, storage, file organization, and query processing. The course will study both in-memory and disk-based database systems and will use examples from modern key-value stores. All class projects will be in the context of real in-memory and disk-based database systems. The course is appropriate for graduate students in software systems and for advanced undergraduates with systems programming skills.
Prerequisites
Unofficial prerequisites: CS 5530 (undergraduate databases) and CS 3505 (software practice in C/C++). You should know C++, or be willing to learn it quickly on your own, for the projects. Here is a good C++ tutorial.
Course Topics
Projects
Paper Reading
There is a set of assigned paper readings for the course. The reading list is designed to provide additional information and insight into current state-of-the-art database systems research. Each student is required to pick five papers from the reading list and turn in a one-paragraph synopsis of each. There will be five deadlines throughout the semester, and students are required to submit one synopsis by each deadline. Late submissions will not be accepted without prior approval from the instructor. Each review must include the following information:
These reading reviews must be your own writing. You may not copy from the papers or from other sources that you find on the web. Plagiarism will not be tolerated.
Useful Resources
Please refer to this brief overview of asymptotic notation: The Asymptotic Cheat Sheet. It will help you follow the theoretical analyses in the course.
Grading
Late submission policy
Collaboration and Plagiarism
Everyone needs to read the SoC Policy on Academic Misconduct. Working with others on assignments is a good way to learn the material, and we encourage it. However, there are limits to the degree of cooperation that we will permit. When working on programming assignments, you must work only with others whose understanding of the material is approximately equal to yours. In this situation, working together to find a good approach for solving a programming problem is cooperation; listening while someone dictates a solution is cheating. You must limit collaboration to a high-level discussion of solution strategies and stop short of actually writing down a group answer.
Anything that you hand in, whether it is a paper report or a computer program, must be written in your own words. If you base your solution on any other written solution, you are cheating. If you collaborate with other students to discuss a problem and then write your own solution, declare up front in the write-up the names of all the students you collaborated with. Never look at another student's code or share your code with any other student. You must not make your code public (on GitHub or by any other means). Using tools like GitHub Copilot or ChatGPT, or copying code from sites like Stack Overflow, also constitutes cheating. Do not write code with Copilot enabled in this course.
We do not distinguish between cheaters who copy others' work and cheaters who allow their work to be copied. If you cheat, you will be given an E in the course and referred to the University Student Behavior Committee. Clearly, any attempt to subvert the ordinary grading process constitutes cheating. If you have any questions about what constitutes cheating, please ask first.