Homework 4: Page memory allocator and an address space

This assignment will teach you to implement a simple page memory allocator and how to construct a simple address space. This assignments builds on top of your previous HW3 submission (Boot Into C), i.e., you will extend the code of HW3 to implement this additional functionality for HW4. If you don't have a working HW3 submission talk to us.

Technically, you can do this assignment on any operating system that supports the Unix API and can run Qemu (CADE machines, your laptop that runs Linux or Linux VM, and even MacOS, etc.). You don't need to set up xv6 for this assignment, but if you're running on CADE you'll have to install QEMU, see QEMU setup instructions. Submit your programs and the shell through Gradescope (see instructions at the bottom of this page).

NOTE: YOU CANNOT PUBLICLY RELEASE SOLUTIONS TO THIS HOMEWORK. It's ok to show your work to your future employer as a private Git repo, however any public release is prohibited.

Overview

At a high level our goal is to learn how to construct an address space, i.e., a page table that we can use to load a program in memory. Don't forget, our ultimate goal is to execute programs. (Not surprisingly, loading and executing a program in memory will be our next homework, but today the things are simple, we only construct an address space). To construct the address space we need two building blocks: 1) a memory allocator that keeps track of available physical pages, and 2) a function that allows us to map a physical page at a specific virtual address. Below, we learn how to build the two.

Page memory allocator

Our first task is to implement a simple page memory allocator. At a high level your assignment is to implement two functions: page_alloc() and page_free(int pa). The int page_alloc() function should return a physical address of a free page as integer. Similarly, page_free(int pa) takes a physical address of the page and adds it back to the pool of free physical pages.

Now, we need a way to keep track of available physical pages. We will implement this data structure as an array (each element of the array represents one physical page) linked with a list so we can quickly allocate the next available page. Specifically, we will use the following data structure to keep track of physical pages in the system:

#define MAX_PAGES 8
#define NIL 0xFFFFFFFF

struct page {
    unsigned int next;
};

struct free_pages {
    unsigned int head;
    struct page pages[MAX_PAGES];
};

unsigned int page_alloc(); 
void page_free(unsinged int pa); 
Here the pages array tracks the state of every page in the system. Each element of the array corresponds to a single page. Page with physical page number (ppn) of 0 (address 0x0), is element 0 of the array, ppn 1 is element 1, etc. The next field links the free pages as illustrated below (the example for MAX_PAGES of 8). The head points to the 5th element of the array (this means that page with ppn 5 is free, and the next page on the list is page with ppn 1, and the last page on the list is page with ppn 3.) Here we use NIL (a constant that identifies an invalid next pointer) to specify the end of the list.
head: 5 --------------+     
              +---+   |     
              |   |   |  
              |   v   v     
           -----------------
           | |3| |N| |1| | |
           -----------------
            0 1 2 3 4 5 6 7 
              ^       |     
	      |       |
              +-------+

The page_alloc() function should return physical address of the page with ppn 5, and will update the list like this:
head: 1 -+     
         |    +---+        
         |    |   |     
         |    |   v        
         | -----------------
         | | |3| |N| | | | |
         | -----------------
         |  0 1 2 3 4 5 6 7 
         |    ^
         |    |            
         +----+
The page_free(unsigned int pa) function takes the physical address of the page, and adds this page back to the free list. For example, if you invoke page_free(0x7000) the list changes like:
head: 7 ------------------+     
              +---+       | 
              |   |       |
              |   v       v 
           -----------------
           | |3| |N| | | |1|
           -----------------
            0 1 2 3 4 5 6 7 
              ^           |
              |           | 
              +-----------+
You assignment is to implement this linked list to support 32MB of physical memory. Similar to xv6 you should add the end symbol to your kernel (remember, you have to change the linker script to export this symbol, and an extern declaration in C), add pages from the kernel end to the 32MB to your allocator. To add the end symbol to your linker script, use something like
PROVIDE(end = .);
Similar to kernel.ld in xv6.

Mapping a page

Having a memory allocator working, we are ready to construct the address space (i.e., a page table) with our new mapping function:

int mmap(pde_t *pgdir, unsigned int va, unsigned int pa, unsigned int flag)
This function takes a pointer to the page table directory and maps the page with the physical address pa to the virtual address va. If the inner levels of the page table are not present you should allocate and add them to the page table. Your assignment is to implement the mmap() function.

Test

First, download pagealloc.c. Add to your hw3. All your code for this hw should be in this file. The Makefiile will automatically pick up all .c files and compile them.

Second, download test.c. This contains definition of test method that will call your implementation of mmap() and page_free() method to create a new mapping from 0 - 32 MB. Then modify your main() function to call test() after setting the page table and before invocation of halt().

Hints

To finish the homework you can follow the code of xv6 - kalloc.c (in fact it's allowed to copy and paste code from xv6 verbatim if it helps you).

FAQ

What is the point of setting up a page allocator if we can set up paging like in hw3?

Page allocator represents amount of physical memory in your system. This abstraction will allow us to do other things in the next homework, such as allocate pages to load an executable in memory.

Why don't we use other approach to setting up page allocator, such as a bitmap?

You can. But it is slow. In a 32bit system, you will need to keep track of 2^20 = 1,048,576 number of pages. The allocation can get very slow due to a long scan of the bitmap. The linked list is always O(1) and the constant is small.

What should my page_alloc return?

A physical address of the page, i.e., it has to be PGSIZE or 4096 aligned.

Should I call page_alloc() in my mmap()?

Yes, remember your pagetable is a tree. Even if the top level of the tree is allocated, the inner levels (page tables) will need some space, i.e., a physical page that you should get from mmap().

What is purpose of end symbol?

The end symbol helps us to initialize your allocator correctly. Specifically, the end symbol marks the end of the kernel binary, i.e., its text, data, bss and potentially other sections, in memory. We don't want to add physical pages in which the kernel is allocated to the allocator and reuse them for something else.

I don't have a working HW3

If you don't have a working HW3 (Boot Into C) homework, talk to us. We will either debug your solution, or give you ours.

Submit your work

Submit your solution through Gradescope Gradescope CS5460/6460 Operating Systems. Please zip all of your files and submit them.. The structure of the zip file should be the following:

/
  - test.c
  - pagealloc.c
  - main.c
  - boot.asm
  - linker.ld
  - multiboot_header.asm
  - boot/grub.cfg
  - ...                             -- any other files required to start
Updated: March, 2024