StuBS
A2: Paging in StuBSmI

The goal of this exercise is to extend StuBSmI with basic paging functionality, to isolate application processes from each other and to separate kernel and user code from each other. For this exercise, auxiliary files are provided in the template (tools/{init,imgbuilder}.cc).

In order to keep the initialization of paging simple, we design StuBSmI to be a "lower-half" kernel, i.e. the virtual addresses from 0x0 up to (e.g.) 32 MiB belong to the kernel and correspond to the physical addresses. The address range of the applications thus begins virtually directly where the kernel area ends. However, there we do not enforce an identity mapping between virtual and physical addresses.

Physical memory

First, we have to discover and manage the physical memory that is installed in our machine. We will need pages from that memory pool later on to create the data structures for paging.

For this purpose, the boot loader provides the operating system with a list of all the available memory areas that are not occupied by devices or buses. This information is passed in a documented format via the multi-boot information. Here, the mmap_addr and mmap_length fields are of special interest, as they provide us with an unsorted list of physical memory regions that are either available (Multiboot::Memory::isAvailable()) or not. For our physical-memory pool, we are only interested in those regions that are surely available.

Note that memory areas already occupied by the kernel and the initial ramdisk are not automatically excluded from this list, but must be filtered out explicitly. For this, you can use the symbols ___KERNEL_START___ and ___KERNEL_END___ (see compiler/sections.ld). Furthermore, we should avoid the area below 1 MiB as it may include many memory-mapped legacy devices. In addition, on some systems memory for ISA devices is mapped in between 0x00F00000 and 0x00FFFFFF (ISA memory hole). OSdev.org has a detailed discussion of the x86 Memory Map. This memory should therefore also be filtered out.
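The filtering boils down to plain range arithmetic: subtract each reserved range from every candidate free region. The following userspace sketch illustrates this; the Region type and function names are made up, and the kernel bounds are hard-coded stand-ins for ___KERNEL_START___/___KERNEL_END___:

```cpp
#include <cstdint>
#include <cassert>
#include <vector>
#include <algorithm>

struct Region { uint64_t start, end; };   // half-open interval [start, end)

// Clip one candidate free region against a reserved range and collect the
// surviving sub-regions (zero, one, or two pieces).
static std::vector<Region> subtract(Region free, Region reserved) {
    std::vector<Region> out;
    if (free.start < reserved.start)   // piece below the reserved range
        out.push_back({free.start, std::min(free.end, reserved.start)});
    if (free.end > reserved.end)       // piece above the reserved range
        out.push_back({std::max(free.start, reserved.end), free.end});
    return out;
}

// Filter a free region against all ranges named in the text: everything
// below 1 MiB, the ISA memory hole, and the kernel image (the initrd would
// be handled the same way).
static std::vector<Region> filterRegion(Region free,
                                        uint64_t kernelStart, uint64_t kernelEnd) {
    std::vector<Region> work{free};
    const Region reserved[] = {
        {0x00000000, 0x00100000},   // legacy area below 1 MiB
        {0x00F00000, 0x01000000},   // ISA memory hole
        {kernelStart, kernelEnd},   // kernel image
    };
    for (const Region &r : reserved) {
        std::vector<Region> next;
        for (const Region &f : work)
            for (const Region &part : subtract(f, r))
                next.push_back(part);
        work = std::move(next);
    }
    return work;
}
```

Because the intervals are half-open, adjacent reserved ranges never produce empty leftover pieces at their shared boundary.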

Attention
The ranges specified by Multiboot can overlap and be contradictory. In case of doubt, non-free (reserved) takes precedence.

Since parsing the Multiboot structure is somewhat idiosyncratic, we have already provided you with appropriate methods in the Multiboot namespace.

It is recommended to implement the physical-memory pool as a free memory bitmap.
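Such a bitmap could look like the following sketch, assuming 4 KiB pages and at most 4 GiB of physical memory; the class and method names are made up, and everything starts out "used" until the available regions are marked free:

```cpp
#include <cstdint>
#include <cstddef>
#include <cassert>

// One bit per 4 KiB page frame: 1 = free, 0 = used.
class PageBitmap {
    static const size_t FRAMES = 1u << 20;   // 2^32 / 2^12 page frames
    uint32_t bits[FRAMES / 32] = {};         // all zero: everything "used"

 public:
    void markFree(uintptr_t addr) { bits[frame(addr) / 32] |=  bit(addr); }
    void markUsed(uintptr_t addr) { bits[frame(addr) / 32] &= ~bit(addr); }
    bool isFree(uintptr_t addr) const { return bits[frame(addr) / 32] & bit(addr); }

    // Linear scan for the first free frame; returns its physical address,
    // or 0 on failure (frame 0 is never marked free anyway).
    uintptr_t allocate() {
        for (size_t i = 0; i < FRAMES / 32; i++) {
            if (bits[i] == 0) continue;      // skip 32 used frames at once
            for (unsigned b = 0; b < 32; b++) {
                if (bits[i] & (1u << b)) {
                    bits[i] &= ~(1u << b);
                    return ((uintptr_t)i * 32 + b) << 12;
                }
            }
        }
        return 0;
    }

 private:
    static size_t frame(uintptr_t addr) { return addr >> 12; }
    static uint32_t bit(uintptr_t addr) { return 1u << (frame(addr) % 32); }
};
```

The whole bitmap occupies only 128 KiB, so it can live in a static kernel object without touching the pool it manages.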

Alternatively, you can use two linked lists of free pages. However, for this the link element cannot be located within the page itself; instead, you should use a large array with one pointer for each physical page (like Linux's struct page):

struct page {
    struct page *next_free_page;
};

struct page pages[1 << (32 - 12)];
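With such an array, pushing and popping frames is constant-time pointer juggling. The following sketch shows one free list (the exercise would use two, one per pool); the helper names are made up:

```cpp
#include <cstddef>
#include <cassert>

// The link lives in a static `pages` array (one entry per physical frame),
// not in the frame itself, so even unmapped frames can be managed.
struct page {
    struct page *next_free_page;
};

static struct page pages[1 << (32 - 12)];   // one entry per 4 KiB frame
static struct page *free_list = nullptr;

// Frame number <-> array index; the physical address is frame << 12.
static void freeFrame(size_t frame) {
    pages[frame].next_free_page = free_list;
    free_list = &pages[frame];
}

static size_t allocFrame() {   // returns a frame number, or 0 if the list is empty
    if (free_list == nullptr) return 0;
    struct page *p = free_list;
    free_list = p->next_free_page;
    return (size_t)(p - pages);
}
```

Note the LIFO behavior: the most recently freed frame is handed out first, which is cache-friendly in the kernel.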

Page-Table Tree (Preliminary)

In order to enable paging, we have to create the configuration for the memory-management unit (MMU) that describes the mapping from virtual to physical addresses. To prepare this, you should have a page allocator that manages two pools of physical memory:

  • Free physical pages above the user–kernel split (> 32 MiB)
  • Free physical pages below the user–kernel split (< 32 MiB)
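Routing a discovered frame into one of the two pools is a simple address comparison against the split; the names here are made up:

```cpp
#include <cstdint>
#include <cassert>

// The user--kernel split at 32 MiB, as assumed in this exercise.
static const uintptr_t KERNEL_LIMIT = 32u << 20;

enum Pool { KERNEL_POOL, USER_POOL };

// Decide which pool a physical frame belongs to.
static Pool poolOf(uintptr_t physAddr) {
    return physAddr < KERNEL_LIMIT ? KERNEL_POOL : USER_POOL;
}
```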

As a first step, we only establish the identity mapping below 32 MiB, which allows us to enable paging. For this, you have to initialize one page directory and eight page tables to cover the first 32 MiB of the address space.
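The index arithmetic behind this can be sketched as follows, assuming classic 32-bit paging without PAE (4 KiB pages, 1024 entries per table, so one page table covers 4 MiB and eight cover 32 MiB). In this userspace sketch the tables live in ordinary arrays instead of allocated frames, and the directory entries pretend the arrays are identity-mapped:

```cpp
#include <cstdint>
#include <cassert>

// Flags as in the Intel manual, chapter 4: bit 0 = present, bit 1 = writable.
static const uint32_t PRESENT  = 1u << 0;
static const uint32_t WRITABLE = 1u << 1;

alignas(4096) static uint32_t pageDirectory[1024] = {};
alignas(4096) static uint32_t pageTables[8][1024] = {};

// Index helpers from the 32-bit paging scheme.
static unsigned pdIndex(uint32_t virt) { return virt >> 22; }
static unsigned ptIndex(uint32_t virt) { return (virt >> 12) & 0x3FF; }

static void setupIdentityMapping() {
    for (unsigned pd = 0; pd < 8; pd++) {
        // In the kernel, the *physical* address of the table goes here; this
        // sketch just truncates the array's address for illustration.
        pageDirectory[pd] = (uint32_t)(uintptr_t)pageTables[pd] | PRESENT | WRITABLE;
        for (unsigned pt = 0; pt < 1024; pt++) {
            uint32_t frame = (pd * 1024 + pt) << 12;   // identity: virt == phys
            pageTables[pd][pt] = frame | PRESENT | WRITABLE;
        }
    }
    pageTables[0][0] = 0;   // leave the first page unmapped to catch null derefs
}
```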

Detailed information about paging can be found in the Intel manual in chapter 4. It makes sense not to map the first page (from address 0x0) permanently (to provoke a page fault when accessed). You should also pay attention to the memory-mapped device memory of the IOAPIC and the LAPIC. In this task it is not (yet) necessary to implement a handler for page faults. However, one can be useful for debugging.

Separation of kernel and user code

In order to be able to isolate kernel and applications from each other, StuBSmI must be compiled separately from the applications. The build system must be adapted accordingly. Additionally, a library "libsys" should be created, which contains the syscall stubs for the applications. With the help of this library, each application should be compiled by itself, without linking directly against the kernel or #include-ing parts of it. In order to call the constructors of global objects in the application, you should link the supplied init.cc into each application.

For the linking, you require a linker script that describes the final structure of the executable file. This script should, among other things, define the starting address of the user space (32 MiB). Start by adapting the kernel's linker script (compiler/sections.ld).

With the objcopy tool, you can generate so-called "flat" binaries from the application ELFs. A flat binary is a complete memory image that can be loaded (without relocation or parsing) at the fixed start address of the user-space virtual memory. The BSS segment should not be forgotten (--set-section-flags .bss=alloc,load,contents).

Multiboot-compliant boot loaders like GNU GRUB, and also QEMU, support the loading of a so-called initial RAM disk (initrd) in addition to the kernel image. The applications are all to be packed into one initrd, which we prefix with a header containing information about the number and size of the following application binaries. The supplied tool imgbuilder.cc takes over this task; interpreting the initrd at runtime must then be done by you. The memory location and size of the loaded initrd are provided in the Multiboot information.

The format of the created meta-data looks similar to this:

struct InitrdHeader {
    uint32_t numApplications;
    uint32_t applicationSize[1023];
};
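Interpreting this header at runtime boils down to summing sizes. The following sketch assumes the binaries follow the header back to back without padding (whether imgbuilder aligns them is defined by your format); the helper name is made up:

```cpp
#include <cstdint>
#include <cstddef>
#include <cassert>

// Header layout as described above.
struct InitrdHeader {
    uint32_t numApplications;
    uint32_t applicationSize[1023];
};

// Returns the offset (in bytes, from the start of the initrd) of application
// `index`, or 0 on an invalid index.
static size_t applicationOffset(const InitrdHeader *hdr, uint32_t index) {
    if (index >= hdr->numApplications) return 0;
    size_t offset = sizeof(InitrdHeader);   // binaries start after the header
    for (uint32_t i = 0; i < index; i++)
        offset += hdr->applicationSize[i];
    return offset;
}
```

In the kernel, the base pointer of the initrd comes from the Multiboot information, and the binary of application i then starts at base + applicationOffset(hdr, i).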

Page-Table Trees (Final)

At this point, you should have one page-table tree that covers the kernel's address space and have successfully enabled paging. For loading the applications, you have to create one page-table tree per application in the initial ramdisk. Again, you have to allocate and initialize a page directory and one or more page tables for this. For the kernel memory, you can and should share the page tables.

When populating the user's virtual address space, you must do the following:

  • Copy or move the pages from the initial ramdisk to the beginning of the user space (32 MiB)
  • The user-space stack should be readable and writable and is well placed at the end of the 4 GiB address space.
  • If the dispatcher switches to another process (Dispatcher::go, Dispatcher::dispatch), we have to activate the mapping of the next process.
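The address-space switch in the last step can be modeled like this: on x86, the page directory is activated by writing its physical address to CR3 (typically via a short piece of inline assembly). In this sketch the CR3 write is replaced by a recording stub so the logic can be followed outside the kernel; all names are assumptions:

```cpp
#include <cstdint>
#include <cassert>

static uintptr_t current_cr3 = 0;
static void load_cr3(uintptr_t pd) { current_cr3 = pd; }   // stand-in for `mov cr3`

struct Process {
    uintptr_t pageDirectory;   // physical address of this process's directory
    // ... registers, stack pointer, etc.
};

static void dispatch(Process *next) {
    // Only reload CR3 if the address space actually changes: a CR3 write
    // flushes the (non-global) TLB entries, which is expensive.
    if (current_cr3 != next->pageDirectory)
        load_cr3(next->pageDirectory);
    // ... then perform the usual context switch
}
```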

If everything works, processes can neither access the memory areas of other processes nor the kernel's address space. Syscalls and interrupts should work again, and the kernel can access the memory of the currently running process without any problems.

Memory Management System Calls

In order to pass the possibilities of the memory management on to the user programs, two more system calls are to be implemented.

void* map(size_t size)
void exit()

The map system call maps enough pages to cover size bytes into the current address space. As you are not supposed to implement an unmap() operation, you can use a simple per-process bump-pointer allocator for the virtual address space. The memory should initially be zeroed out. If the allocation fails, map should return a meaningful error code.
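A bump-pointer scheme behind map could look like the following sketch: virtual addresses are handed out linearly and never reused. The heap bounds and all names are assumptions, and the physical-frame work is only indicated by a comment:

```cpp
#include <cstdint>
#include <cstddef>
#include <cassert>

static const uintptr_t USER_HEAP_START = 64u << 20;   // e.g. above code + data
static const uintptr_t USER_HEAP_END   = 0xC0000000;  // e.g. below the stack
static const size_t    PAGE_SIZE       = 4096;

struct AddressSpace {
    uintptr_t bump = USER_HEAP_START;   // next free virtual address
};

// Reserve enough whole pages for `size` bytes; returns the virtual base
// address, or 0 (the error code) on a zero-sized or exhausted request.
static uintptr_t mapPages(AddressSpace &as, size_t size) {
    size_t rounded = (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
    if (rounded == 0 || rounded > USER_HEAP_END - as.bump)
        return 0;
    uintptr_t base = as.bump;
    as.bump += rounded;
    // Here the kernel would allocate physical frames, zero them, and enter
    // them into the process's page tables at [base, base + rounded).
    return base;
}
```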

The exit system call terminates the current process and releases all associated resources. This system call shall in no case return to the application. For the validation of the memory release, the state of the free memory management shall be printed. After all applications have terminated, the same number of kernel and user pages should be available in our allocators as were available before we created the first application.