StuBS
A1: System Calls in StuBSmI

The goal of this exercise is to extend OOStuBS, known from the Operating Systems Construction lecture, with system calls that separate the privileges of kernel and user space. This is the first step on the way to StuBSmI (Studenten-BetriebsSystem mit Isolation, "student operating system with isolation"). As a starting point, a slightly adapted version of OOStuBS is provided to you.

Moving the application to ring 3

In the first step, the system should be adapted so that the application code always executes on ring 3 and only the handling of interrupts (especially the time-slice scheduling interrupts) takes place on the privileged ring 0. Only in the second step is an interface for system calls introduced, which allows a synchronous entry into the kernel to execute privileged operations.

Extend the Global Descriptor Table

Up to now, OOStuBS uses a Global Descriptor Table (GDT) for operation in Protected Mode, which describes the code and data segments for ring 0 in two entries. For operation on ring 3, two further entries have to be created, which allow the same access from ring 3. Another new entry is the TSS descriptor (Task State Segment), which essentially controls the value to which the stack pointer is set as soon as an interrupt triggers a switch to ring 0. The structures of these descriptors are described in the third volume of the three-part IA-32 Developer's Guide in the sections "Segment Descriptors" (3.4.5) and "Task Management Data Structures" (7.2.2).
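The descriptor layout from the manual can be sketched as a small packing helper. This is only an illustration: the name make_descriptor and the choice of flat 4 GiB segments are assumptions, not part of the provided OOStuBS code. The ring is selected by the DPL field (bits 5 and 6 of the access byte), so the ring-3 entries differ from the ring-0 entries only in that byte.

```cpp
#include <cstdint>

// Hypothetical helper: packs base, 20-bit limit, access byte and flags
// nibble into the 64-bit layout of an IA-32 segment descriptor.
constexpr std::uint64_t make_descriptor(std::uint32_t base, std::uint32_t limit,
                                        std::uint8_t access, std::uint8_t flags) {
    return (static_cast<std::uint64_t>(limit) & 0xFFFFull)                   // limit 15:0
         | ((static_cast<std::uint64_t>(base) & 0xFFFFFFull) << 16)          // base 23:0
         | (static_cast<std::uint64_t>(access) << 40)                        // P, DPL, S, type
         | (((static_cast<std::uint64_t>(limit) >> 16) & 0xFull) << 48)      // limit 19:16
         | ((static_cast<std::uint64_t>(flags) & 0xFull) << 52)              // G, D/B, L, AVL
         | (((static_cast<std::uint64_t>(base) >> 24) & 0xFFull) << 56);     // base 31:24
}

// Access bytes: present, code/data segment, DPL 0 or DPL 3.
constexpr std::uint8_t KERNEL_CODE = 0x9A, KERNEL_DATA = 0x92;
constexpr std::uint8_t USER_CODE   = 0xFA, USER_DATA   = 0xF2;
constexpr std::uint8_t TSS_32BIT   = 0x89;  // present, DPL 0, available 32-bit TSS
constexpr std::uint8_t FLAT_FLAGS  = 0xC;   // 4 KiB granularity, 32-bit segment

// Flat ring-3 segments mirror the ring-0 ones apart from the DPL bits.
static_assert(make_descriptor(0, 0xFFFFF, KERNEL_CODE, FLAT_FLAGS) == 0x00CF9A000000FFFFull, "ring 0 code");
static_assert(make_descriptor(0, 0xFFFFF, USER_CODE,   FLAT_FLAGS) == 0x00CFFA000000FFFFull, "ring 3 code");
```

A TSS descriptor would use the address of the TSS as base, its size minus one (0x67 for the plain 104-byte TSS) as limit, and a flags nibble of 0.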

Introduce Kernel stacks

For each application, the ring-0 code should execute on its own kernel stack, separate from the normal application stack. In addition to the corresponding extension of the Thread class, the Dispatcher class, which controls the switching between threads, must also be adapted to ensure that the TSS points to the kernel stack of the next thread.
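A minimal sketch of the dispatcher-side update could look as follows. The types TaskStateSegment and ThreadSketch and the function prepare_tss_for are hypothetical names chosen for illustration; only the esp0 and ss0 fields matter for the stack switch, which is why the remaining TSS fields are left unnamed here.

```cpp
#include <cstdint>

// 32-bit TSS, reduced to the fields the CPU reads when an interrupt
// moves execution from ring 3 to ring 0. The full structure is 104 bytes.
struct TaskStateSegment {
    std::uint32_t prev_task_link;
    std::uint32_t esp0;        // stack pointer loaded on entry to ring 0
    std::uint32_t ss0;         // stack segment loaded on entry to ring 0
    std::uint32_t unused[23];  // remaining fields, not needed in this sketch
};
static_assert(sizeof(TaskStateSegment) == 104, "unexpected TSS size");

// Hypothetical stand-in for the extended Thread class.
struct ThreadSketch {
    std::uint32_t kernel_stack_top;  // address one past the thread's kernel stack
};

// Called by the dispatcher right before switching to `next`, so that the
// next interrupt taken on ring 3 lands on that thread's own kernel stack.
inline void prepare_tss_for(TaskStateSegment& tss, const ThreadSketch& next,
                            std::uint16_t kernel_data_selector) {
    tss.ss0  = kernel_data_selector;
    tss.esp0 = next.kernel_stack_top;
}
```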

Initial dispatch to ring 3

Every time we switch from the kernel to user space, we have to drop our privileges so that the user cannot monopolize the CPU and the OS can always take back control (e.g., via the timer interrupt). While the hardware helps us with the switch from and to ring 3 when we switch threads via (timer) interrupts, we have to take special provisions when we start a new thread:

When dispatching to a new thread, we have to leave ring 0. For this, you have to extend the prepareContext() method. Originally, this method prepares a thread context as if the thread had run before and was just preempted by the kernel. It fakes the thread control block and the stack such that the first dispatch calls the Thread::kickoff() function.

Instead of calling the virtual Thread::action() method directly, Thread::kickoff() now has to perform the jump to ring 3 by preparing a fake stack that looks as if it had been created by a hardware interrupt that switched from ring 3 to ring 0. With this faked stack, you invoke the iret instruction, which reverts this privilege increase and thereby brings us to ring 3. This iret should jump to a newly introduced kickoff_user() trampoline function, which always executes on ring 3 and invokes the Thread::action() method.

A description of the faked interrupt stack can be found in the Intel handbook under "Exception and Interrupt Handling" (6.12). Besides that, you also have to set the segment registers to the correct user-space segments. Parameters can be passed to kickoff_user() by pushing them onto the user-space stack, just as prepareContext() does for Thread::kickoff().
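The faked frame can be sketched as five 32-bit words in the order in which iret pops them when it returns to a less privileged ring: EIP, CS, EFLAGS, ESP, SS. The function name push_iret_frame and the selector values used in the example are assumptions for illustration; the actual iret still has to be issued from assembly with the stack pointer set to the returned address.

```cpp
#include <cstdint>

// Builds the frame that iret expects for a return to ring 3, growing the
// kernel stack downwards. Returns the new kernel stack pointer, which
// points at the EIP slot (the first word iret pops).
inline std::uint32_t* push_iret_frame(std::uint32_t* kernel_sp,
                                      std::uint32_t entry,      // e.g. address of kickoff_user()
                                      std::uint32_t user_sp,    // top of the user stack
                                      std::uint16_t user_code_selector,
                                      std::uint16_t user_data_selector) {
    *(--kernel_sp) = user_data_selector;  // SS, popped last
    *(--kernel_sp) = user_sp;             // ESP
    *(--kernel_sp) = 0x202;               // EFLAGS: interrupts enabled (IF) + reserved bit 1
    *(--kernel_sp) = user_code_selector;  // CS with RPL 3
    *(--kernel_sp) = entry;               // EIP, popped first
    return kernel_sp;
}
```

Setting IF in the saved EFLAGS matters: without it, the thread would start on ring 3 with interrupts disabled and the timer could never preempt it. The selectors must carry a requested privilege level of 3 in their lowest two bits, for example 0x1B and 0x23 if the user segments sit at GDT indices 3 and 4 (an assumed layout).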

System-Call Interface

Now that we have left ring 0 successfully (this can be confirmed with gdb and monitor info registers), we have to open up a way for the application on ring 3 to have privileged operations executed in a secure manner. For this, we provide a synchronous path from ring 3 to ring 0. On top of this, we build our system-call interface.

Exception handling

While interrupts are asynchronous, x86 CPUs also let us trigger traps from software with the int instruction. However, triggering an interrupt is usually a privileged operation: otherwise, the user could advance the system time arbitrarily fast by triggering the timer interrupt in a while() { int $timer_irq; } loop. Therefore, x86's int instruction is a privileged operation that, if invoked on ring 3, provokes a General Protection Fault.

However, by manipulating an entry in the Interrupt Descriptor Table, we can allow the int instruction from user space for individual interrupt vectors (e.g., for our system-call trap number). The format of those descriptors is explained in the Intel manual in section 6.12.
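As a sketch of this manipulation: a 32-bit gate descriptor carries its own DPL field, and int n is permitted from ring 3 only if that field is 3. The helper names below are hypothetical, and the handler address and selector in the checks are made-up example values.

```cpp
#include <cstdint>

// Type/attribute byte of a 32-bit gate: present bit, DPL, gate type
// (0xE = interrupt gate, 0xF = trap gate).
constexpr std::uint8_t gate_type_attr(unsigned dpl, bool trap_gate) {
    return static_cast<std::uint8_t>(0x80u | ((dpl & 3u) << 5) | (trap_gate ? 0xFu : 0xEu));
}

// Hypothetical helper: packs handler address, code selector and
// attributes into the 64-bit layout of an IDT gate descriptor.
constexpr std::uint64_t make_gate(std::uint32_t handler, std::uint16_t code_selector,
                                  unsigned dpl, bool trap_gate = false) {
    return (static_cast<std::uint64_t>(handler) & 0xFFFFull)                  // offset 15:0
         | (static_cast<std::uint64_t>(code_selector) << 16)                  // ring-0 code segment
         | (static_cast<std::uint64_t>(gate_type_attr(dpl, trap_gate)) << 40)
         | ((static_cast<std::uint64_t>(handler) >> 16) << 48);               // offset 31:16
}

// Device interrupts keep DPL 0; only the system-call vector gets DPL 3.
static_assert(make_gate(0x00101234, 0x08, 0) == 0x00108E0000081234ull, "kernel-only gate");
static_assert(make_gate(0x00101234, 0x08, 3) == 0x0010EE0000081234ull, "user-callable gate");
```

The code selector in the gate stays the ring-0 code segment in both cases; the DPL only decides who may invoke the vector with int.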

As the system-call trap is not triggered by an external device, we must adapt the system-entry path for the chosen interrupt vector. In interrupt/handler.asm, you have to save all registers (caller- and callee-saved) in a well-defined order such that we can access the system-call parameters in our C++ code. For this, you should extend the CPU context structure (InterruptContext). With this adaptation, you can plug your system-call path into the interrupt_handler() function.
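One possible shape of the extended context is sketched below. It assumes that the assembly stub saves the general-purpose registers with a single pusha and that the vector pushes no error code (which holds for a software int); the struct and field names are illustrative and not the ones from the provided sources. The fields are listed from the lowest address upwards, so a pointer to the saved stack can be read directly as this structure.

```cpp
#include <cstddef>
#include <cstdint>

struct InterruptContextSketch {
    // Saved by the entry stub via pusha; EDI ends up at the lowest address.
    std::uint32_t edi, esi, ebp, esp_ignored, ebx, edx, ecx, eax;
    // Pushed by the CPU on the ring 3 -> ring 0 transition.
    std::uint32_t eip, cs, eflags, user_esp, user_ss;
};

static_assert(offsetof(InterruptContextSketch, eax) == 28, "pusha area is 8 words");
static_assert(offsetof(InterruptContextSketch, eip) == 32, "CPU frame follows pusha area");
static_assert(sizeof(InterruptContextSketch) == 52, "13 words in total");

// With a fixed layout, the C++ side can read the arguments by name.
inline std::uint32_t syscall_number(const InterruptContextSketch& ctx) {
    return ctx.eax;
}
```

If the stub additionally saves segment registers, they would have to appear in the structure in exactly the order in which they are pushed.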

Passing parameters

Since a system call provokes a switch to the kernel stack, the user cannot pass parameters on the (user) stack. Instead, we have to provide stub functions on the user side that load the system-call parameters into CPU registers in order to pass them across the privilege switch to ring 0. On the other side, the interrupt_entry function must store those registers on the kernel stack to make them accessible from C/C++ land. The system-call dispatcher uses one argument to determine the system call (e.g., the system-call number in eax).
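The kernel side of this scheme can be sketched as a dispatcher over the saved registers. Everything here is an assumption for illustration: the register assignment (number in eax, arguments in ebx and ecx, result returned in eax), the numbering of the calls, and the toy semaphore table that replaces real kernel objects so that the sketch stays self-contained.

```cpp
#include <cstdint>

// Hypothetical snapshot of the registers saved by the entry code.
struct SavedRegisters {
    std::uint32_t eax, ebx, ecx, edx;
};

enum SyscallNumber : std::uint32_t { SYS_SEM_INIT = 0, SYS_SEM_SIGNAL = 1 };

// Toy kernel state standing in for real semaphore objects.
static int toy_semaphores[4];

static int sys_sem_init(int semid, int value) {
    if (semid < 0 || semid >= 4) return -1;  // reject ids outside the table
    toy_semaphores[semid] = value;
    return 0;
}

static void sys_sem_signal(int semid) {
    if (semid >= 0 && semid < 4) ++toy_semaphores[semid];
}

// Selects the system call by number and writes the result back into the
// saved eax, from where the user-side stub picks it up after iret.
inline void syscall_dispatch(SavedRegisters& regs) {
    switch (regs.eax) {
    case SYS_SEM_INIT:
        regs.eax = static_cast<std::uint32_t>(
            sys_sem_init(static_cast<int>(regs.ebx), static_cast<int>(regs.ecx)));
        break;
    case SYS_SEM_SIGNAL:
        sys_sem_signal(static_cast<int>(regs.ebx));
        regs.eax = 0;
        break;
    default:
        regs.eax = static_cast<std::uint32_t>(-1);  // unknown system call
        break;
    }
}
```

The matching user-side stub would load the same registers and execute int with the chosen vector. Values arriving from ring 3 are untrusted, which is why the sketch validates the semaphore id before using it as an index.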

You are to implement the following system calls. Their concrete semantics can be adapted in a meaningful manner.

size_t write(int fd, const void *buf, size_t len, int x = -1, int y = -1)
size_t read(int fd, void *buf, size_t len)
void sleep(int ms)
int sem_init(int semid, int value)
void sem_destroy(int semid)
void sem_wait(int semid)
void sem_signal(int semid)

It is a good idea to hide the write syscall behind an OutputStream-compatible wrapper. For easier debugging, it is recommended to create CPP macros for assertions and kernel panics, which can show the error location using __LINE__, __FILE__ and __func__.