StuBS
OS Development in General

So far, you have developed many programs that run on top of an operating system and use its services, both consciously and unconsciously. Besides system calls like mmap (for malloc) or write (for printf), we have also relied on implicit features of the operating system. For example, today we no longer have to worry about our program getting stuck and freezing the entire PC, because the operating system implements process switching. It also has further isolation techniques and policies built in.

In this course, however, we want to develop an operating system ourselves. Our code therefore cannot use the services of an underlying software layer; we have to build this layer ourselves first. This results in some changes compared to application development on top of an operating system, which are explained here.

The programming environment, however, remains more or less identical; an IDE can still be used. Only the build system must be adapted so that building the operating system results in a bootable system image. We use C++ for this course, although other languages would also be possible, as long as they allow us to express all the low-level constructs that C and C++ offer.

Loading and booting the operating system

On Linux, the OS opens an ELF (Executable and Linking Format) file, loads certain parts of it into memory and starts executing our program in a separate process. This is done in such a way that other processes are not disturbed by our process.

For us, there is no operating system below our operating system that could load our code. There is, however, the boot loader, which can help us with that. The boot loader is a piece of software that the BIOS (or UEFI) loads from specific sectors of the hard disk. Its task is to load the memory image of the operating system and jump to it, that is, to start executing the operating system code. Modern boot loaders, like GRUB, even understand a variant of the ELF format, which makes the OS developer's job much easier.
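
As one illustration of how a kernel image can identify itself to a boot loader, the following sketch builds the first three fields of a Multiboot 1 header, a format that GRUB accepts. Whether the boot code of this course uses exactly this header is an assumption, and the names MultibootHeader, make_header and multiboot_header are made up for this example.

```cpp
#include <cstdint>

// First three (mandatory) fields of a Multiboot 1 header.
struct MultibootHeader {
    std::uint32_t magic;     // must be 0x1BADB002
    std::uint32_t flags;     // features requested from the boot loader
    std::uint32_t checksum;  // chosen so that magic + flags + checksum == 0 (mod 2^32)
};

constexpr std::uint32_t MULTIBOOT_MAGIC = 0x1BADB002u;
// Bit 0: align boot modules on page boundaries, bit 1: provide memory information.
constexpr std::uint32_t MULTIBOOT_FLAGS = (1u << 0) | (1u << 1);

// Hypothetical helper: computes the checksum for a given set of flags.
constexpr MultibootHeader make_header(std::uint32_t flags) {
    return MultibootHeader{MULTIBOOT_MAGIC, flags,
                           static_cast<std::uint32_t>(0u - (MULTIBOOT_MAGIC + flags))};
}

// In a real kernel, a linker script would place this object near the start of
// the image so the boot loader can find it; that placement is omitted here.
constexpr MultibootHeader multiboot_header = make_header(MULTIBOOT_FLAGS);

static_assert(static_cast<std::uint32_t>(multiboot_header.magic + multiboot_header.flags +
                                         multiboot_header.checksum) == 0u,
              "Multiboot checksum must cancel magic and flags");
```

Computing the checksum at compile time means a wrong header is rejected by the compiler rather than silently ignored by the boot loader.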

Nevertheless, in C++ there are still some tasks that the OS code must perform to initialize the C++ runtime environment. For example, before the main() function starts, the constructors of all global objects must be executed so that their members have the correct values. In addition, the operating system still has to start up the other cores, because up to this point only one core (the boot processor) of a multi-core system is active.
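
The constructor step can be sketched as a loop over a table of function pointers. With GNU toolchains, such a table is typically collected in the .init_array section and delimited by linker-provided symbols such as __init_array_start and __init_array_end; that this course's startup code works exactly this way is an assumption, and run_global_constructors is a hypothetical name.

```cpp
// One entry in the constructor table: a function taking no arguments.
using Constructor = void (*)();

// Calls every constructor in the half-open range [begin, end) in order.
// A kernel would pass the table boundaries provided by the linker script.
inline void run_global_constructors(Constructor* begin, Constructor* end) {
    for (Constructor* ctor = begin; ctor != end; ++ctor) {
        (*ctor)();
    }
}
```

Startup code would call such a function once on the boot processor, before main() is entered.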

Dynamic memory management for the operating system

Ordinary applications can be developed in higher-level languages such as Go or Python. These languages hide many unsightly or annoying problems from the programmer, one of which is memory management. But where does the memory for a new object actually come from?

The Python interpreter, which is also just an application on e.g. Linux, asks the operating system to provide it with more memory. This could happen for example with the mmap or brk system call.

However, we want to develop an operating system on our own, i.e. there is no underlying software layer that can make memory available to us. Providing our own memory management for the operating system is a lot of work and is only done in the course Operating System Techniques.

Nevertheless, since we still require memory for our objects, that memory has to come from somewhere. We found out earlier that the boot loader simply loads our system image into memory. So we can preallocate areas in the image and fill them with content later at runtime. To do this, we declare some variables and objects globally, which the compiler then places in the data segment. These objects can then be accessed as usual. For example:

class Keyboard {
    // Keyboard class goes here
};

Keyboard kbrd;

int main() {
    kbrd.plugin();
}
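
The same idea extends from single objects to whole memory regions. The following is a minimal sketch of a fixed-capacity arena that lives inside the kernel image and hands out pieces of itself at runtime. StaticArena and kernel_arena are hypothetical names that are not part of the course code, and memory is never returned to the arena.

```cpp
#include <cstddef>

// Fixed-size memory region with a simple "bump" allocation scheme.
template <std::size_t Capacity>
class StaticArena {
    alignas(16) unsigned char storage_[Capacity] = {};
    std::size_t used_ = 0;

public:
    // Returns size bytes aligned to align (a power of two, at most 16),
    // or nullptr if the arena is exhausted.
    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > Capacity || size > Capacity - start) {
            return nullptr;
        }
        used_ = start + size;
        return storage_ + start;
    }

    std::size_t used() const { return used_; }
};

// Global object: its storage is part of the loaded system image.
static StaticArena<4096> kernel_arena;
```

Because the capacity is fixed at build time, running out of memory shows up as a nullptr result that the caller has to check.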

Other objects can simply be placed on the stack. However, the stack is also pre-allocated, cannot grow, and is limited to 4 KiB, so recursive functions can only be used to a very limited depth. Objects and variables are automatically placed on the stack when they are declared within a function. They remain valid only until the function returns, so a pointer to an object on the stack must be handled with care and must not be used after the function has returned.

void foobar(int *value);

void foo() {
    // bar is placed on the stack
    int bar = 42;
    foobar(&bar);
}
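
One way to avoid pointers to dead stack objects is to let the caller own the storage. In the sketch below, fill_squares and sum_squares are made-up names: the array lives in the caller's stack frame and is only used while that frame is still alive.

```cpp
#include <cstddef>

// Fills a caller-provided buffer instead of returning a pointer to a local array.
// Returns the number of elements written.
std::size_t fill_squares(int* out, std::size_t capacity) {
    std::size_t i = 0;
    for (; i < capacity; ++i) {
        out[i] = static_cast<int>(i * i);
    }
    return i;
}

int sum_squares() {
    int squares[8];  // lives on the stack of sum_squares()
    std::size_t n = fill_squares(squares, 8);
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += squares[i];
    }
    return sum;  // only the value leaves the function, not a pointer to squares
}
```

The loop also replaces what could have been a recursive formulation, which keeps the stack usage constant.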

No C Library

The C library is a very basic library in the C world that performs many simple tasks. For example, it provides string functions and printf. The libc implementations found on operating systems, for example glibc, rely on the memory management of an operating system and on certain system calls, like write. Without an operating system beneath us, there is no implementation of this core functionality.

In this course, there is no libc and no C++ standard template library (STL); we only provide some basic functionality, all of which is described in the documentation.
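
To give an impression of what such basic functionality can look like, here is a minimal sketch of two string and memory helpers written without any library support. The names kstrlen and kmemcpy are hypothetical; the helpers actually provided in this course may have different names and signatures.

```cpp
#include <cstddef>

// Counts the characters before the terminating '\0'.
std::size_t kstrlen(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

// Copies n bytes from src to dest; the regions must not overlap.
void* kmemcpy(void* dest, const void* src, std::size_t n) {
    auto* d = static_cast<unsigned char*>(dest);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
    return dest;
}
```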

Debugging - or "Why did the CPU reset?"

Debugging is a very important part of program development. On an operating system it can be done easily with tools like GDB or valgrind. With GDB, for example, it is easy to inspect the values of all variables used.

This is still possible in operating system development thanks to some good emulators with good debugging support, but debugging on real hardware is very time consuming. In addition, one must always expect the real hardware to behave slightly differently from the emulator.

Debugging on real hardware is not possible without additional effort. Theoretically, a GDB stub could be implemented in software, e.g. controlled via a serial interface, but this is not done in this course. So on real hardware the only practical option is to check the outputs and to draw conclusions from them as to why the processor resets or why processes get stuck.
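
Since output is the main debugging aid on real hardware and printf is not available, a small formatting helper is useful. The following sketch renders a 64-bit value, such as an address or a register content, as hexadecimal text. The name format_hex is made up, and where the resulting string is sent (screen, serial line) is deliberately left out.

```cpp
#include <cstdint>

// Writes value as "0x" followed by 16 hex digits and a terminating '\0'.
// buf must provide room for at least 19 characters.
void format_hex(std::uint64_t value, char* buf) {
    static const char digits[] = "0123456789abcdef";
    buf[0] = '0';
    buf[1] = 'x';
    for (int i = 0; i < 16; ++i) {
        // Extract the nibbles from most significant to least significant.
        unsigned shift = static_cast<unsigned>(60 - 4 * i);
        buf[2 + i] = digits[(value >> shift) & 0xF];
    }
    buf[18] = '\0';
}
```

The fixed width makes successive outputs line up, which helps when comparing values across several runs.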

No isolation of program processes

Under Linux, all processes are isolated from each other, i.e. a process cannot cause another one to hang and stop making progress (apart from deliberate interactions between processes, such as communication). This also allows multiple cores to be used by different processes without trouble.

However, when we write an operating system ourselves, this isolation does not exist yet. So it can easily happen that one core changes a data structure while another core accesses it, which leads to inconsistencies. This means that we must place locks in the operating system carefully, so that several cores do not get in each other's way.
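
A very simple lock of this kind is a spinlock. The sketch below is built on std::atomic_flag from the <atomic> header; it assumes that this header is usable in the kernel build, and the class name Spinlock is chosen for illustration only. The lock classes used in this course may look different.

```cpp
#include <atomic>

class Spinlock {
    std::atomic_flag locked_ = ATOMIC_FLAG_INIT;

public:
    // Busy-waits until the flag could be set by this core.
    void lock() {
        while (locked_.test_and_set(std::memory_order_acquire)) {
            // another core holds the lock; keep trying
        }
    }

    // Attempts to take the lock once; returns true on success.
    bool try_lock() {
        return !locked_.test_and_set(std::memory_order_acquire);
    }

    // Releases the lock so that a waiting core can proceed.
    void unlock() {
        locked_.clear(std::memory_order_release);
    }
};
```

A core would call lock() before touching a shared data structure and unlock() afterwards, keeping the protected section as short as possible because waiting cores burn cycles.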

In addition, our program flow can be interrupted at any time by device interrupts. This is irrelevant for an application on Linux, because Linux handles these interrupts transparently and then returns to the application. Linux itself, however, must also be careful in its interrupt handlers not to destroy any data structures that are (or could be) in use.