The cat on the tip of the iceberg

☃️
Git-Repository: Template Solution Solution-Diff (Solution is posted at 18:00 CET)
Workload: 46 lines of code
Important System-Calls: open(2), read(2), close(2)

Welcome to the Christmas village, where the ELFs (see elf(5)) live and work together to prepare everything for Christmas. While they are Santa's little helpers, they have a rich culture of their own. They are open to new technologies, but they also have a rather strong technocratic culture that might sometimes irritate you. Their favorite operating system is, of course, Linux, and they are also into low-level systems hacking to squeeze the last bit of performance out of their machines. While cooling is no problem at the North Pole, getting enough energy from renewables is hard up there. Also, they often find themselves in a NIH (Not Invented Here) situation, where they solve problems that others have already solved. We are guests in the village for the next 24 days, so this is a good opportunity to get some in-depth knowledge about Linux and its specifics. Come and join me in our stay at the Christmas village, where the ELFs like invoking system calls as much as gluing together gifts for the children of the world.

Introduction

System calls are commands, like read! or write!, for our operating system. They are verbs in the imperative. But it is not enough to order an action; we also have to specify the object on which the action should be performed. And while integers or strings make good arguments, we often have to address some stateful object within the operating system. I am talking about file descriptors (FDs).

First of all, I have to tell you something. But it has to be in secret... so come closer... closer. OK! Now it's good. File descriptors are not (only) about files anymore. They are a much more general concept that we use to address kernel objects from user space. On the surface, for our program, they are just integers. For example, have you already heard that 1 is usually the standard output of a program? But what happens if we use 1 with, for example, write(2)?

write(1, "foo", 3)

When the OS receives those three arguments, it consults the file-descriptor table (struct fdtable) of the current process and uses the given integer as an index to retrieve a struct file. However, as you might already have guessed, these objects are not necessarily open files; they can represent all sorts of in-kernel objects that come with a struct file interface. For example, like our standard output, they can be a terminal or, as we will see later, point to another process.

Think of a file descriptor as a handle in user space that we can grab, store, and pass around in our program, while an in-kernel object is connected to it. Or, to put it differently, the file descriptor is just the tip of the iceberg.

Task

We take a soft start in our Advent(2), and today's task is to write a simple version of the program cat(1). cat takes N file names as command-line arguments, concatenates their contents, and writes the result to its standard output.

For this task, it is not necessary to use malloc(3); instead, you can get away with a globally allocated char buffer[4096] array.

System Calls

  • open(2): The open system call takes a filename and some flags and opens the file. Thereby, it creates a new struct file in the kernel, finds a free slot in the struct fdtable, and returns its non-negative index (0 is a valid file descriptor). If the operation fails, you get -1 and errno describes the error.
  • read(2): Reads data, up to count bytes, from the file into the given buffer and returns the number of bytes actually read; a return value of 0 signals the end of the file. The kernel remembers the "current position" of the open file in the struct file (f_pos). If you want to read at an arbitrary position, you should use pread(2) or reposition the file pointer with lseek(2).
  • close(2): Removes the fdtable entry and deallocates the struct file if we were the last owner. Quiz: Can multiple entries of fdtable point to the same struct file? (hint: yes).