Last Christmas I Gave you my Letter
Workload: 139 lines of code
Important System-Calls: getdents64(2), statx(2)
Recommended Reads:
- Dec. 1: The cat on the tip of the iceberg file 45 lines [open(2), pread(2), close(2)]
Oh no, look at all those letters the children have written, now that
they have so many options to deliver them. There are even many who
really crammed their wish lists into 32 bits and used the sigqueue
path. But initially, there was a configuration error and this year's
wish lists were mixed with those wish lists from last year. And now we
have a big pile of unsorted letters. Can you help the ELFs to find
those letters that were created between last year's Christmas and now?
Find last year's letters
Have you ever wondered how the first Unix command-line tool that you were taught is implemented?
How does ls(1) work?
In your basic OS course you might have used opendir(3) and readdir(3), but as their manual section (3) already tells us, those are convenience wrappers around the actual system calls.
A short look at the musl source code of readdir [musl: src/dirent/readdir.c] shows that the DIR * handle has an internal buffer that is refilled with getdents(2) whenever it has been consumed.
It is also this buffer that makes readdir on the same DIR * object not thread-safe and justifies the existence of readdir_r(3) [musl: src/dirent/readdir_r.c].
Please have a look at the discussion there on when to use it.
So, how do we use getdents and its modern 64-bit-safe variant getdents64(2)? First, we have to get our hands on a file descriptor for the directory. But wait! A file descriptor for a directory, isn't that a contradiction in itself? Aren't files and directories two different concepts? Yes and no. In the file system, a directory is a special kind of file (with the d flag) that contains a list of directory entries, which map symbolic names to inode(7) numbers. On older Unixes, you could even use cat to read those entries.
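To make this a bit more concrete, here is a minimal sketch of such a directory walk. It assumes Linux and goes through syscall(2) directly, so it does not depend on the libc shipping a getdents64() wrapper. The record layout is copied from the getdents(2) man page; the helper name list_dir is made up for this illustration.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout the kernel writes into our buffer, see getdents(2). */
struct linux_dirent64 {
    uint64_t       d_ino;    /* inode number */
    int64_t        d_off;    /* offset to the next record */
    unsigned short d_reclen; /* length of this record in bytes */
    unsigned char  d_type;   /* file type (DT_REG, DT_DIR, ...) */
    char           d_name[]; /* NUL-terminated file name */
};

/* Hypothetical helper: print inode and name of every entry in `path`.
 * Returns the number of entries, or -1 on error. */
int list_dir(const char *path)
{
    /* O_DIRECTORY makes open() fail if `path` is not a directory. */
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    _Alignas(struct linux_dirent64) char buf[4096];
    int count = 0;
    for (;;) {
        /* Each call fills the buffer with as many records as fit. */
        long n = syscall(SYS_getdents64, fd, buf, sizeof(buf));
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0) /* end of directory */
            break;
        /* Records have variable length; d_reclen leads to the next one. */
        for (long pos = 0; pos < n;) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
            printf("%10llu  %s\n", (unsigned long long)d->d_ino, d->d_name);
            pos += d->d_reclen;
            count++;
        }
    }
    close(fd);
    return count;
}
```

Calling list_dir(".") should print one line per entry of the current directory, usually including the "." and ".." entries.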
But for our task, we have another problem: When was a file "born", and what does it even mean to give birth to a file? From your intense read of the stat(2) man page, you might remember that Unix stores three timestamps for each file:
- st_atime: Time of last access (e.g., file data was read)
- st_mtime: Time of last modification (=file data was changed)
- st_ctime: Time of last status change (e.g., the last chmod)
For a detailed discussion of these, you can stop by GeeksforGeeks.
But still, there is no birth time.
Well, birth time is something that Linux file systems learned from NTFS's creation time.
However, the creation time could not be retrofitted into the interface of stat(), so a new system call became necessary: statx(2)! I wonder why they did not choose stat2 or stat64? Wait, there is a stat64(2)... Never mind, thank you Linux.
On the command line, you can get a file's creation time with stat(1):
    $ stat letters.c
      File: letters.c
      Size: 7026       Blocks: 16         IO Block: 4096   regular file
    Device: 10304h/66308d   Inode: 44308191    Links: 1
    Access: (0644/-rw-r--r--)  Uid: (10104/stettberger)   Gid: (10150/stettberger)
    Access: 2022-10-06 13:32:19.997267197 +0200
    Modify: 2022-07-25 16:41:47.490375295 +0200
    Change: 2022-07-25 16:41:47.490375295 +0200
     Birth: 2022-07-25 16:37:27.588870684 +0200
So as you see, I created today's exercise in July 2022. Interestingly, there is no system call to manipulate the birth time of a file, which makes it an interesting property for IT forensics.
Task
Complete the letters tool, which uses getdents64(2) to
iterate over all files in a directory. It should keep only those files
that were "born" after last year's Christmas. You can extract a
file's birth time with statx(2). Additionally, you should
calculate the number of days between the letter's birth time and the
next Christmas and sort the output from the oldest to the newest
letter. An example output of the letters tool could look like this:
    $ ./letters
    151 days: Makefile
    151 days: index.md
    11 days: letters
    11 days: letters.c
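The day count could be computed roughly as follows. Note the assumptions: "Christmas" is taken to be midnight of December 25 in UTC, partial days are rounded down, and the function name days_until_christmas is made up. The exercise itself may define the date or the rounding differently.

```c
#define _GNU_SOURCE
#include <time.h>

/* Hypothetical helper: whole days between `birth` and the next
 * December 25, 00:00 UTC (assumed definition of "Christmas"). */
long days_until_christmas(time_t birth)
{
    struct tm tm;
    gmtime_r(&birth, &tm);

    /* Christmas of the birth year; tm_mon counts from 0, so 11 = December. */
    struct tm xmas = { .tm_year = tm.tm_year, .tm_mon = 11, .tm_mday = 25 };
    time_t target = timegm(&xmas);

    if (target < birth) {
        /* Already past this year's Christmas: take next year's. */
        xmas = (struct tm){ .tm_year = tm.tm_year + 1,
                            .tm_mon = 11, .tm_mday = 25 };
        target = timegm(&xmas);
    }
    return (long)((target - birth) / (24 * 60 * 60));
}
```

timegm() is a non-standard but widely available counterpart of mktime() that interprets the struct tm as UTC, which keeps the result independent of the local time zone.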
Hints
- You have to request the birth time explicitly by passing STATX_BTIME to the statx() call.
- Directories have to be opened with O_DIRECTORY.
- Place the filtered letters in a dynamically growing array (realloc(3)) and sort them with qsort(3).
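The last hint can be sketched like this. All type and function names (struct letter, struct letters, letters_push, letters_sort) are invented for the illustration, and doubling the capacity is just one common growth strategy.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>

struct letter {
    char     *name;   /* heap copy of the file name */
    long long btime;  /* birth time in seconds since the epoch */
};

struct letters {
    struct letter *items;
    size_t len, cap;
};

/* Append one letter and grow the array if it is full.
 * Returns 0 on success, -1 if memory ran out. */
int letters_push(struct letters *ls, const char *name, long long btime)
{
    if (ls->len == ls->cap) {
        size_t cap = ls->cap ? ls->cap * 2 : 16;
        struct letter *p = realloc(ls->items, cap * sizeof(*p));
        if (!p)
            return -1; /* the old array is still valid */
        ls->items = p;
        ls->cap = cap;
    }
    /* Copy the name: the getdents64 buffer is overwritten by the next call. */
    char *copy = strdup(name);
    if (!copy)
        return -1;
    ls->items[ls->len++] = (struct letter){ .name = copy, .btime = btime };
    return 0;
}

/* qsort(3) comparator: smaller birth time (older letter) first. */
static int by_birth(const void *a, const void *b)
{
    const struct letter *x = a, *y = b;
    return (x->btime > y->btime) - (x->btime < y->btime);
}

void letters_sort(struct letters *ls)
{
    if (ls->len > 1)
        qsort(ls->items, ls->len, sizeof(*ls->items), by_birth);
}
```

Starting from a zero-initialized struct letters, every matching directory entry is pushed, and a single letters_sort() call at the end yields the oldest-to-newest order. The comparator avoids the tempting `return x->btime - y->btime`, which could overflow or be truncated when converted to int.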