A Trace of Cookie Crumbs

☃️
Git-Repository: Template Solution Solution-Diff (Solution is posted at 18:00 CET)
Workload: 93 lines of code
Important System-Calls: ptrace(2)
Recommended Reads:

Somehow, the scheme to observe ELFs with problematic behaviors did not go as well as initially expected. An ELF got away and, enraged by the installed surveillance system, "borrowed" all cookies from the cookie-gift storage and ran away. Without cookies, this year's Christmas Eve would be very disappointing for children and grown-ups alike. Therefore, we have to find the malicious ELF and bring back all the cookies. Luckily, the ELF, hungry from all the fleeing, eats one of the cookies from time to time and leaves some cookie crumbs for us to find. We only have to follow this trace and everything will be fine.

Process Trace

One core functionality of operating systems is to isolate processes from each other so that one process cannot directly control the behavior of another. With inter-process communication (IPC) primitives (see Santa's Postbox), this strict isolation can be relaxed in a controlled and limited manner: one process can send messages to other processes or communicate with them via shared memory. In fact, message passing and shared memory are duals of each other, and your much-loved shared-memory multi-core system uses a message-passing-based mechanism for its cache coherence.

However, sometimes this is not enough and we really need full control over another process. The canonical use case is debugging. The debugger must completely control the debugged process: it must be able to stop it, continue it, read and write its memory and registers, and even intercept the system calls it issues. Only on the basis of these operations can high-level concepts like breakpoints and watchpoints be implemented. To enable this, the OS must provide a special interface with strict access control that gives one process control over another. On Unix, this interface is ptrace(2).

With ptrace(), the "tracer" process gets control over the "tracee", manipulates its state, and becomes able to wait for events (e.g., the next system call). To provide some security safeguards, the tracee must first agree to be traced with PTRACE_TRACEME (unless the tracer has the general ptrace capability). But when should this approval happen? The answer is the twilight zone between fork() and exec:

pid_t tracee;
if ((tracee = fork()) == 0) {
    // In child: twilight-zone starts
    ptrace(PTRACE_TRACEME,....);
    execvp(....);
    // Never reached
}
// debug the tracee

For more details, I refer you to ptrace(2).

strace

A standard tool that you have already used extensively in Advent(2) is strace(1) (system-call trace). With strace, we can record all system calls that a process issues. For example:

$ strace echo
execve("/bin/echo", ["echo"], 0x7ffea5aadfc0 /* 82 vars */) = 0
[... dynamic loader and libc initialization ..]
write(1, "\n", 1)                       = 1
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

But how is strace implemented? As it turns out, we can use strace to strace strace:

$ strace strace echo
execve("/usr/bin/strace", ["strace", "echo"], 0x7ffc7ab5a228 /* 82 vars */) = 0
[... initialization ...]
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f44abefca10) = 818523
wait4(818523, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], 0, NULL) = 818523
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=818523, si_uid=10104, si_status=SIGSTOP, si_utime=0, si_stime=0} ---
ptrace(PTRACE_SETOPTIONS, 818523, NULL, PTRACE_O_TRACESYSGOOD) = 0
ptrace(PTRACE_GET_SYSCALL_INFO, 818523, 88, {op=PTRACE_SYSCALL_INFO_NONE, arch=AUDIT_ARCH_X86_64, instruction_pointer=0x7f44abc3dd17, stack_pointer=0x7ffd124219a8}) = 24
[... wait for the next syscall ...]
ptrace(PTRACE_SYSCALL, 818523, NULL, 0) = 0

Not very surprisingly, strace is implemented on top of ptrace(): It starts a child and uses PTRACE_O_TRACESYSGOOD and PTRACE_SYSCALL to wait for issued system calls. ptrace(PTRACE_SYSCALL, ...) resumes the child until its next system call. At that point the child is stopped and the tracer's wait4() returns. strace then prints the system call name and arguments and continues the child. Actually, this happens twice per system call: once before the syscall is executed, to give the tracer the chance to manipulate the system call arguments, and once when the syscall returns.

Today's task is to write your own simplified version of strace.

Hints