Lua as a General-Purpose Extension Language in Linker Scripts for Embedded Systems

After preprocessing, compilation, and assembly, the linker (e.g. GNU ld or LLVM's lld) collects all object files and binds them together into a single executable file. During this process, the linker extracts the sections (e.g. function or variable sections) from each object file, filters and arranges them into segments, and writes them to the final executable.

SECTIONS {
  . = 0x10000; /* At this virtual address ... */
  .text : { *(.text) } /* ... place the concatenation of the .text sections of all inputs. */
  . = 0x8000000; /* At this virtual address ... */
  .data : { *(.data) } /* ... place the concatenation of the .data sections of all inputs. */
  .bss : { *(.bss) } /* At the next available virtual address, place the .bss sections. */
}
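To make the filtering and arrangement steps concrete, the following C++ sketch models what the script above asks the linker to do. It is a simplified illustration, not lld's implementation: the types InputSection and OutputSection and the functions place and layout are hypothetical names introduced here, and alignment, segments, and wildcard matching on file names are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified model of a section taken from an object file.
struct InputSection {
    std::string file;   // object file the section came from
    std::string name;   // e.g. ".text", ".data", ".bss"
    std::uint64_t size; // size in bytes
};

// Hypothetical model of a section in the final executable.
struct OutputSection {
    std::string name;
    std::uint64_t address = 0;
    std::uint64_t size = 0;
    std::vector<const InputSection*> members; // inputs in placement order
};

// Models `name : { *(name) }`: select every input section called `name`,
// concatenate the matches starting at the location counter `dot`,
// and advance `dot` past them. Alignment is ignored in this sketch.
OutputSection place(const std::vector<InputSection>& inputs,
                    const std::string& name, std::uint64_t& dot) {
    assert(!name.empty() && "a pattern must name a section");
    OutputSection out;
    out.name = name;
    out.address = dot;
    for (const InputSection& sec : inputs) {
        if (sec.name != name)
            continue;                // filtering step
        out.members.push_back(&sec); // arrangement step
        out.size += sec.size;
    }
    dot += out.size;
    return out;
}

// Mirrors the three output-section statements of the example script.
// The returned sections point into `inputs`, which must outlive them.
std::vector<OutputSection> layout(const std::vector<InputSection>& inputs) {
    std::vector<OutputSection> result;
    std::uint64_t dot = 0x10000;                   // . = 0x10000;
    result.push_back(place(inputs, ".text", dot)); // .text : { *(.text) }
    dot = 0x8000000;                               // . = 0x8000000;
    result.push_back(place(inputs, ".data", dot)); // .data : { *(.data) }
    result.push_back(place(inputs, ".bss", dot));  // .bss at the next free address
    return result;
}
```

Calling layout with the sections of two object files yields a .text output section at 0x10000 holding both .text inputs back to back, a .data section at 0x8000000, and a .bss section directly behind it.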

This process, i.e. the filtering and the arrangement, can be controlled via the linker's "Command Language". Programs written in this language, an obscure, implementation-defined domain-specific language, are inexpressive and painful to write for anything more complex than the example above. While linker scripts are mainly used in embedded systems, where the memory layout is most peculiar, their obscurity also makes it hard to experiment with link-time methods in other contexts.

As a way out of this misery, this thesis should explore the use of Lua as an embedded scripting language. To this end, the student should extend the linker script language to allow Lua snippets in various places, and extend the LLVM linker lld's processing of those scripts by integrating calls to Lua at the appropriate places in the existing C++ code. An example linker script with Lua snippets (delimited by @) could look like this:

SECTIONS {
  . = @{ return 1 << 16; }@; /* Get a value from a Lua lambda. */
  .text : { *( @{ (sec) return sec.name == ".text" }@ ) } /* Filter all input sections through a lambda. */
  . = @<./utils.lua:getAddr>@; /* Get a value by calling a function exported by an external Lua script. */
  .data : { @{ return filter(function(sec) return sec.name == ".data" end, inSections) }@ } /* Get sections from a lambda. */
  @{ -- Have a top-level function ...
    symbols.dot = 0xbeef; -- ... define new or assign to symbols
    outSections.foo = Section.new(...); -- ... create sections, etc.
  }@
  .bss : { *(.bss) } /* And mix the above with normal linker script instructions. */
}
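A plausible first step towards such an extension is to teach the script lexer to recognise the snippet delimiters. The following C++ sketch shows one way this could be done. It assumes the delimiters @{ ... }@ and @< ... >@ from the example above, and it assumes that the closing delimiter never occurs inside the Lua code itself (for instance in a string literal). The names Kind, Piece, and splitScript are hypothetical and not part of lld.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical classification of the parts of an extended linker script.
enum class Kind {
    Script,     // ordinary linker script text
    LuaInline,  // body of @{ ... }@
    LuaExternal // body of @< ... >@, e.g. "./utils.lua:getAddr"
};

struct Piece {
    Kind kind;
    std::string text; // raw text without the delimiters
};

// Splits `src` into ordinary script text and embedded Lua snippets.
// Throws std::runtime_error if a snippet is opened but never closed.
std::vector<Piece> splitScript(const std::string& src) {
    std::vector<Piece> pieces;
    std::size_t pos = 0;
    while (pos < src.size()) {
        // Look for the next opener, "@{" or "@<".
        std::size_t at = src.find('@', pos);
        while (at != std::string::npos &&
               !(at + 1 < src.size() &&
                 (src[at + 1] == '{' || src[at + 1] == '<')))
            at = src.find('@', at + 1);

        if (at == std::string::npos) {
            // No further snippets: the rest is ordinary script text.
            pieces.push_back({Kind::Script, src.substr(pos)});
            break;
        }
        if (at > pos)
            pieces.push_back({Kind::Script, src.substr(pos, at - pos)});

        const bool isInline = src[at + 1] == '{';
        const std::string closer = isInline ? "}@" : ">@";
        const std::size_t end = src.find(closer, at + 2);
        if (end == std::string::npos)
            throw std::runtime_error("unterminated Lua snippet");

        pieces.push_back({isInline ? Kind::LuaInline : Kind::LuaExternal,
                          src.substr(at + 2, end - (at + 2))});
        pos = end + closer.size();
        assert(pos <= src.size()); // the cursor never runs past the input
    }
    return pieces;
}
```

Each LuaInline piece could then be handed to an embedded Lua interpreter, and each LuaExternal piece resolved to a file and a function name, while the Script pieces continue through the existing parser. A robust implementation would additionally have to track Lua string literals and comments so that a stray }@ inside them does not end the snippet early.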