The other day at work, I was looking at a shared object library that I could overwrite, and I was wondering what the best way to gain code execution through it would be. The main constraint was that all of the library’s normal functionality had to keep working.

I learned something interesting while researching this subject. A shared object is generally mapped into a process’ memory space via mmap, and a file-backed mmap mapping can pick up changes that are made to the underlying file on disk after the mapping was created.

Looking at man 2 mmap, the docs specifically say:

It is unspecified whether changes made to the file after the mmap() call are
visible in the mapped region.

That sentence comes from the description of the MAP_PRIVATE flag, which is the kind of mapping the dynamic loader uses for shared objects.

On Pop!_OS 22.04 LTS, the mapping will be updated when the shared object file is changed on disk. I tested this with the following two files.

// thing.c

#include <unistd.h>

extern int function(void);

int main() {
  while (1) {
    function();
    sleep(1);
  }
}

// test-lib.c

#include <stdio.h>

static int call_count = 0;

int function(void) {
  call_count++;
  printf("Called %d times\n", call_count);
  return call_count;
}

clang -shared -fPIC -Og -ggdb3 -o libtest.so test-lib.c
clang -Og -o test thing.c -L/tmp -ltest -Wl,-rpath=/tmp

Running ./test and then modifying libtest.so while it is still running kills the process; the shell reports './test' terminated by signal SIGBUS (Misaligned address error). The process never wrote to the library itself, so the change on disk must have reached the pages it had mapped rather than staying confined to the file.
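A SIGBUS on a file-backed mapping is what the kernel delivers when a process touches a mapped page that now lies beyond the end of the file (man 2 mmap lists this case). That would happen if whatever rewrites libtest.so truncates it first. Whether that is exactly what happened in the run above is an assumption. The sketch below only reproduces the mechanism in isolation, without the dynamic loader. The helper name truncation_raises_sigbus and the temporary file path are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_jump;

/* SIGBUS handler: jump back to the point saved by sigsetjmp(). */
static void on_sigbus(int sig) {
    (void)sig;
    siglongjmp(sigbus_jump, 1);
}

/* Hypothetical helper. Returns 1 if reading a private file mapping after the
 * file was truncated raised SIGBUS, 0 if the read succeeded, -1 on setup error. */
int truncation_raises_sigbus(void) {
    char path[] = "/tmp/mmap-sigbus-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path); /* the file stays alive until fd is closed */

    long page = sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, page) != 0) { close(fd); return -1; }

    /* Same kind of mapping the loader uses: private and file-backed. */
    void *addr = mmap(NULL, (size_t)page, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED) { close(fd); return -1; }
    volatile const char *map = addr;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0) {
        munmap(addr, (size_t)page);
        close(fd);
        return -1;
    }

    int result;
    if (sigsetjmp(sigbus_jump, 1) == 0) {
        char before = map[0];        /* works: the page is inside the file */
        (void)before;
        if (ftruncate(fd, 0) != 0) { /* shrink the file under the mapping */
            result = -1;
        } else {
            char after = map[0];     /* the page is now past end of file */
            (void)after;
            result = 0;
        }
    } else {
        result = 1;                  /* we got here through the SIGBUS handler */
    }

    sigaction(SIGBUS, &old, NULL);
    munmap(addr, (size_t)page);
    close(fd);
    return result;
}
```

Calling truncation_raises_sigbus() from a small main and printing the result is enough to try it out. A real process like ./test has no such handler installed, so the same fault simply terminates it.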

I can’t find the reference I was looking at earlier, but I remember reading that this behavior only applies to pages that have not yet been copied on write. Once the process writes to a page of a private mapping, Linux gives the process its own copy of that page. Those writes are never reflected in the file, and later writes to the file no longer show up in that page either.

That means the code and constant data sections of a mapped library reflect changes to the backing file, while the sections containing writable data generally do not. Those writable sections will most likely have been written to, and therefore copied, before anyone gets the chance to modify the underlying file.
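The difference between touched and untouched pages can be observed with a small experiment. The following is a minimal sketch, assuming Linux and a writable /tmp; the struct and function names are made up for illustration. It maps a two-page file privately, writes to the second page through the mapping, then rewrites the whole file and looks at what the mapping shows.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical result type: the first byte of each page as seen through the
 * mapping after the backing file has been rewritten. */
struct cow_result {
    char untouched_page; /* page that was never written through the mapping */
    char touched_page;   /* page that was written through the mapping first */
};

/* Hypothetical helper. Returns 0 on success and -1 on a setup error. */
int observe_private_mapping(struct cow_result *out) {
    char path[] = "/tmp/mmap-cow-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path); /* the file stays alive until fd is closed */

    long page = sysconf(_SC_PAGESIZE);
    size_t len = 2 * (size_t)page;
    int rc = -1;
    char *map = MAP_FAILED;
    char *buf = malloc(len);
    if (!buf) goto out;

    /* The file starts out as two pages of 'A'. */
    memset(buf, 'A', len);
    if (pwrite(fd, buf, len, 0) != (ssize_t)len) goto out;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) goto out;

    /* Writing through the mapping makes the kernel copy the second page. */
    map[page] = 'P';

    /* Now rewrite both pages of the file on disk with 'B'. */
    memset(buf, 'B', len);
    if (pwrite(fd, buf, len, 0) != (ssize_t)len) goto out;

    out->untouched_page = map[0];   /* POSIX leaves this unspecified */
    out->touched_page = map[page];  /* private copy, still holds 'P' */
    rc = 0;

out:
    if (map != MAP_FAILED) munmap(map, len);
    free(buf);
    close(fd);
    return rc;
}
```

The touched page keeps the value written through the mapping. What the untouched page shows is the unspecified part: on Linux I would expect it to follow the file, but the code deliberately does not depend on that.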

With a file write primitive, you could overwrite the shared object with one that contains different code, and that code would run the next time the process calls a function from the shared object. It might also be possible to change a vtable pointer or some other function pointer stored in the shared object so that it points somewhere else.

Successfully executing an attack like that would be very difficult in practice. You wouldn’t necessarily know when the shared object’s code is being executed and when it is actually safe to update it. You don’t want the processor to see a half-updated instruction, or to execute one new instruction followed by an old one, and crash. Modifying the shared object in that way would also remove existing functionality that might be relied upon. Since the library has already been mapped into the process’ address space, it occupies a fixed number of pages, and you can’t expand that by just writing to the backing file. So there are only so many bytes to work with, and you might not be able to find a new location for the overwritten instructions.

Injecting a New Initializer

While thinking about this problem, I figured that a nicer way of using the ability to overwrite a shared object would be to add a new initializer function that gets run when the shared object is loaded.

For anyone who doesn’t know about the internals of shared objects, there are some number of initialization functions that get run when a shared object is loaded. These functions generally initialize static data structures so that they can be used as soon as a function is called. They are important for libraries like libc, which uses them to do things like initialize the heap so that malloc doesn’t need to worry about that the first time it is called.

A pointer to each initialization function is placed in a special section, and the dynamic linker loops through that section when loading a library, calling each function in turn. On an x86_64 GNU-based system, that section is called .init_array. The linker builds it from the pointers to every function that needs to run once the library is loaded.
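From C, the usual way to get a function into .init_array is the constructor attribute supported by GCC and Clang. A minimal sketch, with made-up function names, looks like this:

```c
/* Set by the initializer below, before main() gets control. */
static int initializer_ran = 0;

/* GCC and Clang emit a pointer to this function into .init_array, so the
 * startup code (or the dynamic linker, for a shared object) calls it. */
__attribute__((constructor))
static void example_initializer(void) {
    initializer_ran = 1;
}

/* Reports whether the initializer has already been called. */
int initializer_has_run(void) {
    return initializer_ran;
}
```

Compiling this into a shared object and running readelf -S on the result should show the .init_array section that holds the pointer.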

Using lief, I was able to modify a shared library and inject a new constructor that is executed on load. lief is a really easy-to-use Python library for analyzing and modifying ELF binaries and other executable formats. With it, you can add a new segment to an existing shared object, overwrite an existing initialization function pointer so that it points to the new function, and then have the new function call the original one.

I played around with adding a new entry to the .init_array section, but that did not go well. That part of the ELF binary is a section and not a segment, so it is just one specific region inside a load segment. Adding data to it would require moving all of the other data, references, and relocations around in the rest of the segment, which can’t really be done portably or in a way that’s guaranteed to work. Instead, it’s best to overwrite one of the existing entries, so you don’t have to worry about fixing up the rest of the segment.
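To see the section-inside-a-segment layout for a given library, the headers can be read directly with the structures from elf.h. This is only a sketch, separate from the lief-based approach: it assumes a 64-bit ELF file in the host's byte order, and the struct and function names are made up for illustration.

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type describing where .init_array lives in the file. */
struct init_array_info {
    unsigned long offset;  /* file offset of the section */
    unsigned long entries; /* number of function pointers it holds */
    int load_segment;      /* index of the PT_LOAD header containing it, or -1 */
};

/* Read a whole file into a malloc'd buffer; returns NULL on failure. */
static unsigned char *read_file(const char *path, size_t *size) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    unsigned char *data = NULL;
    size_t used = 0, cap = 0, n;
    do {
        if (used == cap) {
            cap = cap ? cap * 2 : 65536;
            unsigned char *grown = realloc(data, cap);
            if (!grown) { free(data); fclose(f); return NULL; }
            data = grown;
        }
        n = fread(data + used, 1, cap - used, f);
        used += n;
    } while (n > 0);
    fclose(f);
    *size = used;
    return data;
}

/* Returns 0 if .init_array was found, 1 if it is absent, -1 on error. */
int find_init_array(const char *path, struct init_array_info *out) {
    static const char wanted[] = ".init_array";
    Elf64_Ehdr eh;
    Elf64_Shdr strtab, sh;
    Elf64_Phdr ph;
    size_t size;
    int rc = -1;
    unsigned char *data = read_file(path, &size);
    if (!data) return -1;

    if (size < sizeof eh) goto done;
    memcpy(&eh, data, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        goto done;
    /* The section header table must lie inside the file. */
    if (eh.e_shentsize != sizeof sh || eh.e_shoff > size ||
        eh.e_shnum > (size - eh.e_shoff) / sizeof sh ||
        eh.e_shstrndx >= eh.e_shnum)
        goto done;
    memcpy(&strtab, data + eh.e_shoff + eh.e_shstrndx * sizeof strtab,
           sizeof strtab);
    if (strtab.sh_offset > size || strtab.sh_size > size - strtab.sh_offset)
        goto done;

    rc = 1;
    for (unsigned i = 0; i < eh.e_shnum; i++) {
        memcpy(&sh, data + eh.e_shoff + i * sizeof sh, sizeof sh);
        /* Compare the name including its terminating NUL byte. */
        if (sh.sh_name >= strtab.sh_size ||
            strtab.sh_size - sh.sh_name < sizeof wanted ||
            memcmp(data + strtab.sh_offset + sh.sh_name, wanted,
                   sizeof wanted) != 0)
            continue;
        out->offset = sh.sh_offset;
        out->entries = sh.sh_size / sizeof(Elf64_Addr);
        out->load_segment = -1;
        /* Look for the load segment whose file range covers the section. */
        if (eh.e_phentsize == sizeof ph && eh.e_phoff <= size &&
            eh.e_phnum <= (size - eh.e_phoff) / sizeof ph) {
            for (unsigned j = 0; j < eh.e_phnum; j++) {
                memcpy(&ph, data + eh.e_phoff + j * sizeof ph, sizeof ph);
                if (ph.p_type == PT_LOAD && ph.p_offset <= sh.sh_offset &&
                    sh.sh_offset + sh.sh_size <= ph.p_offset + ph.p_filesz)
                    out->load_segment = (int)j;
            }
        }
        rc = 0;
        break;
    }
done:
    free(data);
    return rc;
}
```

Pointing find_init_array at a library such as the /tmp/libtest.so built earlier gives the file offset and entry count of the existing slots. Those slots are the ones available for overwriting without shifting anything else in the segment.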

One thing to keep in mind when writing the actual injection payload is that the code may be running before anything in the process has been properly set up. Especially if you’re injecting into libc, some of the functionality in libc may rely on the initialization functions having run already. Since our code runs before all of those, it’s dangerous to use any libc functionality. The only things guaranteed to work are code you’ve brought yourself and raw syscalls. You also need to make sure that all of your code is relocatable. A shared object can get loaded at any address, so no absolute addresses can be used and everything needs to be relative to where the code is actually loaded. That’s pretty easy to do on x86_64 with its PC-relative addressing, but a bit more difficult on 32-bit x86, which doesn’t have that functionality.
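As an illustration of the "raw syscalls only" rule, here is a minimal sketch of a write wrapper that goes straight to the kernel without calling into libc. It assumes Linux on x86_64 or aarch64 and a GCC or Clang toolchain. The names raw_syscall3, raw_write, and injected_init are made up for this example and are not taken from the implementation linked below.

```c
#include <sys/syscall.h> /* only for the SYS_* number constants */

/* Issue a three-argument system call directly, bypassing libc. */
static long raw_syscall3(long nr, long a1, long a2, long a3) {
#if defined(__x86_64__)
    long ret;
    /* The syscall instruction clobbers rcx and r11. */
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
                     : "rcx", "r11", "memory");
    return ret;
#elif defined(__aarch64__)
    register long x8 __asm__("x8") = nr;
    register long x0 __asm__("x0") = a1;
    register long x1 __asm__("x1") = a2;
    register long x2 __asm__("x2") = a3;
    __asm__ volatile("svc #0"
                     : "+r"(x0)
                     : "r"(x8), "r"(x1), "r"(x2)
                     : "memory");
    return x0;
#else
#error "this sketch only covers x86_64 and aarch64 Linux"
#endif
}

/* Returns the number of bytes written, or a negative errno value.
 * Unlike the libc wrapper, this never touches the errno variable. */
long raw_write(int fd, const void *buf, unsigned long len) {
    return raw_syscall3(SYS_write, fd, (long)buf, (long)len);
}

/* Hypothetical payload entry point: reports that it ran, using only
 * code that it brought along itself. */
void injected_init(void) {
    static const char msg[] = "injected initializer ran\n";
    raw_write(2, msg, sizeof msg - 1);
}
```

When this is built with -fPIC, the compiler addresses msg relative to the program counter on x86_64, so the code does not depend on where it ends up being loaded.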

Check out my basic implementation of the above strategy here.