ELF FILE – CHAPTER 3: DYNAMIC LINKER AND SOURCE CODE PROTECTION – All things in moderation

ELF FILE – CHAPTER 3: DYNAMIC LINKER AND SOURCE CODE PROTECTION

This post continues my research on the ELF format. Today I will analyze the dynamic linker. Then I will introduce some obfuscation tools for ELF files.

1. DYNAMIC LINKER:

1.1. The auxiliary vector (auxv):

When a program is loaded into memory by the sys_execve() syscall, the executable is mapped in and given a stack (among other things). The stack for that process address space is set up in a very specific way to pass information to the dynamic linker. This particular arrangement of information is known as the auxiliary vector, or auxv. It sits near the bottom of the stack (which is its highest memory address, since the stack grows down on the x86 architecture).

ELF auxiliary vectors are a mechanism for transferring certain kernel-level information to user processes. An example of such information is the pointer to the system call entry point in memory (AT_SYSINFO); this information is dynamic in nature and is only known after the kernel has finished loading.
The information is passed on to user processes by binary loaders, which are part of the kernel subsystem itself, either built into the kernel or provided as a kernel module. Binary loaders convert a binary file, a program, into a process on the system. There is a different loader for each binary format; thankfully there are not many binary formats, and most Linux-based systems now use ELF binaries. The ELF binary loader is defined in the file /usr/src/linux/fs/binfmt_elf.c.
The ELF loader parses the ELF file, maps the various program segments into memory, sets up the entry point and initializes the process stack. It puts the ELF auxiliary vectors on the process stack along with other information such as argc, argv and envp. After initialization, a process's stack holds, from the stack pointer upward: argc, the argv pointers terminated by NULL, the envp pointers terminated by NULL, the auxv entries terminated by an AT_NULL entry, and finally the strings that argv and envp point to.

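The ELF loader puts an array (auxv) of ELF auxiliary vectors at the bottom of the stack. The structure of an auxiliary vector is defined in /usr/include/elf.h as Elf32_auxv_t (Elf64_auxv_t on 64-bit systems): an integer a_type followed by a union a_un whose a_val member holds the value.

a_type defines the entry type and the union a_un defines the entry value. Legal values for a_type are defined in elf.h. To give you an idea, some of them are AT_NULL (0, end of the vector), AT_PHDR (3), AT_PAGESZ (6), AT_BASE (7), AT_ENTRY (9), AT_RANDOM (25), AT_SYSINFO (32) and AT_SYSINFO_EHDR (33).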
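
The entry layout described above can be sketched in Python. This is only an illustration, not code from the original post: it assumes each entry is a pair of native machine words (a_type, a_val) and that the kernel exposes the current process's vector at /proc/self/auxv, which is Linux-specific. The names parse_auxv and AT_NAMES are made up for this example.

```python
import os
import struct

# A few a_type values from <elf.h>.
AT_NAMES = {0: "AT_NULL", 3: "AT_PHDR", 6: "AT_PAGESZ", 7: "AT_BASE",
            9: "AT_ENTRY", 25: "AT_RANDOM", 32: "AT_SYSINFO",
            33: "AT_SYSINFO_EHDR"}


def parse_auxv(raw, word_size=4):
    """Decode raw auxv bytes into a list of (a_type, a_val) pairs.

    Each entry is two native-endian machine words; AT_NULL ends the vector.
    """
    fmt = "=II" if word_size == 4 else "=QQ"
    size = struct.calcsize(fmt)
    usable = len(raw) - len(raw) % size   # ignore a trailing partial entry
    entries = []
    for a_type, a_val in struct.iter_unpack(fmt, raw[:usable]):
        if a_type == 0:                   # AT_NULL terminator
            break
        entries.append((a_type, a_val))
    return entries


if __name__ == "__main__" and os.path.exists("/proc/self/auxv"):
    # Linux only: dump the auxiliary vector of this Python process.
    with open("/proc/self/auxv", "rb") as f:
        raw = f.read()
    for a_type, a_val in parse_auxv(raw, struct.calcsize("P")):
        print(AT_NAMES.get(a_type, a_type), hex(a_val))
```

Running the script twice should show different AT_BASE and AT_RANDOM values when address space layout randomization is enabled.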

The ELF auxiliary vectors passed to a program can be seen by setting the environment variable LD_SHOW_AUXV to 1 when running it.

The getauxval() function retrieves values from the auxiliary vector, a mechanism that the kernel’s ELF binary loader uses to pass certain information to userspace when a program is executed. On success, getauxval() returns the value corresponding to type. If type is not found, 0 is returned.
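
getauxval() is a C library function, but it can also be reached from Python through ctypes. The sketch below is an illustration under assumptions: it assumes a Linux C library that exports getauxval (glibc 2.16 or later does), and the wrapper returns None when the symbol cannot be found. The wrapper itself is hypothetical and not part of any library.

```python
import ctypes
import os

AT_PAGESZ = 6  # a_type value from <elf.h>


def getauxval(a_type):
    """Call the C library's getauxval(); return None if it is unavailable."""
    try:
        libc = ctypes.CDLL(None)      # symbols already loaded in this process
        func = libc.getauxval
    except (OSError, AttributeError, TypeError):
        return None
    func.restype = ctypes.c_ulong
    func.argtypes = [ctypes.c_ulong]
    return func(a_type)


if __name__ == "__main__":
    print("AT_PAGESZ =", getauxval(AT_PAGESZ))
```

On Linux the AT_PAGESZ entry should match the system page size reported by os.sysconf("SC_PAGE_SIZE").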

The entries whose values change on each load (with address space layout randomization enabled) are AT_SYSINFO, AT_SYSINFO_EHDR, AT_BASE and AT_RANDOM.

1.2. The ld-linux.so.* :

Most modern programs are dynamically linked. When a dynamically linked application is loaded by the operating system, it must locate and load the dynamic libraries it needs for execution. On Linux, that job is handled by ld-linux.so.2. You can see the libraries used by a given application with the ldd command, for example on ls.

When ls is loaded, the OS passes control to ld-linux.so.2 instead of to the normal entry point of the application. ld-linux.so.2 searches for and loads the unresolved libraries, and then passes control to the application's starting point.
The ld-linux.so.2 man page gives a high-level overview of the dynamic linker. It is the runtime counterpart of the linker (ld): it locates and loads into memory the dynamic libraries used by the application. Normally the dynamic linker is specified implicitly at link time. The ELF specification provides the functionality for dynamic linking. GCC emits a special ELF program header called INTERP, which has a p_type of PT_INTERP. This header specifies the path to the interpreter.
The ELF specification requires that if a PT_INTERP program header is present, the OS must create a process image from the interpreter's file segments instead of the application's. Control is then passed to the interpreter, which is responsible for loading the dynamic libraries. The spec offers some flexibility in how control may be given. For x86/Linux, the argument passed to the dynamic loader is a pointer to a mmap'd section.
ld-linux.so itself is statically linked and does not have the .interp section that every dynamically linked executable must have. It does not depend on any other libraries, and it is runnable by itself once loaded into memory.
We can see which dynamic linker will be used by checking the program headers (for example with readelf -l).
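
As a rough illustration of what checking the program headers involves, the following Python sketch walks the program header table and returns the path stored in the PT_INTERP segment. It is a minimal reader written for this example, not a replacement for readelf: the function name find_interp is made up, and the field offsets are the standard ELF32/ELF64 header layouts from elf.h.

```python
import struct

PT_INTERP = 3  # p_type value from <elf.h>


def find_interp(image):
    """Return the PT_INTERP path stored in an ELF image, or None."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = image[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if image[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is64:
        e_phoff, = struct.unpack_from(end + "Q", image, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", image, 0x36)
    else:
        e_phoff, = struct.unpack_from(end + "I", image, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", image, 0x2A)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        p_type, = struct.unpack_from(end + "I", image, base)
        if p_type != PT_INTERP:
            continue
        if is64:   # Elf64_Phdr: p_offset at +8, p_filesz at +32
            p_offset, = struct.unpack_from(end + "Q", image, base + 8)
            p_filesz, = struct.unpack_from(end + "Q", image, base + 32)
        else:      # Elf32_Phdr: p_offset at +4, p_filesz at +16
            p_offset, = struct.unpack_from(end + "I", image, base + 4)
            p_filesz, = struct.unpack_from(end + "I", image, base + 16)
        path = image[p_offset:p_offset + p_filesz]
        return path.rstrip(b"\0").decode()
    return None
```

For example, find_interp(open("/bin/ls", "rb").read()) returns the interpreter path of ls on a typical Linux system, and None for a statically linked binary.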

Checking ld-linux.so.2, we are led to /lib32/ld-2.27.so, to which it is a symbolic link. This is the actual dynamic linker.
As noted in the previous section, the AT_BASE entry in the auxiliary vector holds the base address of the interpreter. I load the program and its argument into the edb debugger on Linux with the command:

$ edb --run hw thisisargv

In the Stack window, the address of argv is 0xffb277a8.

Following the stack layout (argc, argv, envp, then auxv), the start address of the auxiliary vector is 0xffb2785c. Since AT_BASE has type 7, the entry with that type gives the base address of the interpreter (ld-2.27.so): 0xf7fcd000.
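
The walk from argc to the auxiliary vector can be written down as a small Python sketch. It is only a model of the manual steps done in the debugger: it takes the stack as a list of machine words starting at argc, and the function names are invented for this example.

```python
def auxv_index(words):
    """Given the stack words starting at argc, return the index of auxv[0].

    Layout: argc, argv[0..argc-1], NULL, envp[...], NULL, auxv pairs.
    """
    argc = words[0]
    i = 1 + argc + 1          # skip argc, the argv pointers and their NULL
    while words[i] != 0:      # skip the envp pointers
        i += 1
    return i + 1              # first word after envp's NULL


def find_auxv_value(words, wanted_type):
    """Scan the (a_type, a_val) pairs for one type; None if absent."""
    i = auxv_index(words)
    while words[i] != 0:      # AT_NULL (0) terminates the vector
        if words[i] == wanted_type:
            return words[i + 1]
        i += 2
    return None
```

With the addresses above, the auxiliary vector starts (0xffb2785c - 0xffb277a8) / 4 = 45 words after argv. Assuming 0xffb277a8 is the address of argv[0] and argc is 2, that leaves room for 41 environment pointers between the two NULL terminators.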

In the Memory Regions window and the Data Dump window, we can see that the dynamic linker (ld-2.27.so) is loaded in memory like any other ELF file, at 0xf7fcd000.

Next, look at the relocation tables of ld-2.27.so (for example with readelf -r).

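The relocation entries are of type R_386_RELATIVE (calculation: B + A, where B is the base address at which ld-2.27.so is loaded and A is the addend). Such relocations need no symbol lookup, so the relocation process is easy to handle: the dynamic linker applies them to itself during startup. This ensures that the dynamic linker can execute at whatever address it is loaded in memory.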
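
The B + A calculation can be illustrated with a short Python sketch. It models the loaded image as a bytearray and assumes REL-style entries, as used on i386, where the addend A is stored at the location being relocated. The function name and the way the table is passed in are choices made for this example only.

```python
import struct

R_386_RELATIVE = 8  # relocation type from <elf.h>


def apply_relative_relocs(memory, rel_entries, base):
    """Apply R_386_RELATIVE entries to a loaded image in place.

    memory      -- bytearray holding the image; index 0 is load address `base`
    rel_entries -- iterable of (r_offset, r_info) pairs from a REL table
    base        -- B, the address the image was actually loaded at
    """
    for r_offset, r_info in rel_entries:
        if r_info & 0xFF != R_386_RELATIVE:      # ELF32_R_TYPE(r_info)
            continue
        # A, the addend, is the 32-bit value already stored at r_offset.
        addend, = struct.unpack_from("<I", memory, r_offset)
        struct.pack_into("<I", memory, r_offset, (base + addend) & 0xFFFFFFFF)
```

With B = 0xf7fcd000 and a stored addend of 0x1234, the relocated word becomes 0xf7fce234.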

2. SOURCE CODE PROTECTION:

Obfuscating code or hiding source code on Linux is a sensitive issue. It seems to contradict the notion of open source on Linux, so information on the subject is very limited. Besides that, anything that can be loaded into a PC can be cracked. The people who do reverse engineering for fun, profit or fame are generally very good at it and will not be the least bit fazed by anything you do to try to stop them. The amount of effort you put in only restricts how many people can reverse it. I will introduce two free tools for doing this: UPX and ELF-Packer.

2.1. UPX tool:

UPX (Ultimate Packer for Executables) is a free and open source executable packer that supports a number of file formats from different operating systems. It achieves an excellent compression ratio and offers fast decompression. Thanks to in-place decompression, your executables suffer no memory overhead or other drawbacks for most of the supported formats.
Besides its main feature, packing, you can use UPX to make reverse engineering harder.
Let's look at the UPX structure:
* Prologue: CMP/JNZ for DLL parameter checks; PUSHAD, set registers; optional NOP alignment.
* Decompression algorithm: either NRV or LZMA.
* Call/jump restoring: UPX transforms relative calls and jumps into absolute ones to improve compression.
* Imports: load libraries, resolve APIs.
* Reset section flags.
* Epilogue: clean the stack, jump to the original entry point.
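
To see why turning relative calls into absolute ones helps compression, here is a simplified Python sketch of such a filter. It is not UPX's actual filter code, which is more elaborate: it only handles the x86 CALL rel32 opcode (0xE8) and treats every 0xE8 byte as a call. Two calls to the same function have different relative operands but the same absolute target, so after filtering they produce identical byte sequences that the compressor can match.

```python
import struct

MASK = 0xFFFFFFFF


def _filter(code, to_absolute):
    out = bytearray(code)
    i = 0
    while i + 5 <= len(out):
        if out[i] == 0xE8:                    # x86 CALL rel32 opcode
            operand, = struct.unpack_from("<I", out, i + 1)
            next_ip = i + 5                   # rel32 is relative to the next instruction
            if to_absolute:
                operand = (operand + next_ip) & MASK
            else:
                operand = (operand - next_ip) & MASK
            struct.pack_into("<I", out, i + 1, operand)
            i += 5                            # skip the operand in both directions
        else:
            i += 1
    return bytes(out)


def calls_to_absolute(code):
    """Packing side: rewrite CALL operands as absolute offsets."""
    return _filter(code, True)


def calls_to_relative(code):
    """Unpacking side (the stub's restoring step): undo the rewrite."""
    return _filter(code, False)
```

Because both directions skip the four operand bytes after each 0xE8, they visit exactly the same positions, so calls_to_relative(calls_to_absolute(code)) gives back the original bytes.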

Install UPX:

$ sudo apt-get update
$ sudo apt-get install upx-ucl

2.2. ELF-Packer:

This project is a very simple polymorphic runtime cryptor for x86_64 ELF binaries on Linux. It is written in Python3. The author wrote it for the extended digital forensics course at his university, so it is only a small tool.
The script will search for a region of nulls in the provided binary that is large enough to fit the assembly stub – which it will place in this region of nulls. The entry point in the ELF header is then changed to the start of this stub, such that the stub is the first thing that is executed when the binary is run. Once the stub has completed execution, there is an absolute jump to the original entry point for the binary to continue ordinary execution.
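
The search for a region of nulls can be sketched in a couple of lines of Python. This is an illustration of the idea only, not the code from ELF-Packer; the function name is made up, and a real packer would also need the region to lie inside a segment that is mapped as executable.

```python
def find_null_region(data, size):
    """Return the offset of the first run of at least `size` zero bytes, or -1."""
    return data.find(b"\x00" * size)
```

For example, find_null_region(binary, len(stub)) gives a candidate offset for a stub of that length, or -1 when the binary has no gap that large.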
The stub is a simple XOR encryptor for the .text section of the binary. It does a byte-by-byte XOR to hinder static disassembly, although this is trivial to bypass as it is only a simple XOR.
The stub however will (by making a new file and some trickery due to the pesky ETXTBSY unix error) modify the executable on every execution to use a new random XOR byte such that the hash of the binary will change on each execution – hence polymorphic.
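
The single-byte XOR and the re-keying step can be modeled in Python as follows. This is a sketch of the technique as described, not the project's assembly stub: the function names are hypothetical, and the start and length of the .text section are assumed to be known already.

```python
import os


def xor_region(data, start, length, key):
    """Return a copy of `data` with bytes [start, start+length) XORed with `key`."""
    out = bytearray(data)
    for i in range(start, start + length):
        out[i] ^= key
    return bytes(out)


def rekey(data, start, length, old_key):
    """Decrypt with old_key, then re-encrypt with a fresh random non-zero key."""
    new_key = old_key
    while new_key in (0, old_key):
        new_key = os.urandom(1)[0]
    plain = xor_region(data, start, length, old_key)
    return xor_region(plain, start, length, new_key), new_key
```

Since XOR is its own inverse, applying xor_region twice with the same key restores the original bytes. With only 255 usable keys, an analyst can simply try them all, which is why the text calls this trivial to bypass.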
There are some limitations that you could improve on:
* Only supports 64-bit binaries.
* Too simple “encryption”.
* Requires a sufficient amount of NULL bytes in the binary (I have to compile binaries with -static to have a sufficient amount).

Install python3-pwntools:

$ apt-get update
$ apt-get install python3 python3-dev python3-pip git
$ pip3 install --upgrade git+https://github.com/arthaud/python3-pwntools.git

To get source code and use ELF-Packer:

$ git clone https://github.com/dzonerzy/ELF-Packer
$ cd ELF-Packer
$ python3 elf_cryptor.py -b 64 -f file

CONCLUSION:

I have analyzed the dynamic linker and presented some tools for protecting source code. If you need support, you can comment below. Thanks!
