
ELF FILE – CHAPTER 1: THE FILE FORMAT AND SOME TOOLS FOR ANALYSING

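Hi, my friends. Today, I present the ELF (Executable and Linking Format) file format and introduce some tools for analysing it. There are three main types of ELF object file: relocatable files, executable files, and shared object files. On many distributions gcc builds position-independent executables by default, which readelf reports as type DYN; use gcc -no-pie to make a conventional ELF executable file (type EXEC). An ELF file usually has four main components: the ELF header, the program header table, the sections, and the section header table.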

File Format

1. ELF header:

The ELF header is located at the beginning of the file. The first field in the ELF header struct is e_ident, which marks the file as an ELF file and provides machine-independent data with which to decode and interpret the file’s contents. The e_entry field gives the virtual address to which the system first transfers control (the entry point). The e_type field identifies the object file type. Type ET_CORE is reserved to mark core files, whose contents are unspecified. Values from ET_LOPROC through ET_HIPROC are reserved for processor-specific semantics.

In addition, the ELF header tells us where to find the other components (the program header table and the section header table). To view a file’s ELF header, we can use the command readelf -h.
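
To make these fields concrete, here is a minimal Python sketch that decodes an ELF header by hand. It is only an illustration under assumptions: the names parse_elf_header and describe are my own, the field layout follows the standard Elf32_Ehdr/Elf64_Ehdr structs, and the sample header is built by hand rather than read from a real binary.

```python
import struct

ET_NAMES = {0: "ET_NONE", 1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}

FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry",
          "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
          "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx")


def parse_elf_header(data):
    """Decode the ELF header found at the start of `data` into a dict."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("bad magic: e_ident does not mark this as ELF")
    ei_class, ei_data = data[4], data[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported EI_CLASS or EI_DATA value")
    endian = "<" if ei_data == 1 else ">"   # ELFDATA2LSB / ELFDATA2MSB
    addr = "I" if ei_class == 1 else "Q"    # ELFCLASS32 / ELFCLASS64
    # e_entry, e_phoff and e_shoff are address-sized; the rest is fixed-size.
    fmt = endian + "16sHHI" + addr * 3 + "IHHHHHH"
    return dict(zip(FIELDS, struct.unpack_from(fmt, data, 0)))


def describe(hdr):
    """Format the fields discussed in the text as one line."""
    kind = ET_NAMES.get(hdr["e_type"], hex(hdr["e_type"]))
    return "type=%s entry=%#x phoff=%d shoff=%d" % (
        kind, hdr["e_entry"], hdr["e_phoff"], hdr["e_shoff"])


# Hypothetical 64-bit little-endian header, built by hand for the demo.
sample = struct.pack("<16sHHIQQQIHHHHHH", b"\x7fELF\x02\x01\x01",
                     2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 1, 64, 0, 0)
print(describe(parse_elf_header(sample)))
```

For a real file, the first 64 bytes read in binary mode can be passed to parse_elf_header, and the result should agree with what readelf -h prints.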

2. Program header table:

The program header table is an array of structures, each describing a segment or other information the system needs to prepare the program for execution. It is located through the offset stored in the e_phoff field of the ELF header. The p_vaddr member gives the virtual address at which the first byte of the segment resides in memory. On systems for which physical addressing is relevant, p_paddr is reserved for the segment’s physical address; because System V ignores physical addressing for application programs, this member has unspecified contents for executable files and shared objects. Loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size. The p_align member gives the value to which the segments are aligned in memory and in the file. Values 0 and 1 mean no alignment is required. Otherwise, p_align should be a positive, integral power of 2, and p_vaddr should equal p_offset, modulo p_align. Program headers primarily describe the layout of a program when it is loaded and executing in memory. We can use the readelf -l command to view a file’s program header table.
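
The alignment rule is easier to see in code. The following Python sketch decodes one program header entry and checks the p_align rule. It assumes a 64-bit little-endian file (Elf64_Phdr layout); the names parse_phdr64 and alignment_ok are my own, and the segment values are made up for the demo.

```python
import struct

PT_LOAD = 1
# Elf64_Phdr, little-endian: two 4-byte words, then six 8-byte words (56 bytes).
PHDR64 = struct.Struct("<IIQQQQQQ")
PHDR_FIELDS = ("p_type", "p_flags", "p_offset", "p_vaddr",
               "p_paddr", "p_filesz", "p_memsz", "p_align")


def parse_phdr64(data, offset=0):
    """Decode one program header entry starting at `offset` in `data`."""
    return dict(zip(PHDR_FIELDS, PHDR64.unpack_from(data, offset)))


def alignment_ok(ph):
    """Check the p_align rule: 0 or 1 means no alignment is required,
    otherwise p_align must be a power of 2 and p_vaddr must equal
    p_offset modulo p_align."""
    align = ph["p_align"]
    if align in (0, 1):
        return True
    if align & (align - 1):          # more than one bit set: not a power of 2
        return False
    return ph["p_vaddr"] % align == ph["p_offset"] % align


# Hypothetical loadable code segment (flags 5 = read + execute).
raw = PHDR64.pack(PT_LOAD, 5, 0x1000, 0x401000, 0x401000, 0x1F5, 0x1F5, 0x1000)
ph = parse_phdr64(raw)
print(hex(ph["p_vaddr"]), alignment_ok(ph))
```

In a real file, entry i of the table starts at e_phoff + i * e_phentsize, with both values taken from the ELF header.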

3. Sections:

A section is different from a segment. Segments are necessary for program execution and are aligned in memory. Each segment contains code or data that is divided up into sections, and each section contains either code or data of some type. Next, I will present some special sections:
* The .text section: is a code section that contains program code instructions.
* The .rodata section: contains read-only data such as strings from a line of C code.
* The .plt section: contains code necessary for the dynamic linker to call functions that are imported from shared libraries.
* The .data section: will exist within the data segment and contain data such as initialized global variables.
* The .bss section: contains uninitialized global data.
* The .got section: contains the global offset table.
* The .dynsym section: contains dynamic symbol information imported from shared libraries.
* The .dynstr section: contains the string table for dynamic symbols that have the name of each symbol in a series of null-terminated strings.
* The .rel.* sections: contain information about how parts of an ELF object or process image need to be fixed up or modified at link time or at runtime.
* The .hash section: contains a hash table for symbol lookup.
* The .symtab section: contains symbol information of type ElfN_Sym.
* The .strtab section: contains the symbol string table that is referenced by the st_name entries within the ElfN_Sym structs of .symtab.
* The .shstrtab section: contains the section header string table.
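
The link between .symtab and .strtab can be sketched in a few lines of Python. The st_name field is a byte offset into the string table, and the name runs from there to the next null byte. The code assumes the 64-bit little-endian Elf64_Sym layout; the helper names and the table contents are made up for the demo.

```python
import struct

# Elf64_Sym, little-endian:
# st_name, st_info, st_other, st_shndx, st_value, st_size (24 bytes).
SYM64 = struct.Struct("<IBBHQQ")


def strtab_name(strtab, offset):
    """Return the null-terminated string that starts at `offset`."""
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode("ascii", "replace")


def symbol_names(symtab, strtab):
    """Yield (name, st_value, st_size) for each Elf64_Sym in `symtab`."""
    for off in range(0, len(symtab), SYM64.size):
        st_name, _info, _other, _shndx, st_value, st_size = \
            SYM64.unpack_from(symtab, off)
        yield strtab_name(strtab, st_name), st_value, st_size


# Hypothetical tables: a string table always starts with a null byte.
strtab = b"\0main\0printf\0"
symtab = (SYM64.pack(0, 0, 0, 0, 0, 0) +             # reserved null symbol
          SYM64.pack(1, 0x12, 0, 1, 0x401126, 22) +  # "main", global function
          SYM64.pack(6, 0x12, 0, 0, 0, 0))           # "printf", undefined
for name, value, size in symbol_names(symtab, strtab):
    print("%-8s value=%#x size=%d" % (name, value, size))
```

The same lookup works for .dynsym with .dynstr, and for section names with .shstrtab.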

4. Section header table:

The section header table is an array of structures, each describing one of the sections listed above. It exists to reference the location and size of these sections and is primarily for linking and debugging purposes. Section headers are not necessary for program execution, and a program will execute without a section header table. The readelf -S command shows the file’s section header table, and the readelf -l command shows which sections are mapped to which segments.
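
As a rough illustration of how a tool like readelf -S finds the sections, the Python sketch below walks a section header table and resolves each name through .shstrtab. It assumes a 64-bit little-endian file (Elf64_Ehdr and Elf64_Shdr layouts); list_sections and build_demo_image are my own names, and the tiny image is fabricated for the demo, not a runnable program.

```python
import struct

# Elf64_Shdr, little-endian: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
# sh_size, sh_link, sh_info, sh_addralign, sh_entsize (64 bytes).
SHDR64 = struct.Struct("<IIQQQQIIQQ")


def list_sections(data):
    """Return (name, sh_offset, sh_size) for every section header."""
    # In Elf64_Ehdr, e_shoff sits at byte 0x28 and
    # e_shentsize, e_shnum, e_shstrndx sit at bytes 0x3A, 0x3C, 0x3E.
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 0x3A)
    headers = [SHDR64.unpack_from(data, e_shoff + i * e_shentsize)
               for i in range(e_shnum)]
    # The section at index e_shstrndx holds the section name strings.
    str_off, str_size = headers[e_shstrndx][4], headers[e_shstrndx][5]
    names = data[str_off:str_off + str_size]
    result = []
    for sh in headers:
        end = names.index(b"\0", sh[0])          # sh[0] is sh_name
        result.append((names[sh[0]:end].decode(), sh[4], sh[5]))
    return result


def build_demo_image():
    """Build a hypothetical image: header, .text, .shstrtab, 3 headers."""
    shstrtab = b"\0.text\0.shstrtab\0"
    text = b"\xc3"                               # one byte standing in for code
    text_off = 64                                # right after the ELF header
    str_off = text_off + len(text)
    shoff = str_off + len(shstrtab)
    ehdr = struct.pack("<16sHHIQQQIHHHHHH", b"\x7fELF\x02\x01\x01",
                       1, 62, 1, 0, 0, shoff, 0, 64, 0, 0, 64, 3, 2)
    shdrs = (SHDR64.pack(0, 0, 0, 0, 0, 0, 0, 0, 0, 0) +
             SHDR64.pack(1, 1, 6, 0, text_off, len(text), 0, 0, 1, 0) +
             SHDR64.pack(7, 3, 0, 0, str_off, len(shstrtab), 0, 0, 1, 0))
    return ehdr + text + shstrtab + shdrs


for name, offset, size in list_sections(build_demo_image()):
    print("%-10s offset=%d size=%d" % (name or "(null)", offset, size))
```

Because the loader never reads this table, a file stripped of its section headers makes this kind of lookup fail, yet the program can still run.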

5. The tools for analysing:

  • 010 Editor:
    This is useful software that provides a GUI for working with ELF files. Unlike traditional hex editors, which only display the raw hex bytes of a file, 010 Editor can also parse a file into a hierarchical structure using a Binary Template.
    Download 010 Editor: https://www.sweetscape.com/010editor/
    Download Templates: https://www.sweetscape.com/010editor/repository/templates/

  • readelf:
    The readelf command is a useful tool that provides all the ELF-specific data needed to gather information about an object before reverse engineering it.

  • ftrace:
    ftrace is similar to ltrace, but it also shows calls to functions within the binary itself.

$ git clone https://github.com/elfmaster/ftrace
$ cd ftrace/
$ gcc ftrace.c -o ftrace

  • ERESI:
    ERESI is a suite of tools for analysing Linux binaries and is capable of code injection. Unfortunately, many of the tools are not kept up to date and aren’t fully compatible with 64-bit Linux. You can install ERESI by following the instructions below.

$ git clone https://github.com/thorkill/eresi
$ sudo apt-get install libpcap-dev libssl-dev
$ ./eresi/configure --enable-32-64
$ make && make install

Conclusion:

In this chapter, I presented the basic format of an ELF binary file and introduced some analysis tools. In the next chapter, we will explore the program loading process and dynamic linking.
