
An analysis on ELF files

During my journey towards reverse engineering a boot binary, I tried a lot of ways to disassemble one. Hackers generally use tools like IDA Pro, but this tool comes at a cost (IDA Pro costs about 1000 dollars for a single-user license). A freeware version of IDA Pro is available (for non-commercial use only), but it comes with limited disassembler functionality (for example, ARMv7 and ARMv8 are not supported).
Open-source tools such as Ghidra (from the NSA) are also available, but in my experience their functionality is quite limited.

So, I finally decided to disassemble a flat binary on my own (maybe using the GNU tools). But it is not a straightforward task. When we build a boot binary (let's say OPTEE, a bootloader or ATF), an Executable and Linkable Format (ELF) file is created first, and then the objcopy tool is used to produce the flat binary file that finally goes to the ROM. In this series of posts, we will work out from scratch what the significance of ELF is, what its various sections are, why a flat binary rather than the ELF is flashed in ROM, and finally we will try to disassemble a small boot binary.

So let's start with a normal HELLO WORLD program (let's say helloworld.c), which we will build using gcc.

#include <stdio.h>

int main()
{
    printf("HELLO WORLD\n");
    return 0;
}

Then we build it with the command gcc helloworld.c -o outfile. The build generates an ELF file, outfile. The objective of this post is to analyze this ELF file.
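A quick way to confirm that the build output really is an ELF file is to look at its first four bytes, which are always 0x7f followed by the letters E, L and F (the same bytes that xxd shows at offset 0). The sketch below is only an illustration of that check; the function names has_elf_magic and file_is_elf are made up for this post and are not part of any library.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if buf starts with the 4-byte ELF magic (0x7f 'E' 'L' 'F'), else 0.
 * has_elf_magic is a hypothetical helper name used only for this example. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Read the first bytes of the file at path and check them for the ELF magic.
 * Returns 1 for an ELF file, 0 for anything else, -1 if the file cannot be opened. */
int file_is_elf(const char *path)
{
    unsigned char head[4];
    size_t got;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    got = fread(head, 1, sizeof head, fp);
    fclose(fp);
    return has_elf_magic(head, got);
}
```

Calling file_is_elf("outfile") from a small main() should return 1 for the binary built above, while calling it on helloworld.c should return 0.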

So, in this experiment we will be using these tools: readelf, objdump, xxd

An ELF file contains three main types of headers:

  1. File Header/ELF header: Contains the top-level information such as the machine the ELF is built for, the endianness of the binary, the class of the binary (ELF32/ELF64), and the file offsets of the program header table and the section header table. The command readelf -h outfile prints the ELF header.
  2. Program Header: Tells the system how to create the process image. The command readelf -l outfile prints the program headers. My outfile had 9 program headers. The first one, of type PHDR, describes the program header table itself. Each program header describes one segment. Those of type LOAD (loadable segments) tell the system which parts of the file to map into memory when creating the process; they cover sections such as .text, .rodata, .data and .bss.
     
  3. Section Header: When a program is built, the object file contains many sections such as .text, .rodata, .bss, .data, .symtab, etc. The section header table describes each of these sections. One very interesting section I explored is the section name string table (.shstrtab), which holds all the section names concatenated, each terminated by a NUL byte. I initially guessed that the Linux kernel checks for particular sections (say .text) while loading the binary, but the kernel loads a program from its program headers; the section headers mainly serve the linker and tools such as readelf and debuggers. The command readelf -S outfile prints the section headers.
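To make the ELF header fields listed above more concrete, here is a minimal sketch that prints a few of them from a buffer holding the start of a 64-bit ELF file, using the Elf64_Ehdr structure from the Linux <elf.h> header. It assumes a 64-bit ELF whose byte order matches the host, and the function names (elf_class_name, elf_data_name, print_elf64_summary) are hypothetical names chosen for this post. It is a rough approximation of part of what readelf -h reports, not a replacement for it.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Map the EI_CLASS byte of e_ident to a readable name. */
const char *elf_class_name(unsigned char ei_class)
{
    switch (ei_class) {
    case ELFCLASS32: return "ELF32";
    case ELFCLASS64: return "ELF64";
    default:         return "unknown";
    }
}

/* Map the EI_DATA byte of e_ident to a readable endianness. */
const char *elf_data_name(unsigned char ei_data)
{
    switch (ei_data) {
    case ELFDATA2LSB: return "little endian";
    case ELFDATA2MSB: return "big endian";
    default:          return "unknown";
    }
}

/* Print selected ELF header fields from buf to out.
 * Assumption: the file is a 64-bit ELF with the same byte order as the host,
 * so the raw bytes can be copied straight into an Elf64_Ehdr.
 * Returns 0 on success, -1 if buf is too short or is not a 64-bit ELF. */
int print_elf64_summary(const unsigned char *buf, size_t len, FILE *out)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, buf, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    fprintf(out, "Class:   %s\n", elf_class_name(eh.e_ident[EI_CLASS]));
    fprintf(out, "Data:    %s\n", elf_data_name(eh.e_ident[EI_DATA]));
    fprintf(out, "Machine: %u\n", (unsigned)eh.e_machine);
    fprintf(out, "Entry:   0x%llx\n", (unsigned long long)eh.e_entry);
    fprintf(out, "Program headers: %u entries at file offset %llu\n",
            (unsigned)eh.e_phnum, (unsigned long long)eh.e_phoff);
    fprintf(out, "Section headers: %u entries at file offset %llu\n",
            (unsigned)eh.e_shnum, (unsigned long long)eh.e_shoff);
    fprintf(out, "Section name string table index: %u\n",
            (unsigned)eh.e_shstrndx);
    return 0;
}
```

To try it, read the first sizeof(Elf64_Ehdr) bytes of outfile into a buffer with fread() and pass that buffer to print_elf64_summary() together with stdout; the values should line up with the corresponding lines of readelf -h outfile.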


For reference, the ELF wiki page gives a very good explanation of ELF files.
This NPTEL YouTube video also explains a lot about the ELF headers.
Next, we may explore reverse engineering the helloworld program further using the readelf, objdump and xxd tools.

