
An analysis of ELF files

During my journey into reverse engineering boot binaries, I tried many ways to disassemble a boot binary. Hackers generally use tools like IDA Pro, but this tool comes at a cost (IDA Pro costs 1000 dollars for a single-user license). A freeware version of IDA Pro is available (for non-commercial use only), but it comes with limited disassembler functionality (for example, ARMv7 and ARMv8 are not supported).
Open source tools like Ghidra (from the NSA) are also available, but I found their functionality quite limited.

So I finally decided to disassemble a flat binary on my own (maybe using the GNU tools). But this is not a straightforward task. When we build a boot binary (say OPTEE, a bootloader, or ATF), an Executable and Linkable Format (ELF) file is created first, and then the objcopy tool produces the flat binary file that finally goes into the ROM. In this series of posts, we will work out from scratch what the significance of ELF is, what its various sections are, why a flat binary rather than the ELF file is flashed into ROM, and finally we will try to disassemble a small boot binary.

So let's start with a normal HELLO WORLD program (say helloworld.c), which we will build using gcc.

#include <stdio.h>

int main()
{
    printf("HELLO WORLD\n");
    return 0;
}

Then we build it using the command gcc helloworld.c -o outfile. The output is an ELF file. The objective of this post is to analyze this ELF file.

So, in this experiment we will be using these tools: readelf, objdump, xxd

An ELF file contains three main types of headers:

  1. File Header/ELF header: Contains top-level information such as the machine the ELF was built for, the endianness of the binary, the class of the binary (ELF64/32), and the file offsets of the program headers and the section headers. The command readelf -h outfile is used to display the ELF header.
  2. Program Header: Tells the system how to create the process. The command readelf -l outfile is used to display the program headers. In my outfile there were 9 program headers. The first one, of type PHDR, describes the program header table itself. Each program header describes one segment. Some are of type LOAD (loadable segments); these tell the system which parts of the file, such as .text, .rodata, .data and .bss, must be mapped into memory when creating the process.

  3. Section Header: When a program is built, the object file contains many sections such as .text, .rodata, .bss, .data, .symtab, etc. The section headers describe each of these sections. One very interesting section that I explored is the .shstrtab section (the section name string table), which contains all the section names in concatenated form, each terminated by a NUL byte. I initially guessed that the Linux kernel checks for particular sections (say .text) while loading the binary, but the kernel actually loads an executable using the program headers; the section headers are mainly used by the linker and by tools such as readelf and objdump. The command readelf -S outfile is used to display the section headers.


For reference, the ELF wiki page gives a very good explanation of ELF files.
This NPTEL YouTube video also explains a lot about ELF headers.
Next, we will perhaps explore reverse engineering a helloworld program using the readelf, objdump and xxd tools.

