ELF format


  • Executable and Linkable Format 101 - Part 1 Sections and Segments
  • Understanding ELF, the Executable and Linkable Format
  • Understanding the ELF File Format
  • Linux ELF Object File Format (and ELF Header Structure) Basics

    Figure 1: ELF header

    This is the first stage of compilation, pre-processing, in which the source code is processed. The following command restricts the compiler to just pre-processing the study source file. Open the pre-processed study file: wherever macros were used, their names have been replaced by the macro values. In this compilation stage, any line that starts with the pre-processor directive character is interpreted as a pre-processing command. These commands have their own language format, which is not discussed in detail here since online sources describe it thoroughly.

    Note: Try to write different macros and conditional compilation commands in the code, then run the provided pre-processing command and see the output in order to understand pre-processing in detail.

    Compilation

    The name of this stage is confusing, as it converts pre-processed code into assembly code. In this stage, all programming symbols are converted to assembly instructions, which are native to the target processor architecture. This is also the stage where inline assembly code in high-level programming languages is processed.

    The following command restricts the compiler to converting the pre-processed study file into assembly. The resulting file also shows the assembly code generated for each line of C code that we have written. Using this, you can understand how assembly instructions and C programming syntax are related.

    Note: Modify the above program and run the above command to see how assembly instructions are generated for different C syntaxes.

    Assembling

    This stage converts assembly instructions to machine code, or object code. This object code consists of the actual machine instructions that run on the target processor. This is the first stage in which an ELF file is generated. The output file is known as an object file and has its own file extension.

    This file cannot be executed directly; it first needs to be linked using the linker, which is the final stage of compilation, to turn our study object file into an executable. To open this file, we can use binary utilities.

    ELF files and binary utilities

    To understand the final stage of compilation, which is linking, and to visualise how ELF files are structured, we need to understand object files, particularly ELF files.

    We were able to generate object files in the previous stage, but we were not able to open them. To open such a file, we need to understand the structure of an object file. The following is the command to read the ELF header of any object file:

    readelf -h study

    The output is seen in Figure 1.

    The object file is Little Endian. The ELF file is built for an x bit machine. These two are important pieces of information present in the ELF header.

    ELF sections

    When a program is compiled, different things are generated after compilation. I have grouped these into rough groups as shown below: binary executable code.

    Understanding ELF, the Executable and Linkable Format

    What is an ELF file?

    ELF is the abbreviation for Executable and Linkable Format and defines the structure for binaries, libraries, and core files. The formal specification allows the operating system to interpret the underlying machine instructions correctly. ELF files are typically the output of a compiler or linker and are a binary format.

    With the right tools, such files can be analyzed and better understood.

    Why learn the details of ELF?

    Before diving into the more technical details, it might be good to explain why an understanding of the ELF format is useful.

    As a starter, it helps to learn the inner workings of our operating system. When something goes wrong, we might better understand what happened or why. Then there is the value of being able to research ELF files, especially after a security breach or the discovery of suspicious files. Last but not least, it gives a better understanding while developing. Even if you program in a high-level language like Golang, you still might benefit from knowing what happens behind the scenes.

    So why learn ELF?

      • Generic understanding of how an operating system works
      • Development of software
      • Digital Forensics and Incident Response (DFIR)
      • Malware research (binary analysis)

    From source to process

    Whatever operating system we run, it needs to translate common functions to the language of the CPU, also known as machine code.

    A function could be something basic like opening a file on disk or showing something on the screen. Instead of talking directly to the CPU, we use a programming language with internal functions. A compiler then translates these functions into object code. This object code is then linked into a full program, using a linker tool. The result is a binary file, which can then be executed on that specific platform and CPU type.

    Before you start

    This blog post will share a lot of commands.

    Better do it on a test machine. If you like to test commands, copy an existing binary and use that. Additionally, we have provided a small C program, which you can compile. After all, trying out is the best way to learn and compare results. We have already seen that ELF files can be used for partial pieces of object code. Another example is shared libraries, or even core dumps.

    Structure

    Due to the extensible design of ELF files, the structure differs per file.

    The ELF header, starting with the magic, provides information about the file. This ELF header is mandatory. It ensures that data is correctly interpreted during linking or execution. To better understand the inner working of an ELF file, it is useful to know how this header information is used. The class value determines the architecture for the file.

    The magic shows a 02, which is translated by the readelf command as an ELF64 file. In other words, an ELF file using the 64-bit architecture. Not surprising, as this particular machine contains a modern CPU.

    Data

    The next part is the data field. This particular value helps to interpret the remaining objects correctly within the file. This is important, as different types of processors deal differently with the incoming instructions and data structures. The effect of LSB becomes visible when using hexdump on a binary file. So nothing interesting to remember. In addition, each of them has its own specifics, or at least minor differences between them.

    This way the operating system and applications both know what to expect and functions are correctly forwarded. These two fields describe what ABI is used and the related version. In this case, the value is 00, which means no specific extension is used. The output shows this as System V.

    Machine

    We can also find the expected machine type, AMD64, in the header, that is, for what specific processor type the file is. Using hexdump we can see the full ELF header and its values. The value 3e is 62 in decimal, which equals AMD64. To get an idea of all machine types, have a look at this ELF header file.

    Type

    The type field tells us what the purpose of the file is. There are a few common file types.

    While you can do a lot with a hexadecimal dump, it makes sense to let tools do the work for you. The dumpelf tool can be helpful in this regard. It shows a formatted output very similar to the ELF header file. Great to learn what fields are used and their typical values. With all these fields clarified, it is time to look at where the real magic happens and move into the next headers! One is to be used by the linker to allow execution (segments).

    The other one is for categorizing instructions and data (sections). So depending on the goal, the related header types are used. When the kernel sees these segments, it uses them to map them into virtual address space, using the mmap(2) system call.

    In other words, it converts predefined instructions into a memory image. If your ELF file is a normal binary, it requires these program headers. It uses these headers, with the underlying data structure, to form a process. This process is similar for shared libraries.

    An overview of program headers in an ELF binary

    We see in this example that there are 9 program headers.

    When looking at it for the first time, it is hard to understand what happens here. One of these segments stores exception handlers. So when something goes wrong, it can use this area to deal correctly with it.

    The stack is a buffer, or scratch place, where items are stored, like local variables. When a function is started, a block is reserved. When the function is finished, it will be marked as free again. If the stack is executable, manipulation of memory could make the process run instructions placed on the stack.

    The scanelf and execstack tools are two examples to show the stack details.

    Simply put, this information tells whether the data is in big endian or little endian format. This helps in parsing the ELF file. This information is stored in terms of bytes. In the absence of a program header table, the value of this member is zero. Note that all the entries are the same size. The size is represented as a number of bytes. Note that the product of e_phnum and e_phentsize gives the total size of the program header table in bytes, and in the same way the product of e_shnum and e_shentsize gives the total size of the section header table in bytes.
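    As an added worked example (not code from any of the articles collected above), the following Python script decodes the header fields discussed here, the magic, class, data, OS/ABI, machine and type bytes, and the e_phnum times e_phentsize and e_shnum times e_shentsize products, from a constructed ELF64 image using only the standard library's struct module. It goes one step beyond the text in two ways: the data field is honoured for both byte orders, so the same parser reads a big-endian header correctly, and the PT_GNU_STACK program header is inspected for the PF_X flag, which is the check scanelf and execstack perform to report an executable stack. The sample image, its entry address and segment sizes are constructed for this example and are not taken from any real binary.

```python
import struct

# Constants named as in <elf.h>
ELFMAG, ELFCLASS64 = b"\x7fELF", 2
ELFDATA2LSB, ELFDATA2MSB, ELFOSABI_SYSV = 1, 2, 0
ET_DYN, EM_X86_64 = 3, 0x3E             # EM_X86_64 is 62 decimal: AMD64
PT_LOAD, PT_GNU_STACK = 1, 0x6474E551
PF_X, PF_W, PF_R = 1, 2, 4
ET_NAMES = {0: "NONE", 1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
EHDR_FIELDS = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
               "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
               "e_shentsize", "e_shnum", "e_shstrndx")
EHDR_FMT = "16sHHIQQQIHHHHHH"   # Elf64_Ehdr layout; byte-order prefix added later
PHDR_FMT = "IIQQQQQQ"           # Elf64_Phdr layout (56 bytes)


def parse_ehdr(data: bytes) -> dict:
    """Decode an ELF64 header, using e_ident[EI_DATA] to pick the byte order."""
    if len(data) < 16 or data[:4] != ELFMAG:
        raise ValueError("not an ELF file (bad magic)")
    if data[4] != ELFCLASS64:
        raise ValueError("only ELF64 (EI_CLASS == 2) is handled here")
    order = {ELFDATA2LSB: "<", ELFDATA2MSB: ">"}.get(data[5])
    if order is None:
        raise ValueError("unknown EI_DATA value %d" % data[5])
    fmt = order + EHDR_FMT
    if len(data) < struct.calcsize(fmt):
        raise ValueError("truncated ELF header")
    values = struct.unpack_from(fmt, data, 0)
    hdr = dict(zip(EHDR_FIELDS, values[1:]))        # values[0] is e_ident itself
    hdr["byte_order"], hdr["osabi"] = order, data[7]  # e_ident[EI_OSABI]: 0 = System V
    # The two products the text points out: total size of each header table.
    hdr["phdr_table_size"] = hdr["e_phnum"] * hdr["e_phentsize"]
    hdr["shdr_table_size"] = hdr["e_shnum"] * hdr["e_shentsize"]
    return hdr


def parse_phdrs(data: bytes, hdr: dict) -> list:
    """Walk the program header table located by e_phoff, e_phentsize and e_phnum."""
    fmt = hdr["byte_order"] + PHDR_FMT
    if hdr["e_phentsize"] != struct.calcsize(fmt):
        raise ValueError("unexpected e_phentsize %d" % hdr["e_phentsize"])
    if hdr["e_phoff"] + hdr["phdr_table_size"] > len(data):
        raise ValueError("program header table runs past the end of the file")
    phdrs = []
    for i in range(hdr["e_phnum"]):
        off = hdr["e_phoff"] + i * hdr["e_phentsize"]
        p_type, p_flags, _, p_vaddr, *_ = struct.unpack_from(fmt, data, off)
        phdrs.append({"p_type": p_type, "p_flags": p_flags, "p_vaddr": p_vaddr})
    return phdrs


def stack_is_executable(phdrs: list):
    """True/False from the PT_GNU_STACK flags; None when that entry is absent
    (the loader then falls back to its own default, which may be executable)."""
    for ph in phdrs:
        if ph["p_type"] == PT_GNU_STACK:
            return bool(ph["p_flags"] & PF_X)
    return None


def describe(hdr: dict) -> str:
    """A few readelf -h style lines for the fields discussed in the text."""
    endian = "little" if hdr["byte_order"] == "<" else "big"
    machine = "Advanced Micro Devices X86-64" if hdr["e_machine"] == EM_X86_64 else hex(hdr["e_machine"])
    osabi = "UNIX - System V" if hdr["osabi"] == ELFOSABI_SYSV else hex(hdr["osabi"])
    return "\n".join([
        "Data:    2's complement, %s endian" % endian,
        "OS/ABI:  %s" % osabi,
        "Type:    %s" % ET_NAMES.get(hdr["e_type"], hex(hdr["e_type"])),
        "Machine: %s" % machine,
        "Program headers: %d entries x %d bytes = %d bytes at offset %d" % (
            hdr["e_phnum"], hdr["e_phentsize"], hdr["phdr_table_size"], hdr["e_phoff"]),
    ])


def build_sample(stack_flags: int, order: str = "<") -> bytes:
    """Constructed ELF64 x86-64 image: header + two program headers, no sections."""
    data_enc = ELFDATA2LSB if order == "<" else ELFDATA2MSB
    e_ident = ELFMAG + bytes([ELFCLASS64, data_enc, 1, ELFOSABI_SYSV]) + bytes(8)
    phentsize = struct.calcsize(order + PHDR_FMT)
    ehdr = struct.pack(order + EHDR_FMT, e_ident,
                       ET_DYN, EM_X86_64, 1,   # e_type (a PIE), e_machine, e_version
                       0x1040,                 # e_entry (arbitrary, constructed)
                       64, 0, 0,               # e_phoff (right after the header), e_shoff, e_flags
                       64, phentsize, 2,       # e_ehsize, e_phentsize, e_phnum
                       64, 0, 0)               # e_shentsize, e_shnum, e_shstrndx
    load = struct.pack(order + PHDR_FMT, PT_LOAD, PF_R | PF_X, 0, 0, 0, 0x2000, 0x2000, 0x1000)
    stack = struct.pack(order + PHDR_FMT, PT_GNU_STACK, stack_flags, 0, 0, 0, 0, 0, 0x10)
    return ehdr + load + stack


if __name__ == "__main__":
    for flags in (PF_R | PF_W, PF_R | PF_W | PF_X):
        image = build_sample(flags)
        hdr = parse_ehdr(image)
        print(describe(hdr))
        print("GNU_STACK executable:", stack_is_executable(parse_phdrs(image, hdr)), "\n")
```

    Running the script prints the readelf-style summary twice, once for a sample whose PT_GNU_STACK is RW (the expected result for a modern toolchain) and once for a sample where the segment carries PF_X. To point the same functions at a real file, read it with open(path, "rb").read() and pass the bytes to parse_ehdr and parse_phdrs; the length and e_phentsize checks reject a truncated or inconsistent header instead of indexing past the buffer.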


