
An overview of ARM Memory Management Unit

The scope of this documentation is to understand the Memory Management Unit (MMU) of ARMv8-based processors. The MMU converts a virtual address (in the CPU's logical address space) into a physical address. As an example, consider the following program:

int variable;
printf("Address of variable = %p\n", (void *)&variable);

The printed address could be anything (let's assume 0x40000200). Now 0x40000200 may or may not be the actual address in physical memory (RAM). The physical address could be something else entirely (let's assume 0xA0000200). Thus the CPU produces the logical address 0x40000200, which the Memory Management Unit converts into the physical address 0xA0000200.

Now the question remains: why do we need address translation at all? In other words, why doesn't the above program operate directly on the physical address 0xA0000200?
Suppose a program requires a huge amount of contiguous memory. The RAM may well have that much memory free in total, but not necessarily as one contiguous region. With an MMU, the program accesses a contiguous range of logical addresses, while the physical pages behind that range can be scattered across RAM; the page tables used by the MMU link each page of the contiguous logical range to its physical page.
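The idea of a contiguous logical range backed by scattered physical pages can be sketched in a few lines of C. This is only a toy model, not how the ARMv8 MMU stores its tables: the 4KB page size, the base address VA_BASE, the frame addresses and the name toy_translate are all assumptions made for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u                       /* assumed 4KB pages */
#define PAGE_MASK  ((1ull << PAGE_SHIFT) - 1)
#define NUM_PAGES  4u

/* Hypothetical mapping: virtual page N (counted from VA_BASE) lives in
 * frame_of_page[N]. The frames are deliberately not contiguous. */
static const uint64_t VA_BASE = 0x40000000ull;
static const uint64_t frame_of_page[NUM_PAGES] = {
    0xA0000000ull, 0x9FFF3000ull, 0xB1204000ull, 0xA0007000ull
};

/* Returns the physical address for va, or 0 if va is outside the toy mapping. */
uint64_t toy_translate(uint64_t va)
{
    if (va < VA_BASE)
        return 0;
    uint64_t page = (va - VA_BASE) >> PAGE_SHIFT;
    if (page >= NUM_PAGES)
        return 0;
    /* Page number selects the frame; the offset within the page is unchanged. */
    return frame_of_page[page] | (va & PAGE_MASK);
}
```

With this table, toy_translate(0x40000200) gives 0xA0000200, matching the example addresses used earlier, while the next virtual page lands in a completely different frame.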

Now if we draw a simple diagram of how the MMU is connected to RAM, it would look something like:


Thus, whenever the CPU produces a virtual address, the MMU looks up the corresponding physical address in translation tables held in memory. To avoid walking these tables on every access, it first checks the Translation Lookaside Buffer (TLB), a cache of recently accessed page translations in the MMU.
Each TLB entry typically contains not just physical and Virtual Addresses, but also attributes such as memory type, cache policies, access permissions, the Address Space ID (ASID), and the Virtual Machine ID (VMID). If the TLB does not contain a valid translation for the Virtual Address issued by the processor, known as a TLB miss, an external translation table walk or lookup is performed. Dedicated hardware within the MMU enables it to read the translation tables in memory. The newly loaded translation can then be cached in the TLB for possible reuse if the translation table walk does not result in a page fault.
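The hit/miss behaviour described above can be sketched as a small software model. Everything here is an assumption for illustration: the four-entry size, the round-robin replacement, the 4KB pages and the stand-in toy_table_walk function (real hardware walks the translation tables in memory instead). Only the ASID tag is modelled; the other attributes are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 4u
#define PAGE_SHIFT  12u              /* assumed 4KB pages */

struct tlb_entry {
    bool     valid;
    uint16_t asid;                   /* address space the entry belongs to */
    uint64_t vpn;                    /* virtual page number */
    uint64_t pfn;                    /* physical frame number */
};

static struct tlb_entry tlb[TLB_ENTRIES];
static unsigned next_victim;         /* round-robin replacement (arbitrary choice) */
static unsigned walk_count;          /* number of table walks performed */

/* Stand-in for the hardware table walk: a hypothetical fixed-offset mapping. */
static bool toy_table_walk(uint64_t vpn, uint64_t *pfn)
{
    walk_count++;
    if (vpn >= 0x100000u)
        return false;                /* pretend this range is unmapped: page fault */
    *pfn = vpn + 0x60000u;
    return true;
}

bool tlb_translate(uint16_t asid, uint64_t va, uint64_t *pa)
{
    uint64_t vpn = va >> PAGE_SHIFT;
    uint64_t offset = va & ((1ull << PAGE_SHIFT) - 1);

    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].asid == asid && tlb[i].vpn == vpn) {
            *pa = (tlb[i].pfn << PAGE_SHIFT) | offset;   /* TLB hit */
            return true;
        }
    }

    uint64_t pfn;
    if (!toy_table_walk(vpn, &pfn))
        return false;                /* fault: nothing is cached */

    /* TLB miss resolved by the walk: cache the translation for reuse. */
    tlb[next_victim] = (struct tlb_entry){ true, asid, vpn, pfn };
    next_victim = (next_victim + 1) % TLB_ENTRIES;
    *pa = (pfn << PAGE_SHIFT) | offset;
    return true;
}
```

A second access to the same page with the same ASID is served from the cache without another walk, while the same virtual page under a different ASID misses and is walked again.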

ARMv8 MMU Registers

Translation Table Base Register


  • In an ARMv8-based system, the base address of the translation tables in main memory is specified by a special register called the Translation Table Base Register (TTBR0_ELx or TTBR1_EL1).
  • TTBR0 is selected if the upper bits of the VA are all 0s, and TTBR1 is selected if the upper bits of the VA are all 1s.
  • EL2 and EL3 have a TTBR0 but no TTBR1, which means that code at EL2 or EL3 can only use VAs in the range 0x0 to 0x0000FFFF_FFFFFFFF.
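The selection rule in the list above can be written as a small C function. This is a sketch for the EL1 case only: the name select_ttbr and the enum are hypothetical, the parameters t0sz and t1sz stand for the TxSZ fields of the control register described in the next section, and address tagging is ignored. It assumes both values are between 1 and 63.

```c
#include <stdint.h>

enum ttbr_sel { SEL_TTBR0, SEL_TTBR1, SEL_FAULT };

/* Decide which base register translates va.
 * t0sz/t1sz: number of upper VA bits that must be all 0s / all 1s. */
enum ttbr_sel select_ttbr(uint64_t va, unsigned t0sz, unsigned t1sz)
{
    uint64_t top0 = va >> (64u - t0sz);     /* the t0sz uppermost bits */
    uint64_t top1 = va >> (64u - t1sz);     /* the t1sz uppermost bits */

    if (top0 == 0)
        return SEL_TTBR0;                   /* upper bits all 0s */
    if (top1 == (1ull << t1sz) - 1)
        return SEL_TTBR1;                   /* upper bits all 1s */
    return SEL_FAULT;                       /* neither region: translation fault */
}
```

For example, with both sizes set to 16 (a 48-bit VA), 0x0000000040000200 selects TTBR0, 0xFFFF000000001000 selects TTBR1, and 0x0001000000000000 falls in the gap between the two regions.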

Translation Table Control Register

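  • Top Byte Ignore (TBI): with a 48-bit VA, the top 16 bits of a virtual address must be all 0s or all 1s, which means that an address held in a general purpose register must start with either 0x0000 or 0xFFFF. Any attempt to use a different value triggers a fault. When TBI is set, the top byte VA[63:56] is ignored by address translation and can be used by software, for example to hold a tag.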
  • IPS (Intermediate Physical Address Size) field indicates the maximum output address size ('000' = 32 Bits, '101' = 48 Bits).  
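  • Translation Granule (TGx): granule size for the TTBR0 region (TG0, typically user space) or the TTBR1 region (TG1, typically the kernel). The two fields are encoded differently: TG0 uses '00' = 4KB, '01' = 64KB, '10' = 16KB, while TG1 uses '01' = 16KB, '10' = 4KB, '11' = 64KB.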
  • TxSZ (T0SZ/T1SZ): the size offset of the region translated through TTBRx; the region covers 2^(64 - TxSZ) bytes. Together with the granule size, this value determines how many levels of translation are required (for example three or four).
  • SHx - TBD
  • ORGNx - TBD
  • IRGNx - TBD
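As a rough illustration of how software might pick these fields out of a TCR_EL1 value, here is a minimal decoding sketch. The helper names are hypothetical. The bit positions used (T0SZ in bits [5:0], T1SZ in bits [21:16], IPS in bits [34:32], TBI0 in bit 37) follow the TCR_EL1 layout in the Arm architecture reference manual; granule decoding is left out, and IPS encodings above '101' are not handled.

```c
#include <stdint.h>

/* Extract bits [hi:lo] of v. */
static inline uint64_t bits(uint64_t v, unsigned hi, unsigned lo)
{
    return (v >> lo) & ((1ull << (hi - lo + 1)) - 1);
}

unsigned tcr_t0sz(uint64_t tcr) { return (unsigned)bits(tcr, 5, 0); }
unsigned tcr_t1sz(uint64_t tcr) { return (unsigned)bits(tcr, 21, 16); }
unsigned tcr_tbi0(uint64_t tcr) { return (unsigned)bits(tcr, 37, 37); }

/* Output address size in bits, or 0 for an encoding this sketch does not handle. */
unsigned tcr_ips_bits(uint64_t tcr)
{
    static const unsigned size[] = { 32, 36, 40, 42, 44, 48 };
    uint64_t ips = bits(tcr, 34, 32);
    return ips < 6 ? size[ips] : 0;
}

/* Width in bits of the VA region translated through TTBR0. */
unsigned tcr_ttbr0_va_bits(uint64_t tcr) { return 64u - tcr_t0sz(tcr); }
```

For instance, a value with IPS = '101' and T0SZ = 22 decodes to a 48-bit output address size and a 42-bit TTBR0 region.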

Example of a level 4 translation

Important formulas to consider

  1. Granule = LOG2(page_size)
  2. inputSize = 64 - UINT(TCR_ELn.TxSZ)
  3. stride = Granule - 3
  4. level = 4 - ROUNDUP((inputSize - Granule)/stride), which gives the starting level
  5. For the starting level, AddrSelTop = inputSize - 1
  6. For each level, AddrSelBottom = (3 - level) * stride + Granule
  7. On each subsequent level, AddrSelTop becomes the previous level's AddrSelBottom - 1
  8. TBD
    So, in the above diagram, taking the page size as 64KB and TxSZ as 22:
    Granule = log2(64*1024) = 16
    inputSize = 64 - 22 = 42
    stride = 16 - 3 = 13
    level = 4 - ROUNDUP((42 - 16)/13) = 4 - ROUNDUP(26/13) = 4 - 2 = 2
    addrSelBottom = inputSize   // initial value only, so that the first iteration gives addrSelTop = 41
    for (level = 2; level < 4; level++) {
        addrSelTop = addrSelBottom - 1
        addrSelBottom = (3 - level)*13 + 16   // level 2: 13 + 16 = 29, level 3: 0 + 16 = 16
        // use VA[addrSelTop:addrSelBottom] as the index to get the next level table/block entry
    }
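The formulas and the worked example above can be checked with a short C function. This is a sketch of the arithmetic only: the struct and function names are hypothetical, and it assumes a granule/TxSZ combination that needs at most four levels.

```c
#include <stdint.h>

struct level_bits {
    int level;      /* translation level, 0 to 3 */
    int top;        /* addrSelTop: highest VA bit of the index */
    int bottom;     /* addrSelBottom: lowest VA bit of the index */
};

/* Fills out[] with the VA bit range used to index each level's table.
 * page_size_log2 is the Granule value (12, 14 or 16), txsz is TCR_ELn.TxSZ.
 * Returns the number of levels walked; out[] needs room for 4 entries. */
int va_index_ranges(unsigned page_size_log2, unsigned txsz, struct level_bits out[4])
{
    int granule = (int)page_size_log2;
    int input_size = 64 - (int)txsz;
    int stride = granule - 3;                 /* 8-byte entries: 2^stride per table */
    int levels = (input_size - granule + stride - 1) / stride;   /* ROUNDUP */
    int top = input_size - 1;
    int n = 0;

    for (int level = 4 - levels; level <= 3; level++) {
        int bottom = (3 - level) * stride + granule;
        out[n++] = (struct level_bits){ level, top, bottom };
        top = bottom - 1;                     /* next level continues just below */
    }
    return n;
}
```

Called with a Granule of 16 and TxSZ of 22, it reports two levels: level 2 indexed by VA[41:29] and level 3 indexed by VA[28:16]. With a Granule of 12 and TxSZ of 16 it reports four levels, starting at level 0 with VA[47:39].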

    NOTE: In the above diagram, the formula gives addrSelTop = 41 and addrSelBottom = 29 for level 2, and addrSelTop = 28 and addrSelBottom = 16 for level 3. To summarize:

    1. If VA[63:42] are all 1s, TTBR1 is used for the base address of the first page table. When VA[63:42] are all 0s, TTBR0 is used for the base address of the first page table.
    2. The page table contains 8192 64-bit page table entries, and is indexed via VA[41:29]. The MMU reads the pertinent level 2 page table entry from the table.
    3. The MMU checks the level 2 page table entry for validity and whether or not the requested memory access is allowed. Assuming it is valid, the memory access is allowed.
    4. In the above figure, the level 2 page table entry refers to the address of the level 3 page table (it is a table descriptor).
    5. Bits [47:16] are taken from the level 2 page table entry and form the base address of the level 3 page table.
    6. Bits [28:16] of the VA are used to index the level 3 page table entry. The MMU reads the pertinent level 3 page table entry from the table.
    7. The MMU checks the level 3 page table entry for validity and whether or not the requested memory access is allowed. Assuming it is valid, the memory access is allowed.
    8. In the above figure, the level 3 page table entry refers to a 64KB page (it is a page descriptor).
    9. Bits [47:16] are taken from the level 3 page table entry and used to form PA[47:16].
    10. Because we have a 64KB page, VA[15:0] is taken to form PA[15:0].
    11. The full PA[47:0] is returned, along with additional information from the page table entries.
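The eleven steps above can be followed in a toy software walk. This is a sketch under several assumptions: the two tables are plain C arrays, the table "physical addresses" L2_TABLE_PA and L3_TABLE_PA and all function names are hypothetical, permission checks are skipped, and level 2 block entries are treated as unsupported. The descriptor type check uses bits [1:0] = '11', which marks a table descriptor at level 2 and a page descriptor at level 3.

```c
#include <stdbool.h>
#include <stdint.h>

#define ENTRIES      8192u                    /* 2^13 entries per table */
#define INDEX_MASK   0x1FFFull                /* 13 index bits */
#define DESC_TYPE    0x3ull                   /* bits [1:0] = 0b11: table/page */
#define OA_MASK      0x0000FFFFFFFF0000ull    /* bits [47:16] */
#define L2_TABLE_PA  0x80000000ull            /* hypothetical table locations */
#define L3_TABLE_PA  0x80010000ull

static uint64_t l2_table[ENTRIES];
static uint64_t l3_table[ENTRIES];

/* Toy "physical memory": maps a table address to the array holding it. */
static const uint64_t *table_at(uint64_t pa)
{
    if (pa == L2_TABLE_PA) return l2_table;
    if (pa == L3_TABLE_PA) return l3_table;
    return 0;
}

/* Install a mapping for the 64KB page containing va (single L3 table only). */
void toy_map_page(uint64_t va, uint64_t pa)
{
    l2_table[(va >> 29) & INDEX_MASK] = L3_TABLE_PA | DESC_TYPE;
    l3_table[(va >> 16) & INDEX_MASK] = (pa & OA_MASK) | DESC_TYPE;
}

bool toy_walk(uint64_t ttbr, uint64_t va, uint64_t *pa)
{
    const uint64_t *l2 = table_at(ttbr & OA_MASK);
    if (!l2) return false;
    uint64_t d2 = l2[(va >> 29) & INDEX_MASK];        /* index with VA[41:29] */
    if ((d2 & DESC_TYPE) != DESC_TYPE) return false;  /* invalid or block entry */

    const uint64_t *l3 = table_at(d2 & OA_MASK);      /* bits [47:16]: L3 base */
    if (!l3) return false;
    uint64_t d3 = l3[(va >> 16) & INDEX_MASK];        /* index with VA[28:16] */
    if ((d3 & DESC_TYPE) != DESC_TYPE) return false;  /* invalid entry */

    *pa = (d3 & OA_MASK) | (va & 0xFFFFull);          /* PA[47:16] | VA[15:0] */
    return true;
}
```

After toy_map_page(0x40000200, 0xA0000200), a walk starting from L2_TABLE_PA for VA 0x40000200 returns PA 0xA0000200, while a VA in the neighbouring, unmapped 64KB page fails at the level 3 validity check.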

    Presence of EL2

    The virtualization extensions to the ARMv8-A architecture introduce a second stage of translation. When a hypervisor is present in the system, one or more guest operating systems might be present.
    The hypervisor must perform some extra translation steps in a two stage process to share the physical memory system between the different guest operating systems. In the first stage, a Virtual Address (VA) is translated to an Intermediate Physical Address (IPA). This is usually under OS control. A second stage, controlled by the hypervisor, then performs translation of the IPA to the final Physical Address (PA).
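The two stage process can be pictured as two lookups chained together. The sketch below is a toy model only: the 4KB page size, the fixed page mappings inside stage1_guest and stage2_hyp, and all names are assumptions for illustration, and real hardware also applies stage 2 to the stage 1 table walk itself, which is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12u               /* assumed 4KB pages */
#define PAGE_MASK  0xFFFull

/* Stage 1 (guest OS control): VA page -> IPA page. Hypothetical mapping. */
static bool stage1_guest(uint64_t va_page, uint64_t *ipa_page)
{
    if (va_page != 0x40000u)
        return false;                /* guest has mapped only this page */
    *ipa_page = 0x80000u;
    return true;
}

/* Stage 2 (hypervisor control): IPA page -> PA page. Hypothetical mapping
 * that places the guest's "RAM" window at a different physical location. */
static bool stage2_hyp(uint64_t ipa_page, uint64_t *pa_page)
{
    if (ipa_page < 0x80000u || ipa_page >= 0x80100u)
        return false;                /* outside the memory given to this guest */
    *pa_page = ipa_page + 0x20000u;
    return true;
}

bool translate_two_stage(uint64_t va, uint64_t *pa)
{
    uint64_t ipa_page, pa_page;
    if (!stage1_guest(va >> PAGE_SHIFT, &ipa_page))
        return false;                /* stage 1 fault, handled by the guest OS */
    if (!stage2_hyp(ipa_page, &pa_page))
        return false;                /* stage 2 fault, handled by the hypervisor */
    *pa = (pa_page << PAGE_SHIFT) | (va & PAGE_MASK);
    return true;
}
```

Here VA 0x40000200 becomes IPA 0x80000200 in the first stage and PA 0xA0000200 in the second; the guest only ever sees the intermediate address.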

    Secure State EL3_MON

    The Secure monitor EL3 has its own dedicated translation tables. The table base address is specified in TTBR0_EL3 and configured via TCR_EL3. Translation tables are capable of accessing both Secure and Non-secure Physical Addresses. TTBR0_EL3 is used only in Secure monitor EL3 mode, not by the trusted kernel itself. When the transition to Secure world has completed, the trusted kernel uses the EL1 translations, that is, the translation tables pointed to by TTBR0_EL1 and TTBR1_EL1. As these registers are not banked in AArch64, Secure monitor code must configure new tables for the Secure world and save and restore copies of TTBR0_EL1 and TTBR1_EL1.

    The EL1 translation regime behaves differently in Secure state, compared to its normal operation in Non-secure state. The second stage of translation is disabled and the EL1 translation regime is now able to point to both Secure and Non-secure Physical Addresses.
    Entries in the TLB are tagged as Secure or Non-secure, so that no TLB maintenance is ever required when you transition between Secure and Normal worlds.
