14. Memory Management Unit (MMU)

The reader is advised to first read the chapters on the supervisor and hypervisor extensions of the RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.11 to fully comprehend the information in this chapter.

The core has a memory management unit which includes separate instruction and data TLBs (Translation Look-aside Buffers). The TLBs and the Page Table Walk (PTW) modules support the Sv32, Sv39, Sv48 and Sv57 virtualization schemes of the RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.11.

14.1. Virtual Address Translation process

The process by which a given virtual memory address gets translated to its corresponding physical address is called the virtual address translation process. In the presence of the supervisor (S) extension alone, every virtual memory address undergoes a single-stage address translation to reach its corresponding physical address. When the hypervisor extension is present, a virtual address can undergo a two-stage address translation, where in stage 1 the guest virtual address (GVA) gets translated to a guest physical address (GPA), and in stage 2 the GPA gets translated to a supervisor physical address. The two stages are also known as VS-stage and G-stage translation.

Described below is the single-stage address translation process.

The following address translation scheme is taken from section 4.3.2 of the RISC-V Privileged Spec. A virtual address va is translated into a physical address pa as follows:

  1. Let a be satp.ppn × PAGESIZE, and let \(i = LEVELS − 1\). (For Sv32, PAGESIZE= \(2^{12}\) and LEVELS=2.) The satp register must be active, i.e., the effective privilege mode must be S-mode or U-mode.

  2. Let pte be the value of the PTE at address \(a+va.vpn[i]×PTESIZE\). (For Sv32, PTESIZE=4.) If accessing pte violates a PMA or PMP check, raise an access-fault exception corresponding to the original access type.

  3. If \(pte.v = 0\), or if \(pte.r\) = 0 and \(pte.w\) = 1, or if any bits or encoding that are reserved for future standard use are set within pte, stop and raise a page-fault exception corresponding to the original access type.

  4. Otherwise, the PTE is valid. If \(pte.r\) = 1 or \(pte.x\) = 1, go to step 5. Otherwise, this PTE is a pointer to the next level of the page table. Let \(i = i − 1\). If \(i < 0\), stop and raise a page-fault exception corresponding to the original access type. Otherwise, let \(a = pte.ppn × PAGESIZE\) and go to step 2.

  5. A leaf PTE has been found. Determine if the requested memory access is allowed by the pte.r, pte.w, pte.x, and pte.u bits, given the current privilege mode and the value of the SUM and MXR fields of the mstatus register. If not, stop and raise a page-fault exception corresponding to the original access type.

  6. If \(i > 0\) and \(pte.ppn[i − 1 : 0] \neq 0\), this is a misaligned superpage; stop and raise a page-fault exception corresponding to the original access type.

  7. If \(pte.a = 0\), or if the original memory access is a store and \(pte.d = 0\), either raise a page-fault exception corresponding to the original access type, or:

  • If a store to pte would violate a PMA or PMP check, raise an access-fault exception corresponding to the original access type.

  • Perform the following steps atomically:

    • Compare pte to the value of the PTE at address \(a + va.vpn[i] × PTESIZE\).

    • If the values match, set pte.a to 1 and, if the original memory access is a store, also set pte.d to 1.

    • If the comparison fails, return to step 2.

  8. The translation is successful. The translated physical address is given as follows:

  • \(pa.pgoff = va.pgoff\).

  • If \(i > 0\), then this is a superpage translation and \(pa.ppn[i − 1 : 0] = va.vpn[i − 1 : 0]\).

  • \(pa.ppn[LEVELS − 1 : i] = pte.ppn[LEVELS − 1 : i]\).

The above algorithm applies to Sv39, Sv48, and Sv57 with the following appropriate changes:

  • for Sv39 \(PTESIZE=8\) and \(LEVELS=3\)

  • for Sv48 \(PTESIZE=8\) and \(LEVELS=4\)

  • for Sv57 \(PTESIZE=8\) and \(LEVELS=5\)
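The walk described above can be sketched in software as follows. This is an illustrative model, not the hardware implementation: the hypothetical `read_pte` callback stands in for a physical-memory read, and the PMA/PMP checks and the atomic A/D update of step 7 are omitted for brevity.

```python
# Illustrative software model of the Sv32/Sv39/Sv48/Sv57 walk above.
PAGESIZE = 1 << 12

MODES = {  # mode -> (LEVELS, PTESIZE, bits per VPN field)
    "sv32": (2, 4, 10),
    "sv39": (3, 8, 9),
    "sv48": (4, 8, 9),
    "sv57": (5, 8, 9),
}

def walk(va, satp_ppn, mode, read_pte):
    levels, ptesize, vbits = MODES[mode]
    vpn = [(va >> (12 + vbits * i)) & ((1 << vbits) - 1) for i in range(levels)]
    a = satp_ppn * PAGESIZE                      # step 1
    for i in range(levels - 1, -1, -1):
        pte = read_pte(a + vpn[i] * ptesize)     # step 2
        v, r, w, x = pte & 1, (pte >> 1) & 1, (pte >> 2) & 1, (pte >> 3) & 1
        if v == 0 or (r == 0 and w == 1):        # step 3
            raise Exception("page fault")
        ppn = pte >> 10                          # PPN field starts at PTE bit 10
        if r or x:                               # step 5: leaf PTE found
            if i > 0 and ppn & ((1 << (vbits * i)) - 1):
                raise Exception("misaligned superpage")   # step 6
            # step 8: for superpages the low PPN fields come from the VA
            low = va & ((1 << (12 + vbits * i)) - 1)
            high = (ppn >> (vbits * i)) << (12 + vbits * i)
            return high | low
        a = ppn * PAGESIZE                       # step 4: descend one level
    raise Exception("page fault")                # i < 0
```

For example, an Sv39 three-level walk with a 4 KiB leaf simply descends three tables, while a 2 MiB leaf terminates at level 1 and copies vpn[0] into the physical address.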

Any level of PTE may be a leaf PTE, so

  • in addition to 4 KiB pages, Sv32 supports 4 MiB megapages.

  • in addition to 4 KiB pages, Sv39 supports 2 MiB megapages and 1 GiB gigapages.

  • in addition to 4 KiB pages, Sv48 supports 2 MiB megapages, 1 GiB gigapages, and 512 GiB terapages.

  • in addition to 4 KiB pages, Sv57 supports 2 MiB megapages, 1 GiB gigapages, 512 GiB terapages, and 256 TiB petapages.
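The page sizes above follow directly from the level at which the leaf PTE is found: a leaf at level i maps a region of 2^(12 + i × vpn_bits) bytes, where vpn_bits is 10 for Sv32 and 9 for the other modes. A one-line sketch:

```python
# Page size of a leaf PTE found at level i of the walk.
def page_size(vpn_bits, i):
    return 1 << (12 + i * vpn_bits)

# Sv32 (10-bit VPN fields): level-1 leaf is a 4 MiB megapage.
# Sv39/48/57 (9-bit VPN fields): level 1 -> 2 MiB, level 2 -> 1 GiB,
# level 3 -> 512 GiB, level 4 -> 256 TiB.
```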

The Virtual Address, Physical Address and Page Table Entry (PTE) formats for each supported virtualization mode are presented below.

  • Sv32 virtualization mode:

    • virtual address: VPN[1] (bits 31–22, 10 bits) | VPN[0] (bits 21–12, 10 bits) | page offset (bits 11–0, 12 bits)

    • physical address: PPN[1] (bits 33–22, 12 bits) | PPN[0] (bits 21–12, 10 bits) | page offset (bits 11–0, 12 bits)

    • page table entry: PPN[1] (bits 31–20, 12 bits) | PPN[0] (bits 19–10, 10 bits) | RSW (bits 9–8) | D A G U X W R V (bits 7–0, 1 bit each)

  • Sv39 virtualization mode:

    • virtual address: VPN[2] (bits 38–30, 9 bits) | VPN[1] (bits 29–21, 9 bits) | VPN[0] (bits 20–12, 9 bits) | page offset (bits 11–0, 12 bits)

    • physical address: PPN[2] (bits 55–30, 26 bits) | PPN[1] (bits 29–21, 9 bits) | PPN[0] (bits 20–12, 9 bits) | page offset (bits 11–0, 12 bits)

    • page table entry: N (bit 63) | PBMT (bits 62–61) | Reserved (bits 60–54) | PPN[2] (bits 53–28, 26 bits) | PPN[1] (bits 27–19, 9 bits) | PPN[0] (bits 18–10, 9 bits) | RSW (bits 9–8) | D A G U X W R V (bits 7–0)

  • Sv48 virtualization mode:

    • virtual address: VPN[3] (bits 47–39, 9 bits) | VPN[2] (bits 38–30, 9 bits) | VPN[1] (bits 29–21, 9 bits) | VPN[0] (bits 20–12, 9 bits) | page offset (bits 11–0, 12 bits)

    • physical address: PPN[3] (bits 55–39, 17 bits) | PPN[2] (bits 38–30, 9 bits) | PPN[1] (bits 29–21, 9 bits) | PPN[0] (bits 20–12, 9 bits) | page offset (bits 11–0, 12 bits)

    • page table entry: N (bit 63) | PBMT (bits 62–61) | Reserved (bits 60–54) | PPN[3] (bits 53–37, 17 bits) | PPN[2] (bits 36–28, 9 bits) | PPN[1] (bits 27–19, 9 bits) | PPN[0] (bits 18–10, 9 bits) | RSW (bits 9–8) | D A G U X W R V (bits 7–0)

  • Sv57 virtualization mode:

    • virtual address: VPN[4] (bits 56–48, 9 bits) | VPN[3] (bits 47–39, 9 bits) | VPN[2] (bits 38–30, 9 bits) | VPN[1] (bits 29–21, 9 bits) | VPN[0] (bits 20–12, 9 bits) | page offset (bits 11–0, 12 bits)

    • physical address: PPN[4] (bits 55–48, 8 bits) | PPN[3] (bits 47–39, 9 bits) | PPN[2] (bits 38–30, 9 bits) | PPN[1] (bits 29–21, 9 bits) | PPN[0] (bits 20–12, 9 bits) | page offset (bits 11–0, 12 bits)

    • page table entry: N (bit 63) | PBMT (bits 62–61) | Reserved (bits 60–54) | PPN[4] (bits 53–46, 8 bits) | PPN[3] (bits 45–37, 9 bits) | PPN[2] (bits 36–28, 9 bits) | PPN[1] (bits 27–19, 9 bits) | PPN[0] (bits 18–10, 9 bits) | RSW (bits 9–8) | D A G U X W R V (bits 7–0)

In the presence of the hypervisor, the guest physical address space also gets quadrupled, so the virtualization schemes Sv32, Sv39, Sv48 and Sv57 become Sv32x4, Sv39x4, Sv48x4 and Sv57x4. Described below is the translation process for the hypervisor, in which the basic translation process remains the same but some complexity is added for switching the translation process between stages.

Note

For an n-level PTW (page table walk), n is bounded by two factors: first, the virtualization mode under which the PTW is happening, which bounds the maximum value of n; and second, the page size (2 MiB page, 1 GiB page, etc.) that the virtual address belongs to, which determines the level at which the PTW obtains the ppn. For example, in Sv39 mode a walk can have at most 3 levels, but for a virtual address belonging to a 2 MiB page the PTW obtains the ppn at the second level of the walk.

_images/ptw_Address_translation.png

The description below describes the block diagram presented above and assumes that the GVA and GPA belong to a 4 KiB page, the VS (virtualized supervisor) mode is Sv39 and the HS (hypervisor-extended supervisor) mode is Sv39x4. The virtual address generated by the core in VS mode needs to get translated to its physical memory address.

  1. The guest virtual address (GVA) comes from the core when it is in VS mode, and first gets translated into a GPA. This stage of translating the GVA into the GPA is treated as VS-stage or stage 1 translation. The value of the GPA is the concatenation of vssatp.ppn and gva[2].

  2. The GPA is then translated to an HPA, where the LEVELS, VMID (similar to ASID) and root ppn are given by the hgatp CSR, and undergoes a 3-level page table walk (the depth depends on the permission bits associated with the PTE (page table entry)). This stage of converting the GPA into the HPA is treated as G-stage translation, and follows rules similar to the supervisor address translation process described above.

  3. The leaf node in the G-stage translation gives the final HPA. This HPA, which is the translation of the GPA, is created by concatenating the pte.ppn of the leaf node with the page offset, which is the 12 least significant bits of the GPA.

  4. This HPA is the physical address of the GPA in the host machine, so a memory access is done using this address to fetch the PTE the GPA points to. This is called an implicit memory access in the RISC-V documentation for two-stage address translation.

  5. The response of the memory access is a PTE, whose ppn is concatenated with gva[1] to get the next GPA (gpa2).

  6. Step 1 mentioned above is the first level of the VS-stage translation, where part of the GVA (gva[2]) is used to create a GPA which is then translated to an HPA by the G-stage translation of steps 2-4. Step 5, which is part of the VS-stage, produces the next GPA, which gets translated to an HPA in a similar fashion as steps 2-4 with the appropriate inputs mentioned in the block diagram above. This is repeated until we get a leaf node in the VS-stage, where we get the final GPA. After the final GPA is obtained, a final G-stage translation is done to convert it to an HPA, and appropriate values are sent as a response from the PTW to the TLBs. The response mainly consists of the requested GVA whose translation was requested, the GPA along with the permission bits used for the final G-stage translation, and the HPA along with the permission bits obtained at the end of the walk.
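The nesting of the two stages can be sketched as follows. This is an illustrative model under stated assumptions (VS-stage Sv39, 4 KiB leaf pages, faults reduced to exceptions); the hypothetical `g_stage_walk` callback stands in for a complete G-stage walk, and every guest-physical address touched by the VS-stage walk (the implicit accesses) passes through it before memory is read.

```python
# Illustrative model of two-stage (VS-stage + G-stage) translation.
LEVELS, VBITS, PTESIZE = 3, 9, 8   # assumed: VS-stage Sv39

def vpn(addr, i):
    return (addr >> (12 + VBITS * i)) & ((1 << VBITS) - 1)

def two_stage_translate(gva, vsatp_ppn, g_stage_walk, read_mem):
    a = vsatp_ppn << 12                       # VS-stage root (a GPA)
    for i in range(LEVELS - 1, -1, -1):
        gpa = a + vpn(gva, i) * PTESIZE       # VS-stage PTE address (a GPA)
        hpa = g_stage_walk(gpa)               # implicit access: G-stage walk
        pte = read_mem(hpa)
        if pte & 1 == 0:                      # V bit clear
            raise Exception("guest page fault")
        if pte & 0b1010:                      # R or X set: leaf PTE
            final_gpa = ((pte >> 10) << 12) | (gva & 0xFFF)  # 4 KiB leaf assumed
            return g_stage_walk(final_gpa)    # final G-stage translation
        a = (pte >> 10) << 12                 # next VS-stage level
    raise Exception("guest page fault")
```

With an identity G-stage (hgatp mapping GPA to the same HPA) this reduces to the single-stage walk, which makes the nesting easy to check.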

Since the virtual address has two additional bits in the hypervisor (to account for the quadrupling of the address space), below are the formats for each supported virtualization mode. The two additional bits get added at the high end of the VPN.

  • Sv32x4 virtualization mode:

    • virtual address: VPN[1] (bits 33–22, 12 bits) | VPN[0] (bits 21–12, 10 bits) | page offset (bits 11–0, 12 bits)

    • physical address: PPN[1] (bits 33–22, 12 bits) | PPN[0] (bits 21–12, 10 bits) | page offset (bits 11–0, 12 bits)

  • Sv39x4 virtualization mode:

    • virtual address: VPN[2] (bits 40–30, 11 bits) | VPN[1] (bits 29–21, 9 bits) | VPN[0] (bits 20–12, 9 bits) | page offset (bits 11–0, 12 bits)

  • Sv48x4 virtualization mode:

    • virtual address: VPN[3] (bits 48–39, 11 bits) | VPN[2] (bits 38–30, 9 bits) | VPN[1] (bits 29–21, 9 bits) | VPN[0] (bits 20–12, 9 bits) | page offset (bits 11–0, 12 bits)

  • Sv57x4 virtualization mode:

    • virtual address: VPN[4] (bits 57–48, 11 bits) | VPN[3] (bits 47–39, 9 bits) | VPN[2] (bits 38–30, 9 bits) | VPN[1] (bits 29–21, 9 bits) | VPN[0] (bits 20–12, 9 bits) | page offset (bits 11–0, 12 bits)

14.2. The Svnapot extension

In Sv39, Sv48, and Sv57, when a PTE has N=1, the PTE represents a translation that is part of a range of contiguous virtual-to-physical translations with the same values for PTE bits 5–0. Such ranges must be of a naturally aligned power-of-2 (NAPOT) granularity larger than the base page size.

Page table entry encodings when pte.N=1

  i     pte.ppn[i]     Description                  pte.napot bits
  0     x xxxx xxx1    Reserved                     —
  0     x xxxx xx1x    Reserved                     —
  0     x xxxx x1xx    Reserved                     —
  0     x xxxx 1000    64 KiB contiguous region     4
  0     x xxxx 0000    Reserved                     —
  ≥1    x xxxx xxxx    Reserved                     —

NAPOT PTEs behave identically to non-NAPOT PTEs within the address-translation algorithm described above, except that:

  • If the encoding in pte is valid according to Table above, then instead of returning the original value of pte, implicit reads of a NAPOT PTE return a copy of pte in which pte.ppn[pte.napot bits − 1 : 0] is replaced by vpn[i][pte.napot bits − 1 : 0]. If the encoding in pte is reserved according to Table, then a page-fault exception must be raised.

  • Implicit reads of NAPOT page table entries may create address-translation cache entries mapping a + va.vpn[j] × PTESIZE to a copy of pte in which pte.ppn[pte.napot bits − 1 : 0] is replaced by vpn[0][pte.napot bits − 1 : 0], for any or all j such that j[8 : napot bits] = i[8 : napot bits], all for the address space identified in satp as loaded by step 0.
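The PPN substitution above can be sketched as follows. This is an illustrative model assuming Sv39-style 44-bit PPNs and only the 64 KiB NAPOT encoding (napot bits = 4); the function name and fault handling are hypothetical.

```python
# Sketch of the Svnapot adjustment: for a valid NAPOT leaf (pte.N = 1,
# ppn low nibble = 0b1000), implicit reads return a copy of the PTE in
# which the low napot_bits of the PPN are replaced by the same bits of
# the VPN; any other ppn low-bit pattern with N = 1 is reserved.
N_BIT = 63
PPN_MASK = (1 << 44) - 1

def napot_adjust(pte, vpn0):
    if (pte >> N_BIT) & 1 == 0:
        return pte                          # not a NAPOT PTE: unchanged
    ppn = (pte >> 10) & PPN_MASK
    if ppn & 0xF != 0b1000:                 # only the 64 KiB encoding is valid
        raise Exception("page fault: reserved NAPOT encoding")
    napot_bits = 4
    mask = (1 << napot_bits) - 1
    ppn = (ppn & ~mask) | (vpn0 & mask)     # splice vpn low bits into ppn
    return (pte & ~(PPN_MASK << 10)) | (ppn << 10)
```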

14.3. TLBs (Translation Look-aside Buffers)

TLBs can be considered as caches of the page-table entries residing in memory. They basically store the translation of a virtual memory address to a physical memory address. The core implements different TLBs for instruction fetch addresses (referred to as ITLB) and data load/store addresses (referred to as DTLB). The configurator allows configuring each TLB separately, and 2 major choices of implementation are available for each: set-associative architectures and fully-associative architectures. The configurator defaults to choosing a fully-associative architecture with 1 entry for both the ITLB and DTLB. Below is a brief summary of the configuration parameters, which can be found in configure_s_extension as well.

  • Associativity : It can be either set_associative or fully_associative

  • Size of the TLBs :

    • For fully associative, the parameter tlb_size sets the number of entries the TLB can have

    • For set associative, a dictionary is used to configure the sets and ways for a given TLB of a particular page size

  • Replacement algo :

    • For fully associative the parameter replacement is used to set the replacement algorithm.

    • For set associative replacement field, inside the dictionary is used to configure the tlb with the replacement algorithm.

    • For this field, below are the legal set of values associated to the different replacement algorithm

      • 0: Random replacement algorithm

      • 1: Round Robin replacement algorithm

      • 2: Pseudo LRU
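As an illustration of the parameters listed above, a configuration might look like the following sketch. The key names and nesting are hypothetical (the exact shape is defined by configure_s_extension); only the parameter names themselves come from the list above.

```python
# Hypothetical illustration of the TLB configuration parameters above.
itlb_config = {
    "associativity": "fully_associative",
    "tlb_size": 4,        # number of entries in the fully-associative ITLB
    "replacement": 0,     # 0: random (currently the only supported option)
}

dtlb_config = {
    "associativity": "set_associative",
    "pages": {            # per-page-size sets/ways/replacement dictionary
        "4kb": {"sets": 16, "ways": 4, "replacement": 0},
        "2mb": {"sets": 2,  "ways": 2, "replacement": 0},
        "1gb": {"sets": 1,  "ways": 1, "replacement": 0},
    },
}
```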

Note

Currently the replacement field inside the TLBs can only be configured with the random replacement algorithm. Other algorithms will be supported in later versions.

Warning

Note that presently the minimum number that the tlb_size, sets and ways fields can take is 1, i.e. all the supported page sizes for a given virtualization mode have to be instantiated.

Using the above set of parameters, the configurator will further generate the following set of design time BSV macros :

  • max_var_pages: indicates the number of page types in the max supported virtualization scheme

  • subvpn: indicates the size in number of bits of each sub vpn (virtual page number)

  • lastppnsize: indicates the size in number of bits of the last (msb) physical page number

  • maxvaddr: indicates the max size of the virtual address

  • vpnsize: indicates the size of the virtual page numbers within the virtual address

  • svnapot: when 1, indicates svnapot is enabled.

  • simpl_sfence: when defined, indicates that the sfence mechanism is simple which flushes all the TLBs irrespective of the asid and virtual address

  • dtlb_[fa|sa] : indicates that the data TLB is implemented as either fully-associative (fa) or set-associative (sa)

  • dtlb_sets_[4kb,4mb,2mb,1gb,512gb,256tb]: each macro defines the number of sets of each page size that needs to be instantiated based on the max virtualization mode implemented

  • dtlb_ways_[4kb,4mb,2mb,1gb,512gb,256tb]: each macro defines the number of ways of each page size that needs to be instantiated based on the max virtualization mode implemented

  • dtlb_rep_alg_[4kb,4mb,2mb,1gb,512gb,256tb]: each macro defines the replacement algorithm chosen for each page size that needs to be instantiated based on the max virtualization mode implemented

  • itlb_[fa|sa] : same as dtlb_[fa|sa] but applied to instruction TLBs

  • itlb_sets_[4kb,4mb,2mb,1gb,512gb,256tb]: same as above, but applied to instruction TLBs

  • itlb_ways_[4kb,4mb,2mb,1gb,512gb,256tb]: same as above, but applied to instruction TLBs

  • itlb_rep_alg_[4kb,4mb,2mb,1gb,512gb,256tb]: same as above, but applied to instruction TLBs

14.3.1. Working principle

This section provides a brief discussion on the working of the various supported TLB architectures. The aim of the TLB is to cache translations and, when presented with a virtual address, return the corresponding physical address of the memory location. Once a valid translation is available, permission checks as per the above section are performed and, if necessary, a suitable trap is raised and passed back to the caches.

Irrespective of the configuration of the TLBs, all TLBs include the following interface definition:

_images/tlbs-tlb-interface.png

Fig. 14.1 Interface diagram for all TLB architectures

As shown in Fig. 14.1, the TLBs have 4 basic communication ports:

  • Request from Pipeline: This input is provided by the pipeline when a new virtual address needs to be translated to a physical address.

  • Response to Cache: Once the translation is done (successful or unsuccessful) the result is sent back to the respective cache (instruction or data cache as the case maybe) for further processing.

  • Request to PTW: This output is asserted when a miss in the TLB occurs, i.e. the translation of a particular virtual address to the physical address is not available within the TLB and thus need to perform a Page Table Walk. This port basically forwards the original virtual address request from the pipeline to the PTW module.

  • Response from PTW: This input is driven by the PTW module when the page walk is complete for a given virtual address.

The working of the ITLB and DTLB is similar, but they differ in how the response is sent out. When a miss occurs, both the DTLB and the ITLB send a request for a PTW. For the DTLB, after the "walk" finishes, the response first gets updated inside the TLB, i.e. an entry is allocated for it, and the request is then replayed by the DCACHE; this time, since the entry was allocated previously, the TLB responds with a hit and the corresponding translated physical address for that virtual address. In the ITLB, however, as soon as the TLB receives the response from the PTW, it updates the entry inside the TLB and in parallel sends the response to the core with the corresponding translated physical address. This holds provided there are no faults or traps in the response from the PTW.

The next sections will now describe how the fully-associative and set-associative architectures of the TLBs have been implemented.

14.3.1.1. Set-associative TLBs

In a set associative architecture we implement separate TLBs for different page sizes. We refer to each of these as splitTLBs for ease. Because RISC-V's virtualization modes support different page sizes, accessing a single set-associative structure can become challenging since the index used for access changes depending on the page size. To side-step this issue, we implement a splitTLB for each possible page size supported by the maximum virtualization mode implemented by the core. Thus, for Sv39 we have one 4KiB splitTLB, one 2MiB splitTLB and one 1GiB splitTLB to store translations for 4KiB pages, 2MiB pages and 1GiB pages respectively. The block diagram of the overall set-associative TLB is shown in Fig. 14.2.

_images/tlbs-sa-tlb.png

Fig. 14.2 Block diagram of the overall Set-Associative TLB architecture

The entire process of a TLB access is split into 2 cycles. In the first cycle, when a request for translation is received (either from the core pipeline or from the PTW module), the request is forwarded to each splitTLB that is instantiated, which in turn selects the relevant entry of a set from each way and stores them in a register. In the next cycle, the registered output from each splitTLB is checked for a possible hit and for whether the relevant permissions are available for the requested access (as per the policies defined in Section 14.1). If any of the splitTLBs detects a hit, or if the requested access is for a transparent translation, the physical address is generated and responded to the cache, which will further use it for tag-matching and other purposes.

In case of a miss, the original request is sent to the PTW in the second cycle. The PTW, after performing the walk, will respond back to the TLB with either a successful leaf PTE or with a fault. If a leaf PTE is found, then based on the page-size/level of the PTE, the relevant splitTLB is forwarded the response from the PTW to allocate an entry.

Tip

In such splitTLB based architecture, typically TLBs of lower page sizes will have far more entries as compared to TLBs of larger page sizes, hence the varying box sizes of the splitTLBs in the above diagram.

Fig. 14.3 shows the architecture of a single parameterized splitTLB module that is instantiated multiple times in Fig. 14.2. The working of the splitTLB and various operations performed by the set associative TLB architecture are explained in detail in the below sections

_images/tlbs-split-tlb.png

Fig. 14.3 Block diagram of a single split-TLB used in a set-associative architecture for Sv39

14.3.1.1.1. Serving requests

The request to the TLB can be either a load/store request from the core pipeline or a read request from the PTW module. A new request can only be entertained by the TLB when an SFence operation is not in progress. A new request from the core pipeline can only be processed if a page walk for a previous request is not in progress.

Once a request is received 2 actions are performed :

  • The request is enqueued into a pipeline FIFO ff_core_request so that the request can be processed in the next cycle

  • A parallel lookup is initiated for all splitTLBs for the same request.

A new request has the following fields:

  • satp: the current value of the satp CSR is required to capture the current ppn value, virtualization mode and the asid values.

  • sum: this one bit field from mstatus CSR will indicate if the supervisor has permissions to access user pages or not

  • mxr: this one bit field from mstatus CSR will enable/disable loads from pages marked executable or readable.

  • priv: This 2 bit field indicates the privilege mode under which the translation needs to happen.

  • sfence: This one bit field indicates if an sfence operation needs to be carried out.

  • ptwalk_req: This one bit field indicates if the request is from ptwalk or the core pipeline.

  • ptwalk_trap: indicates if the ptwalk encountered a fault while performing the page table walk for a TLB miss.

  • cause: in case of a fault during a PTW, this field indicates the cause of the fault.

  • access: This 2 bit field indicates if the request is a load, store or a fetch operation.

  • address: the xlen-sized virtual address that needs to be translated.

  • rs1addr: the 5-bit rs1 register index used for sfence.

  • rs2addr: the 5-bit rs2 register index used for sfence.

  • rs1: the rs1 register value used for sfence.

  • rs2: the rs2 register value used for sfence.

Once the splitTLBs receive the request from the higher-level module, each splitTLB extracts the set index from the virtual address depending on the page size supported by that splitTLB. For example, a splitTLB having 16 sets and supporting 2MiB pages (in Sv39 mode) will use bits 24 to 21 of the virtual address to extract an entry from each way as shown in Fig. 14.3. An entry from each way of each splitTLB is then latched into a lookup register which is used in the next cycle to perform a tag match within the splitTLB. The splitTLBs also maintain a separate array of valid bits for each entry in the TLBs. This allows quick invalidation, but requires sanitizing the validity of an entry by combining it with the valid bit of the corresponding PTE in the entry.
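The set-index extraction can be sketched as follows: each splitTLB drops the page offset for its own page size and uses the next log2(sets) bits of the virtual address as the index. The function name is illustrative.

```python
# Sketch of per-page-size set indexing in a splitTLB: drop the page
# offset (page_shift bits), then take log2(sets) bits as the index.
def split_tlb_index(va, page_shift, sets):
    return (va >> page_shift) & (sets - 1)

# Sv39 examples: a 2 MiB splitTLB (21-bit page offset) with 16 sets
# indexes with VA bits 24..21; a 4 KiB splitTLB with 16 sets uses
# bits 15..12.
```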

Each entry in the splitTLB has the format shown in Fig. 14.4. The description of each field is as follows:

_images/tlbs-tlb-entry.png

Fig. 14.4 Fields of a TLB Entry

  • permissions: This contains the 10 bits of permissions found in any leaf PTE

  • ppn: The physical page number that will be used to create the final physical address

  • asid: The ASID value under which this pte is valid

  • tag: The upper bits of the virtual page number that will be used during tag match.

Thus, within a splitTLB, the tag-match function first checks if the upper-bits of the requested virtual address match the tag field of the selected entry and further checks if the asid field value matches the current asid value in the satp CSR or if the page is a global page (G bit in the permissions is set). If these conditions are met, then a hit is declared and the TLB entry is passed on to the top level module, where access permissions are checked before generating a physical address and responding to the cache.

At the top level, the responses from all splitTLBs are collected and the splitTLB with the hit entry is selected, for which permissions are checked. In case multiple splitTLBs indicate a hit, the splitTLB with the largest page size is given priority and is thus used to create the physical address. In case none of the ways of a splitTLB indicate a hit, that splitTLB raises its miss signal. If all splitTLBs raise a miss signal, then the original virtual address request is forwarded to the PTW module to find a leaf pte.
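The tag match and top-level hit selection can be sketched as follows. Entry fields follow Fig. 14.4; the function names and the dictionary representation are illustrative, not the hardware implementation.

```python
# Sketch of splitTLB tag matching and top-level hit selection.
def tag_match(entry, vpn_tag, asid):
    global_page = entry["perms"] & (1 << 5)        # G bit of the permissions
    return (entry["tag"] == vpn_tag and
            (entry["asid"] == asid or bool(global_page)))

def select_hit(responses):
    """responses: list of (page_size, hit, entry), one per splitTLB."""
    hits = [(size, e) for size, hit, e in responses if hit]
    if not hits:
        return None                                # miss: forward to the PTW
    return max(hits, key=lambda t: t[0])[1]        # largest page size wins
```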

While the above working is described with respect to the DTLB, the ITLB works in the exact same fashion with the exception that the permission checks are also performed in the first cycle itself.

14.3.1.1.2. Serving response from PTW

When a miss occurs in a TLB (top level module), then the request is forwarded to the PTW module. The PTW is then expected to respond either with a leaf pte or with a page fault. If a leaf pte is found by the PTW, then the pte is allocated to the respective splitTLB based on the page-size of the pte.

Once an entry is allocated, the DTLB expects the PTW module to re-run the original request (explained in detail in Section 14.4) in the consecutive cycle, which should now turn out to be a hit in the TLB. The ITLB, however, differs here, as it directly responds to the cache with the leaf pte and the translated address in the same cycle as the PTW responds.

If the PTW module responds with a page-fault, the DTLB does not allocate an entry in any splitTLB and simply ignores the response. The ITLB also does not allocate the entry, but forwards the response to the cache for further processing.

14.3.1.1.3. SFence operation

The core request can be an sfence request. The user can configure the fence operation either as a simpl_sfence, where a flush of all TLBs is performed as soon as the sfence request is received, or as a complex sfence, where some conditions must be satisfied before a TLB entry is flushed; these are described later in this section.

For the simpl_sfence operation described above, the TLB flushes every TLB entry inside all the different page-size TLBs, without considering any condition, by setting all the entries of the TLB_VALID register to false.

If simpl_sfence is not defined, then the core is configured to have a complex sfence. When the core is configured with a complex sfence, the core passes some more information along with a boolean value indicating whether the request is an sfence or not. The instruction format below shows the different fields of information the core passes, followed by an explanation of each of the fields.

funct7 (bits 31–25) | rs2 (bits 24–20) | rs1 (bits 19–15) | funct3 (bits 14–12) | rd (bits 11–7) | opcode (bits 6–0)

  • rs1: indicates the virtual address that needs to be flushed in the TLBs

  • rs2: indicates the ASID that needs to be flushed in the TLBs

Mentioned below are the conditions and their repercussions for a complex sfence operation.

  • If rs1=x0 and rs2=x0, the fence orders all reads and writes made to any level of the pagetables, for all address spaces. The fence also invalidates all address-translation cache entries, for all address spaces.

  • If rs1=x0 and rs2!=x0, the fence orders all reads and writes made to any level of the page tables, but only for the address space identified by integer register rs2. Accesses to global mappings (see Section 4.3.1) are not ordered. The fence also invalidates all address-translation cache entries matching the address space identified by integer register rs2, except for entries containing global mappings.

  • If rs1 != x0 and rs2=x0, the fence orders only reads and writes made to leaf page table entries corresponding to the virtual address in rs1, for all address spaces. The fence also invalidates all address-translation cache entries that contain leaf page table entries corresponding to the virtual address in rs1, for all address spaces.

  • If rs1 != x0 and rs2 != x0, the fence orders only reads and writes made to leaf page table entries corresponding to the virtual address in rs1, for the address space identified by integer register rs2. Accesses to global mappings are not ordered. The fence also invalidates all address-translation cache entries that contain leaf page table entries corresponding to the virtual address in rs1 and that match the address space identified by integer register rs2, except for entries containing global mappings.
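The invalidation part of the four cases above can be condensed into a single predicate per TLB entry, sketched below. The entry representation and function name are illustrative.

```python
# Sketch of the complex sfence invalidation rules: an entry is flushed
# when both the ASID condition and the address condition apply.
def must_flush(entry, rs1_is_x0, rs2_is_x0, va_vpn, asid):
    """entry: dict with 'vpn', 'asid' and 'global' fields (illustrative)."""
    # rs2=x0 means "all address spaces"; otherwise the ASID must match
    # and global mappings are exempt.
    asid_ok = rs2_is_x0 or (entry["asid"] == asid and not entry["global"])
    # rs1=x0 means "all addresses"; otherwise the VPN must match.
    addr_ok = rs1_is_x0 or entry["vpn"] == va_vpn
    return asid_ok and addr_ok
```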

Note

To invalidate an entry inside the TLB during sfence, or in any other case, the bit of the TLB_VALID register corresponding to the intended entry is deasserted.

To indicate the end of the SFence operation, the register FENCE shown in Fig. 14.4 is set to False, so that the TLB can further take requests from the core/ptwalk.

The operation is performed by considering the fields shown in section 4.2.1 of the RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.11. The TLB strictly follows the conditions mentioned in section 4.2.1, and to flush a TLB entry the bit of the TLB_VALID register corresponding to that entry is set to false.

14.3.1.2. Fully-Associative TLBs (supervisor)

When a virtual address is presented to the TLB, a lookup is performed in the same cycle to check if the corresponding PTE entry exists in the TLB. The result of the lookup is stored in an intermediate register. On the consecutive cycle, in case of a hit (when there is a successful tag match), the PTE is extracted from the TLB entry and permission checks are performed in accordance with the permission bits present in the page table entry. If a permission check fails, then a corresponding page fault exception is raised and indicated to the core pipeline.

In case of a transparent translation or no translation, the TLB responds with the virtual address it was given for translation.

In case of a miss in the TLB, the virtual address is sent to the hardware Page Table Walk (PTW) module to fetch the corresponding page table entry from memory. The PTW then performs multiple memory accesses via the data cache. If any of the memory accesses causes a trap, then the corresponding access fault is raised and indicated to the core pipeline. Once a leaf node is detected, the PTW responds to the requesting TLB with the new PTE. The TLB then proceeds to complete the original operation.

As mentioned for the set-associative TLBs, a PTE can correspond to different page sizes. To deal with this in the fully-associative TLBs, we have introduced a pagemask field inside the TLB entry. The pagemask field is set from the level field in the PTW response, which tells at which level the PTW ended, and a mask is created according to the page size. When a lookup is performed, the pagemask field is used to identify the page size and to create the mixture of ppn and vpn as mentioned in step 8 of the address translation process in Section 14.1.

When a NAPOT PTE is detected, a proper mask is created to replace the required bits of the ppn with the vpn, and the pagemask is updated accordingly inside the TLB entry.

Now we will discuss what a hit or tag match means here. When the TLB gets a virtual address, it performs a lookup for that virtual address and performs a tag match against all of its TLB entries. The tag match process checks the following things:

  • The valid bit is true for the TLB entry

  • ASID matches or global permission bit is set

  • The VPN matches, where the pagemask field masks the requested vpn and this masked vpn is then matched with the vpn field of the TLB entry

If all of the above conditions are true, then it is called a hit in the TLB, or a successful tag match.
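The three conditions can be sketched as a single predicate; the pagemask clears the sub-VPN fields covered by the page so that one compare works for every page size. The entry representation and function name are illustrative.

```python
# Sketch of the fully-associative tag match with a pagemask.  For a
# 2 MiB page in Sv39 the pagemask covers vpn[0] (the low 9 VPN bits),
# so those bits are ignored in the compare.
def fa_hit(entry, req_vpn, req_asid):
    vpn_match = (req_vpn & ~entry["pagemask"]) == (entry["vpn"] & ~entry["pagemask"])
    asid_match = entry["asid"] == req_asid or entry["global"]
    return entry["valid"] and asid_match and vpn_match
```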

The next section shows a detailed summary of all the fields inside a TLB entry.

14.3.1.2.1. TLB fields

The TLB entry mentioned below is for Sv39; the entry size will change in accordance with the virtualization scheme the core is running. Below are the factors affecting the TLB entry size:

  • vpnsize : Number of bits used to represent virtual page number without the page offset(12 bits). For sv39 its 27(39-12).

  • ppnsize : Number of bits used to represent physical page number. For RV64 its 44 and for RV32 its 22.

  • max_varpages : Number of page table levels. For sv39 its 3.

  • subvpn : Number of bits used to represent subvpn, max_varpages and subvpn is used to decide the lenght of pagemask field. For RV64 subvpn is of 9 bits and for RV32 its 10 bits

  • asidwidth : Number of bits used to represent the ASID. For RV64 it is 16 bits and for RV32 it is 9 bits.

[Figure: sv39 TLB entry bit layout — VPN, PAGEMASK (subvpn(9) × max_level(3) bits), ASID (16 bits), PPN (44 bits), and the permission bits V R W X U G A D]

A TLB entry contains the following fields:

  • vpn: The TLB stores the Virtual Page Number (VPN) as part of the TLB entry to identify future lookups for an address translation.

  • pagemask: This field stores the pagemask, which is used to identify the page size of the ppn.

  • asid: This field stores the address space identifier, which facilitates address-translation fences on a per-address-space basis.

  • ppn: This field stores the physical page number associated with the virtual address.

  • permissions: A set of 8 bits that stores the various permissions associated with the page table entry.

bit:    0   1   2   3   4   5   6   7
field:  V   R   W   X   U   G   A   D

Each bit of permissions tells something about the PTE it is associated with:

  • V: indicates whether the PTE is valid

  • R: indicates whether the PTE is readable

  • W: indicates whether the PTE is writable

  • X: indicates whether the PTE is executable

  • U: indicates whether the page is accessible to user mode. U-mode software may only access the page when U=1. If the SUM bit in the sstatus register is set, supervisor-mode software may also access pages with U=1.

  • G: designates a global mapping. Global mappings are those that exist in all address spaces.

  • A: indicates whether the virtual page has been read, written, or fetched from since the last time the A bit was cleared.

  • D: indicates whether the virtual page has been written since the last time the D bit was cleared.
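A minimal sketch of checking an access against these permission bits is shown below, assuming bit 0 = V through bit 7 = D. The function name and the simplified fault policy are assumptions of the sketch; the actual hardware also handles A/D updates and further corner cases:

```python
# Illustrative permission check against the 8 PTE permission bits.
V, R, W, X, U, G, A, D = (1 << i for i in range(8))

def access_faults(perms: int, access: str, s_mode: bool, sum_bit: bool) -> bool:
    """Return True if the access should raise a page fault (simplified)."""
    if not perms & V:
        return True                       # invalid PTE
    if perms & W and not perms & R:
        return True                       # reserved encoding (W=1, R=0)
    if s_mode and perms & U and not sum_bit:
        return True                       # S-mode touching a U page needs SUM
    if not s_mode and not perms & U:
        return True                       # U-mode needs U=1
    need = {"load": R, "store": W, "fetch": X}[access]
    return not perms & need               # access type vs R/W/X
```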

14.3.1.2.2. SFence operation

As mentioned for the set-associative TLBs, an SFence can be of two types, simple_sfence or complex_sfence. The complex_sfence operation works similarly here, except that instead of invalidating bits of TLB_VALID, the whole TLB entry is zeroed out; the same happens for a simple_sfence operation when the TLB receives an SFence request.

14.3.1.3. Fully-Associative TLB (hypervisor)

In the presence of the H extension the behaviour of the fully-associative TLB changes somewhat, which is captured in the fa_dtlb_hypervisor and fa_itlb_hypervisor packages. Most of the functionality of the TLBs remains similar to what is described above; in the following part we discuss the changes made in the hypervisor packages and the reasons behind them.

First and foremost, the TLB fields get an upgrade, described further in the following section. Next is the tag-matching logic, which now has to deal with hypervisor addresses. The tag match is only considered a success (a match) if all of the following conditions are true:

  1. VPN matches, where the pagemask field masks the requested vpn, and this masked vpn is then matched with the vpn field of the TLB entry.

  2. Virtual bit V matches, indicating either that virtualization was on and the address needs a two-stage translation, or that virtualization was off.

  3. If virtualization is on, the VMID should match; if not, the VMID should be zero.

  4. If virtualization is on, the ASID fields should match, with the ASID taken from the vsatp CSR; if virtualization is off, the ASID matches or the global permission bit is set, with the ASID taken from the satp CSR.
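The four conditions above can be sketched as follows; the entry fields and argument names are illustrative, and req_v mirrors the V bit of the request:

```python
# Illustrative sketch of the hypervisor-aware tag match.
from dataclasses import dataclass

@dataclass
class HTlbEntry:
    vpn: int
    pagemask: int
    v: bool          # entry was filled under virtualization (two-stage)
    vmid: int
    asid: int
    global_: bool

def h_tag_match(e: HTlbEntry, req_vpn: int, req_v: bool,
                req_vmid: int, req_asid: int) -> bool:
    if (req_vpn & ~e.pagemask) != (e.vpn & ~e.pagemask):
        return False                      # 1. VPN (superpage-aware)
    if e.v != req_v:
        return False                      # 2. virtualization bit matches
    if req_v:                             # 3 & 4. VMID + ASID (from vsatp)
        return e.vmid == req_vmid and e.asid == req_asid
    return e.vmid == 0 and (e.asid == req_asid or e.global_)
```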

With the inclusion of the hypervisor, the logic that checks for transparent translation now also has to check whether the request is a hypervisor request (V=1). If so, then for the request to be a transparent translation, the mode fields of both the hgatp and vsatp CSRs should be zero. If the request is not a hypervisor request (V=0), the check remains the same as for the supervisor TLB.

The next change is to the fault-checking logic. If virtualization is off, i.e. the TLB request has V=0, the fault checking remains the same as in the supervisor TLB. If the request has the V bit set, the fault checks are performed twice: once with the hpa_permissions bits and once with the gpa_permissions bits. If either set of permission bits raises a fault, there is a fault, but the cause depends on which set raised it: if gpa_permissions raises a fault, the cause is a guest page fault corresponding to the access type, and if hpa_permissions raises a fault, the cause is a normal page fault corresponding to the access type. Also, if there is a guest page fault then, according to the spec, we need to set mtval2 with the guest physical address corresponding to the virtual address written in mtval/stval; therefore, along with the response to the core, we send the GPA (guest physical address) of the TLB entry that was a hit.
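The cause selection described above can be sketched as a small helper that takes the outcomes of the two per-stage permission checks as booleans (all names illustrative):

```python
# Illustrative sketch: picking the fault cause from the two permission checks.
def classify_fault(gpa_faults: bool, hpa_faults: bool, access: str):
    """Return None if no fault, else a cause string for this access type."""
    if gpa_faults:
        return f"guest_{access}_page_fault"   # gpa_permissions raised the fault
    if hpa_faults:
        return f"{access}_page_fault"         # hpa_permissions raised the fault
    return None
```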

All in all, the working of the TLB can be broadly classified into two types: one where it receives a hypervisor request (V=1) and another where it receives a supervisor request (V=0). In the case where it receives a supervisor request, its working remains the same as the normal TLB described above, but in the case of a hypervisor request, all the above-mentioned changes come into play.

14.3.1.3.1. TLB fields

The TLB entry described below is for sv39; the entry size will change in accordance with the virtualization scheme the core is running on. Below are the factors affecting the TLB entry size:

  • vpnsize : Number of bits used to represent the virtual page number, excluding the 12-bit page offset. For sv39 it is 27 (39 − 12).

  • ppnsize : Number of bits used to represent the physical page number. For RV64 it is 44 and for RV32 it is 22.

  • max_varpages : Number of page table levels. For sv39 it is 3.

  • subvpn : Number of bits used to represent one level of the VPN; max_varpages and subvpn together decide the length of the pagemask field. For RV64 subvpn is 9 bits and for RV32 it is 10 bits.

  • asidwidth : Number of bits used to represent the ASID. For RV64 it is 16 bits and for RV32 it is 9 bits.

  • paddr : Number of bits used to represent a physical address.

  • vaddr1 : Number of bits used to represent a virtual address.

A TLB entry contains the following fields:

  • vpn: The TLB stores the Virtual Page Number (VPN) as part of the TLB entry to identify future lookups for an address translation.

  • pagemask: This field stores the pagemask, which is used to identify the page size of the ppn.

  • ppn: This field stores the physical page number associated with the virtual address.

  • gpa: This field stores the guest physical address of the last-level translation of the VS-stage translation.

  • gpa_permissions: This field stores the permission bits associated with the above-mentioned gpa field.

  • hpa: This field stores the host physical address of the last-level (depending on the page size) translation of the G-stage.

  • hpa_permissions: This field stores the permission bits associated with the above-mentioned hpa field.

  • vmid: This field stores the Virtual Machine ID, used to identify which machine the address translation corresponds to.

  • asid: This field stores the address space identifier, which facilitates address-translation fences on a per-address-space basis.

  • v: This field tells whether the entry corresponds to a hypervisor request (V=1) or a supervisor request (V=0).

[Figure: sv39 hypervisor TLB entry bit layout — VPN, PAGEMASK, ASID, PPN, and the permission bits V R W X U G A D]

14.4. Page Table Walk (PTW)

TLBs can cache only so many translations at a time; even with highly efficient TLBs, misses are unavoidable. When a TLB miss occurs, the page table is searched, or "walked", to locate the translation of a given virtual address. The core implements this PTW, or translation walk, in hardware. You can refer to Section 4.3.2 of the RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.11, for an in-depth understanding of the virtual address translation process.

Currently we have two packages for the page table walk, one in ptwalk_rv_hypervisor and the other in ptwalk_rv_supervisor; the selection between them is based on whether the hypervisor extension is enabled in the core, as specified in the core.yaml file. For the sake of this discussion, ptwalk_rv_hypervisor will be referred to as PTW_H and ptwalk_rv_supervisor as PTW_S.

The hypervisor-enabled page table walk, i.e. ptwalk_rv_hypervisor, works a little differently from ptwalk_rv_supervisor: as mentioned above, the hypervisor can have two-stage address translation, whereas the supervisor has only single-stage address translation. Both packages can handle SVNAPOT and send the response accordingly when the conditions for SVNAPOT hold.

Interfaces for both of the packages are same. Below is the description for the interface of PTW module.

_images/hypervisor_ptw-interface.png

14.4.1. PTW supervisor (Working)

_images/tlbs-ptw.png

The root address of the page table is saved inside a register, and for a given virtual address the PTW traverses the multi-level page table to find the page table entry corresponding to that virtual address. After the page table entry (PTE) is found, checks are performed on it; if there is any trap or fault, the core is made aware accordingly, and if the PTE passes all the checks, the PTW sends a response to the TLB with the translated physical address and other required metadata. When both the instruction and data TLBs make a request to the PTW, the data TLB gets higher priority.
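The traversal described above can be sketched in software, following the sv39 translation steps from Section 14.1; `read_pte` stands in for the memory access, the simplified checks omit PMA/PMP, and all names are illustrative:

```python
# Illustrative sketch of the multi-level page-table walk (sv39, RV64).
PAGESIZE, PTESIZE, SUBVPN, LEVELS = 2**12, 8, 9, 3

def walk(satp_ppn, va, read_pte):
    a, i = satp_ppn * PAGESIZE, LEVELS - 1
    while i >= 0:
        vpn_i = (va >> (12 + SUBVPN * i)) & ((1 << SUBVPN) - 1)
        pte = read_pte(a + vpn_i * PTESIZE)
        v, r, w, x = pte & 1, (pte >> 1) & 1, (pte >> 2) & 1, (pte >> 3) & 1
        if not v or (not r and w):
            raise ValueError("page fault")       # invalid or reserved PTE
        if r or x:                               # leaf PTE found at level i
            return ((pte >> 10) << 12, i)        # (physical page base, level)
        a, i = ((pte >> 10) << 12), i - 1        # pointer to next-level table
    raise ValueError("page fault")               # walked past level 0
```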

Note

When the data TLB has made a request to the PTW, the core pipeline no longer has access to the data cache until the PTW operation has completed. The core supports two types of TLBs: set-associative and fully-associative.

14.4.2. PTW hypervisor (Working)

The hardware PTW with the hypervisor enabled is basically a finite state machine which aims to translate a virtual address to its physical address. The transparent translation process is a little different from PTW_S, as there are two CSRs involved, vsatp and hgatp, which govern VS-stage and G-stage translation respectively. There are four cases for translation:

  • vsatp.mode == 0 and hgatp.mode == 0: totally transparent translation; the response returns the GVA

  • vsatp.mode == 0 and hgatp.mode != 0: transparent translation in stage 1 only

  • vsatp.mode != 0 and hgatp.mode == 0: transparent translation in stage 2 only

  • vsatp.mode != 0 and hgatp.mode != 0: no transparent translation
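The four cases above can be sketched as a small decision function (names and return labels are illustrative):

```python
# Illustrative sketch of the four transparency cases for two-stage translation.
def transparency(vsatp_mode: int, hgatp_mode: int) -> str:
    if vsatp_mode == 0 and hgatp_mode == 0:
        return "fully_transparent"      # response returns the GVA unchanged
    if vsatp_mode == 0:
        return "stage1_transparent"     # only G-stage translation runs
    if hgatp_mode == 0:
        return "stage2_transparent"     # only VS-stage translation runs
    return "two_stage"                  # full two-stage translation
```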

Next we will describe the working of the different states.

14.4.2.1. State diagram explanation

There are 5 states in PTW_H: ReSendReq, WaitForMemory, GeneratePTE, GeneratePTE_S and SwitchPTW. Below is a description of each state separately, and at the end we will walk through a full translation for better understanding.

GeneratePTE, WaitForMemory and SwitchPTW work differently depending on whether the ptwalk is undergoing stage-1 or stage-2 translation. To decide which stage the states should work in, we use a register rg_v, which is set on receiving the request inside the ptwalk and later gets toggled by SwitchPTW to alternate between stage 1 and stage 2. The states sample this value to decide which stage we are in: stage1/VS-stage (rg_v=1) or stage2/G-stage (rg_v=0).

All the states work in unison to translate a given virtual address to its corresponding physical address, whether it is a supervisor request or a hypervisor request; PTW_H also handles all the cases for transparent translation.

14.4.2.1.1. GeneratePTE (default state)
_images/ptw_GeneratePTE.png

The flow chart above encapsulates all the decisions taken inside this state and how it works. The objective of this state is to generate the address of the PTE and send a memory request for that address, but the situation becomes complex, as it first has to decide which stage it is working in and act accordingly. The GeneratePTE address process block uses a different set of inputs for the different stages.
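The address generated by this state follows step 2 of the translation process in Section 14.1; a minimal sketch with sv39 numbers is shown below (the stage only changes which table base is used, not the arithmetic; names are illustrative):

```python
# Illustrative sketch: PTE address for a given level of the walk (sv39, RV64).
PTESIZE, SUBVPN = 8, 9

def pte_address(table_base_ppn: int, va: int, level: int) -> int:
    """Address of the PTE for `va` in the page table at `table_base_ppn`."""
    vpn_i = (va >> (12 + SUBVPN * level)) & ((1 << SUBVPN) - 1)
    return (table_base_ppn << 12) + vpn_i * PTESIZE
```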

14.4.2.1.2. SwitchPTW
_images/ptw_SwitchPTW.png

Upon switching to this state, it decides whether the switch happened for a G-stage or a VS-stage request and toggles the stage by flipping the rg_v bit. After flipping the rg_v bit to 0, it has effectively changed the stage to G-stage, so it sets the rg_levels_G register according to the hgatp CSR. Similarly, flipping the rg_v bit to 1 changes the stage to VS-stage; as a result, it performs an implicit memory access for the HPA that was set after partly completing the stage-1 translation, and then switches the state to WaitForMem.

14.4.2.1.3. WaitForMem
_images/ptw_WaitForMem.png

This is the most complex state of all. First the state dequeues the memory response, and based on which stage the FSM is in, or whether the request is a supervisor request (a hypervisor request with vssatp_mode != 0 && hgatp_mode == 0 will also satisfy this condition throughout the code), some of the fault checks differ. If a fault is detected, a response is sent to the dcache, which in turn informs the core of the fault and its cause. The cause value depends on the access type of the request and on which stage experienced the fault: for example, if stage 1 has experienced a fault, the cause will be a guest-page-fault type depending on the access type of the request, whereas if the fault is experienced in stage 2, it is a normal page fault.

If there is no fault, then either the PTE is a pointer to the next-level page table or it is a leaf page table entry; the further decisions are shown in the flow chart above.

14.4.2.1.4. GeneratePTE_S

This state is used to completely isolate requests to the ptwalk module which expect only a single-stage translation: supervisor requests (requests with the V bit 0) and hypervisor requests with vssatp_mode != 0 and hgatp_mode == 0, i.e. non-transparent translation in stage 1 and transparent translation in stage 2.

Its working is summarised below:

  • segregate the VPN based on the subvpn size

  • set the current mode based on the atp CSR

  • set the max level based on the atp mode

  • generate the PTE address

  • enqueue the memory request

  • change state to WaitForMemory
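The first step above, segregating the VPN into per-level subvpn fields, can be sketched as follows for sv39 (function name illustrative):

```python
# Illustrative sketch: splitting an sv39 VPN into its three 9-bit subvpn fields.
SUBVPN, LEVELS = 9, 3

def split_vpn(va: int) -> list:
    """Return [vpn0, vpn1, vpn2] for a given virtual address."""
    vpn = va >> 12                       # drop the 12-bit page offset
    return [(vpn >> (SUBVPN * i)) & ((1 << SUBVPN) - 1) for i in range(LEVELS)]
```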

14.4.2.1.5. ReSendReq

Used to replay the ptwalk request when there is a fault: it enqueues the request to DMEM and dequeues the existing ptwalk request. Finally it switches back to a default state: if the request was a supervisor request it switches to GeneratePTE_S, otherwise it switches to GeneratePTE.