This chapter describes how a paged virtual memory system works, what the MMU is and how it works, and how we use it to over-commit physical memory.

14.2. Paging#

14.2.1. Paged Address Translation#

Most virtual memory systems today use paged address translation, also known as paging. A paged virtual memory system divides both virtual memory and physical memory into relatively small fixed-size pieces. These are known as virtual pages (or just pages) in the virtual address space and page frames in physical memory. The memory management system maintains a mapping from each virtual page in use to the page frame that holds its actual contents.

../_images/pagedmem.png

Fig. 14.1 Paged virtual memory example#

Paging solves a couple of major problems that segmentation has: 1.) The paged virtual address space can be much larger than physical memory. 2.) Paging eliminates the need to maintain large regions of physically contiguous memory.

In the segmentation memory model it’s not possible to run a program that requires more physical memory than the system has. This means a program may run on a system with a large amount of memory yet be unable to run at all on a system with a smaller amount.

Another benefit of paging over segmentation is that a region of virtual memory does not need to map physically contiguous memory. Since the virtual address space is divided into small fixed-size pages, a virtually contiguous region of memory can map several page frames that are scattered throughout physical memory.

14.2.1.1. The virtual address space with Paging#

On a paged virtual memory system every process has a virtual address space of exactly the same size; gigabytes, terabytes or even petabytes are typical on modern systems. By default, none of the virtual pages within this huge address space are valid when a process and its associated virtual address space are created, and therefore none can be referenced. To make regions of the virtual address space valid, either part of a file must be mapped into a virtual region, or virtual memory must be dynamically allocated, creating a virtual region. Since the virtual regions are relatively small, the huge virtual address space is sparsely populated with small valid virtual regions. As we discussed earlier in this book, a process and its associated virtual address space are created when a child process is created via fork(). At this time the valid virtual regions are duplicated from the parent process to the child process. The regions within the virtual address space can also be eliminated and new ones created when the process runs a different program via exec().

When the virtual regions are initially created, or the virtual address space is changed, no page frames are mapped into the virtual pages within them; however, a program can legally access pages within the virtual regions. When this happens the memory management system responds by mapping a page frame into each virtual page as it is accessed, that is, on demand. This means the virtual regions are sparsely populated with page frames, similar to the way the entire virtual address space is sparsely populated with valid virtual regions. As a process runs on a paged virtual memory system, only a small subset of its virtual pages need to have page frames mapped into them at any given time.
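The on-demand behaviour described above can be sketched with a toy model. This is only an illustration: the `AddressSpace` class, its method names and the counter standing in for a frame allocator are all hypothetical, and a real kernel does this work in its page fault handler.

```python
PAGE_SIZE = 4096  # bytes per page, matching the 4 KB pages used in this chapter


class AddressSpace:
    """Toy model: valid regions plus the pages that currently have a frame."""

    def __init__(self):
        self.regions = []      # (start_page, end_page) ranges, end exclusive
        self.page_table = {}   # virtual page number -> page frame number
        self.next_frame = 0    # hypothetical stand-in for a real frame allocator
        self.faults = 0

    def add_region(self, start_page, num_pages):
        # Creating a region makes its pages valid but maps no frames yet.
        self.regions.append((start_page, start_page + num_pages))

    def access(self, vaddr):
        vpn = vaddr // PAGE_SIZE
        if not any(lo <= vpn < hi for lo, hi in self.regions):
            raise MemoryError(f"invalid access to virtual page {vpn}")
        if vpn not in self.page_table:
            # Page fault on a valid page: map a frame on demand.
            self.faults += 1
            self.page_table[vpn] = self.next_frame
            self.next_frame += 1
        return self.page_table[vpn]


space = AddressSpace()
space.add_region(start_page=16, num_pages=1000)
space.access(16 * PAGE_SIZE)         # first touch of page 16: fault
space.access(16 * PAGE_SIZE + 100)   # same page again: no fault
space.access(500 * PAGE_SIZE)        # first touch of page 500: fault
print(len(space.page_table), "of 1000 pages mapped after", space.faults, "faults")
```

Only the two pages that were actually touched end up with frames, which is the sparse population of a region described above.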

The paged virtual memory management system is responsible for multiplexing the page frames and ensuring that at least the necessary virtual to physical mappings are in place. On systems with larger amounts of physical memory, more page frames can be mapped into virtual pages than on systems with smaller amounts. This means a program running on a small-memory system will probably just run slower than on a system with a large amount of memory, rather than not being able to run at all. This is a huge improvement over segmented systems, where some programs cannot run at all on systems without adequate memory.

Another benefit of paging over segmentation is that a region of virtual memory does not need to map physically contiguous page frames. Since the virtual address space is divided into small fixed-size pages, a virtually contiguous region can map page frames that are scattered throughout physical memory. This eliminates the need to maintain physically contiguous memory, an overhead that is both complex and very CPU intensive. In addition, it completely eliminates external fragmentation, and with it the need to perform compaction. The only fragmentation issue encountered in a paged virtual memory system is internal fragmentation. An internal fragment is the unused portion of a page at the end of a region of virtual memory. An internal fragment, which is only half a page on average, is typically much smaller than the external fragments associated with segmentation, but it cannot be reclaimed by the memory management system.

14.2.2. MMU Memory Management Unit#

Every time a virtual address is referenced it must be translated to a physical address before it is presented to the memory address bus hardware. Doing this translation in software would be unacceptably slow, so on paged systems the CPU includes special hardware that performs it using the virtual to physical mapping for each page. This CPU hardware is known as the Memory Management Unit or MMU.

We examine a single model of address translation in detail: the one used by the original Pentium, and by any Intel-compatible CPU running in 32-bit mode. It uses 32-bit virtual addresses, 32-bit physical addresses, and a page size of 4096 bytes. Since pages are \(2^{12}\) bytes each, addresses can be divided into 20-bit page numbers and 12-bit offsets within each page, as shown in Fig. 14.3.

../_images/virt-mem-pic10.png

Fig. 14.2 Example of virtual address mapping to physical address#

The Memory Management Unit (MMU) maps a 20-bit virtual page number to a 20-bit physical page number; the offset can pass through unchanged, as shown in Fig. 14.2, giving the physical address the CPU should access.

../_images/virt-mem-pic9.png

Fig. 14.3 Page number and offset in 32-bit paged translation with 4KB pages#
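The split and lookup described above can be written out in a few lines. This is a sketch only: the MMU does this in hardware, and the dictionary used as a page table, the function names and the example mapping are made up for illustration.

```python
PAGE_SHIFT = 12                       # 4096-byte pages -> 12 offset bits
OFFSET_MASK = (1 << PAGE_SHIFT) - 1   # 0xFFF


def split(vaddr):
    """Return (virtual page number, offset) for a 32-bit virtual address."""
    return vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK


def translate(vaddr, page_table):
    """Map a virtual address to a physical one.

    page_table is a hypothetical dict: virtual page number -> physical page number.
    """
    vpn, offset = split(vaddr)
    if vpn not in page_table:
        raise LookupError(f"page fault: no mapping for virtual page {vpn:#x}")
    # The offset passes through unchanged; only the page number is replaced.
    return (page_table[vpn] << PAGE_SHIFT) | offset


page_table = {0x08048: 0x00123}       # made-up mapping for illustration
paddr = translate(0x08048ABC, page_table)
print(hex(paddr))                     # 0x123abc
```

The low 12 bits (`0xabc`) are identical in the virtual and physical addresses; only the upper 20 bits change.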

Although paged address translation is far more flexible than the base and bounds registers found in segmentation, it requires much more information. Base and bounds translation only requires two values, which can easily be held in registers in the MMU. In contrast, paged translation must be able to handle a separate mapping value for each of thousands or even millions of virtual pages (although most programs will only map a fraction of those pages). The only practical place to store the amount of information required by paged address translation is in tables located in memory itself. These memory-resident tables of virtual to physical translations are known as page tables. The MMU uses the in-memory page tables to translate every virtual address to the actual physical address.
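A quick calculation shows why these mappings cannot live in registers. The sketch below assumes the 32-bit model from this section and, as a simplifying assumption, a single flat table with one 4-byte entry per virtual page.

```python
VIRTUAL_ADDRESS_BITS = 32
PAGE_OFFSET_BITS = 12   # 4096-byte pages
PTE_BYTES = 4           # assumption: one 32-bit entry per virtual page

num_pages = 1 << (VIRTUAL_ADDRESS_BITS - PAGE_OFFSET_BITS)
flat_table_bytes = num_pages * PTE_BYTES

print(num_pages)                          # 1048576 virtual pages
print(flat_table_bytes // (1024 * 1024))  # 4 (MiB per address space)
```

Compare this with the two values needed for base and bounds translation.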

14.2.2.1. Memory Over-Commitment and Paging#

Page faults allow data to be dynamically fetched into memory when it is needed, in the same way that the CPU dynamically fetches data from memory into the cache. This allows the operating system to over-commit memory: the sum of all process address spaces can add up to more memory than is available, although the total amount of memory mapped at any point in time must fit into RAM. This means that when a page fault occurs and a page is allocated to a process, another page (from that or another process) may need to be evicted from memory.
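To make the eviction step concrete, here is a small simulation of over-commitment. The source does not say which page is chosen for eviction, so the first-in, first-out policy used here is purely an assumption, as are the function name and the `(process, page)` labels.

```python
from collections import OrderedDict


def run(references, num_frames):
    """Replay page references against num_frames frames.

    Returns (faults, evictions). Uses FIFO replacement, chosen only for
    simplicity in this sketch.
    """
    resident = OrderedDict()   # resident pages, oldest first
    faults = evictions = 0
    for page in references:
        if page in resident:
            continue                       # already in memory: no fault
        faults += 1
        if len(resident) == num_frames:
            resident.popitem(last=False)   # memory full: evict the oldest page
            evictions += 1
        resident[page] = None
    return faults, evictions


# Two processes, A and B, touching four distinct pages in total.
refs = [("A", 0), ("A", 1), ("B", 0), ("A", 0), ("B", 1), ("A", 1)]
print(run(refs, num_frames=4))   # (4, 0): everything fits
print(run(refs, num_frames=3))   # (4, 1): one page must be evicted
```

With only three frames, the fault on the fourth distinct page forces another page out of memory, here one belonging to a different process.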

There are two types of regions in a user’s virtual address space: file-backed and anonymous.

File-backed regions are contiguous portions of a file on storage that are mapped into a user’s virtual address space. For example, the exec() system call uses mmap() to map the text and data sections of an executable file into a user’s address space at predetermined virtual addresses. The majority of the page frames mapped into file-backed regions are read-only mappings and are therefore never modified, so they never need to be flushed or written back to storage. When a process exits and munmap()s its file-backed regions, nothing is deleted: everything comes from some file that exists on some storage device.

Anonymous memory regions do not correspond to a file. Instead, they are the data, heap and stack regions of the user’s virtual address space. Unlike file-backed regions, the page frames mapped into anonymous regions are almost always modified. After all, what would be the sense of allocating from a heap and only reading from it? For this reason, a page frame mapped into an anonymous region must be written to some storage location before it can be reclaimed. Also unlike file-backed regions, when a process exits all the page frames mapped into its anonymous regions are deleted and therefore immediately freed. The data, heap and stack contents of an exiting process are of no value to any other process and can therefore be discarded.

Evicting a read-only page mapped from a file is simple: just forget the mapping and free the page; if a fault for that page occurs later, the page can be read back from disk. Occasionally pages are mapped read/write from a file, when a program explicitly requests it with mmap—in that case the OS can write any modified data back to the file and then evict the page; again it can be paged back from disk if needed again.

Anonymous segments such as stack and heap are typically created in memory and do not need to be swapped; however if the system runs low on memory it may evict anonymous pages owned by idle processes, in order to give more memory to the currently-running ones. To do this the OS allocates a location in “swap space” on disk: typically a dedicated swap partition in Linux, and the PAGEFILE.sys and /var/vm/swapfile files in Windows and OSX respectively. The data must first be written out to that location, then the OS can store the page-to-location mapping and release the memory page.
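The three eviction cases described above can be summarised as a small decision function. The `Page` record and the step names are hypothetical; they only restate the order of operations from the text and do not correspond to any real kernel interface.

```python
from dataclasses import dataclass


@dataclass
class Page:
    file_backed: bool   # True for file-backed regions, False for anonymous ones
    dirty: bool         # True if the page has been modified since it was loaded


def eviction_steps(page):
    """Return the steps needed before the page frame can be reused."""
    if page.file_backed and not page.dirty:
        # Contents are still in the file: just drop the mapping.
        return ["forget mapping", "free frame"]
    if page.file_backed:
        # Modified file mapping: the file must be updated first.
        return ["write back to file", "forget mapping", "free frame"]
    # Anonymous page: there is no file, so the data goes to swap space.
    return ["allocate swap location", "write to swap",
            "record page-to-location mapping", "free frame"]


print(eviction_steps(Page(file_backed=True, dirty=False)))
print(eviction_steps(Page(file_backed=False, dirty=True)))
```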

Hint: Linux borrows the page table entry to store the location of a swapped-out page.
As long as the PTE present bit is not set, the MMU hardware will not attempt to translate; instead it will cause a page fault. The remaining bits of the PTE can therefore be used by software when the present bit is not set.

../_images/virt-mem-pic106.png

Fig. 14.4 Page table entry with D (dirty) bit#
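The idea in the hint can be sketched as follows. The present bit in bit 0 and the physical page number in the upper 20 bits follow the 32-bit model used in this section, but the way the swap location is packed into the remaining bits is an invented layout for illustration, not the encoding Linux actually uses. The sketch also assumes swap slot numbers start at 1 so that an all-zero entry can mean "not mapped".

```python
PRESENT = 0x1     # bit 0: when clear, the MMU faults instead of translating
FRAME_SHIFT = 12  # physical page number lives in the upper 20 bits


def make_present_pte(frame):
    return (frame << FRAME_SHIFT) | PRESENT


def make_swap_pte(swap_slot):
    """Hypothetical layout: slot number in bits 1..31, present bit clear."""
    return swap_slot << 1


def decode_pte(pte):
    if pte & PRESENT:
        return ("frame", pte >> FRAME_SHIFT)
    if pte == 0:
        return ("unmapped", None)
    # Present bit clear: hardware ignores the rest, so software reads it back.
    return ("swap", pte >> 1)


print(decode_pte(make_present_pte(0x123)))   # ('frame', 291)
print(decode_pte(make_swap_pte(42)))         # ('swap', 42)
```

On a fault, the handler in this sketch would call `decode_pte`, see `"swap"`, and know where on disk to find the page.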

14.2.3. Paging - Avoiding Fragmentation#

The fragmentation in Fig. 13.13 is termed external fragmentation, because the memory wasted is external to the regions allocated. This situation can be avoided by compacting memory—moving existing allocations around, thereby consolidating multiple blocks of free memory into a single large chunk. This is a slow process, requiring processes to be paused, large amounts of memory to be copied, and base+bounds registers modified to point to new locations[^2].

../_images/virt-mem-map.png

Fig. 14.5 A mapping between virtual pages and physical pages#

Instead, modern CPUs use paged address translation, which divides the physical and virtual memory spaces into fixed-sized pages, typically 4KB, and provides a flexible mapping between virtual and physical pages, as shown in Fig. 14.5. The operating system can then maintain a list of free physical pages, and allocate them as needed. Because any combination of physical pages may be used for an allocation request, there is no external fragmentation, and a request will not fail as long as there are enough free physical pages to fulfill it.
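A free list of this kind can be sketched in a few lines. The `FrameAllocator` class and its methods are hypothetical names used for illustration; the point is only that a request succeeds whenever enough frames are free, whether or not they are adjacent.

```python
class FrameAllocator:
    """Toy free list of physical page frame numbers."""

    def __init__(self, num_frames):
        self.free = list(range(num_frames))

    def alloc(self, count):
        if count > len(self.free):
            raise MemoryError("not enough free page frames")
        # Any free frames will do; they need not be contiguous.
        return [self.free.pop() for _ in range(count)]

    def release(self, frames):
        self.free.extend(frames)


allocator = FrameAllocator(8)
a = allocator.alloc(3)      # frames 7, 6, 5
b = allocator.alloc(3)      # frames 4, 3, 2
allocator.release(a)        # free frames are now 0, 1 and 5..7
c = allocator.alloc(4)
print(c)                    # [5, 6, 7, 1]
```

The last request is satisfied from frames 1 and 5 to 7, which are not contiguous. With contiguous allocation the same request would have required compaction first.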

14.2.3.1. Internal Fragmentation#

Paging solves the problem of external fragmentation: there is no wasted space between pages of virtual memory. However, paging does suffer from another issue, internal fragmentation, where space may be wasted inside the allocated pages. For example, if 10 KB of memory is allocated in 4 KB pages, 3 pages (a total of 12 KB) are allocated, and 2 KB is wasted. For allocations of hundreds of KB in 4 KB pages this is a minor overhead: on average about \(\frac{1}{2}\) a page, or 2 KB, is wasted per allocation. But internal fragmentation makes this approach inefficient for very small allocations (e.g. the new operator in C++), as shown in Fig. 14.6. (It is also one reason why, even though most CPUs support multi-megabyte or even multi-gigabyte “huge” pages, which are slightly more efficient than 4 KB pages, they are rarely used.)

../_images/virt-mem-pic7.png

Fig. 14.6 Example of internal fragmentation#
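The arithmetic behind these numbers is a ceiling division. The helper below is a small sketch with a made-up name; it reproduces the 10 KB example and shows how the waste grows for a very small allocation.

```python
PAGE_SIZE = 4096   # 4 KB pages


def pages_and_waste(nbytes):
    """Return (pages allocated, bytes wasted) for a request of nbytes."""
    pages = -(-nbytes // PAGE_SIZE)   # ceiling division
    return pages, pages * PAGE_SIZE - nbytes


print(pages_and_waste(10 * 1024))   # (3, 2048): 3 pages, 2 KB wasted
print(pages_and_waste(16))          # (1, 4080): almost the whole page wasted
```

For the 16-byte request over 99% of the page is unused, which is why small allocations are not given whole pages of their own.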