14.1. Abstracting a useful interface for memory management

One of the fundamental jobs of the memory management (MM) system is to provide a useful abstraction of the system’s physical memory. This abstraction must make it possible to write and run applications without knowing the details of the physical memory. The MM abstraction must provide a consistent virtual interface to physical memory, along with a set of operating system services that create, delete, expand, contract and manipulate the regions of virtual memory that running programs use.

14.1.1. The Virtual Interface

The virtual interface must transparently hide the details of the actual physical memory on the system and provide the same interface to every running process. This interface, or “virtual address space”, should appear as an array of bytes and start at the same location for every process. Ideally, the virtual address space provided by the operating system would look identical to every process and come with a set of functions that allow programs to manipulate it.

14.1.2. Basic Memory Management functions

As mentioned earlier, the memory management system must provide a set of operating system services that create, delete, expand, contract and manipulate the regions of virtual memory that running programs use. The memory management functions provided to user programs can be divided into two categories: library calls and system calls.

14.1.2.1. Library calls

The most basic memory management library functions to allocate and free virtual memory are malloc() and free() respectively.

1.) malloc(): The use is “address=malloc(size)”, where size is the number of bytes to allocate and address receives the virtual address of the allocated memory. If the allocation fails, malloc() returns NULL.

2.) free(): The use is “free(address)”, where address is a virtual address previously returned by malloc(). In C, free() returns no value, so there is no status to check.
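As a minimal sketch of the two library calls working together, the following C function allocates a buffer, uses it, and releases it. The helper name `sum_of_sequence` is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for `count` ints, fill the buffer
 * with 0..count-1, and return the sum. Returns -1 if malloc() fails. */
long sum_of_sequence(size_t count)
{
    assert(count > 0);                  /* malloc(0) behaviour varies by platform */

    int *values = malloc(count * sizeof *values);   /* size is given in bytes */
    if (values == NULL)
        return -1;                      /* malloc() signals failure with NULL */

    long sum = 0;
    for (size_t i = 0; i < count; i++) {
        values[i] = (int)i;
        sum += values[i];
    }

    free(values);                       /* free() returns nothing */
    return sum;
}
```

Every successful malloc() is paired with exactly one free(); the pointer must not be used after it has been freed.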

14.1.2.2. System calls

The memory management system calls are sbrk(), brk(), mmap(), munmap() and mprotect().

1.) sbrk(): The sbrk() system call is used to expand and contract the end of the program’s data section. The use is “prev=sbrk(size)” where size is the positive or negative number of bytes by which to expand or contract the data section and prev is the end of the data section before the call.

2.) brk(): The brk() system call is used to explicitly set the end of the program’s data section. The use is “status=brk(address)” where address is where to set the end of the data section and status is the success/failure of the call.

3.) mmap(): The mmap() system call is used to allocate anonymous memory in page-sized increments into a contiguous set of virtual pages, or to map the contents of a file segment into a set of virtual pages. The use is “address=mmap(arguments, …)” where the arguments select whether to allocate anonymous memory or map a file section and address is the page-aligned virtual address of the newly allocated memory.

4.) munmap(): The munmap() system call is used to free anonymous memory allocated via mmap() or to unmap the contents of a file segment that was mapped via mmap(). The use is “status=munmap(address, size)” where address is the page-aligned virtual address to be unmapped, size is the size of the memory to be unmapped and status is the success/failure of the call.

5.) mprotect(): The mprotect() system call is used to set the protection for a range of valid virtual pages. The use is “status=mprotect(address, size, protection)” where address and size give the page-aligned base address and the length of the virtual region, protection is the new access permission (some combination of read, write and execute, or none), and status is the success/failure of the call.
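The calls above can be sketched in C as follows. This is an illustration under stated assumptions, not a definitive usage guide: the function names `grow_and_shrink_break` and `anonymous_page_demo` are hypothetical, a POSIX-like system (Linux with glibc is assumed) is required, and the sbrk() part assumes nothing else moves the end of the data section between the calls.

```c
#define _DEFAULT_SOURCE         /* exposes sbrk() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: grow the data section by `bytes`, then shrink it
 * back. Returns 0 on success and -1 on failure. */
int grow_and_shrink_break(intptr_t bytes)
{
    assert(bytes > 0);

    char *prev = sbrk(bytes);           /* returns the end before the call */
    if (prev == (char *)-1)
        return -1;

    char *grown = sbrk(0);              /* sbrk(0) only reports the current end */
    if (sbrk(-bytes) == (void *)-1)     /* a negative size contracts the section */
        return -1;

    return (grown - prev == bytes) ? 0 : -1;
}

/* Hypothetical helper: map one anonymous page, write to it, make it
 * read-only, and unmap it. Returns 0 on success and -1 on failure. */
int anonymous_page_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int ok = ((uintptr_t)p % len == 0)  /* the address is page aligned */
          && (p[0] == 0);               /* anonymous memory starts zero-filled */
    p[0] = 42;

    if (mprotect(p, len, PROT_READ) != 0)   /* the page is now read-only */
        ok = 0;
    ok = ok && (p[0] == 42);            /* reads are still permitted */

    if (munmap(p, len) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

After the mprotect() call, a write to the page would be expected to fault (typically delivered to the process as SIGSEGV), which is why the sketch only reads from it.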

14.1.3. Physical Memory/No Virtual Memory

On a system without virtual memory there is one physical address space. Programs are loaded into physical memory, allocate physical memory as needed and directly access those physical addresses. If more than one program is running at the same time, each is simply loaded at a different physical address and allocates different physical addresses. This means every program must either know the physical address at which it will run or be relocated when it is loaded.
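To make load-time relocation concrete, here is a minimal sketch in C. The image layout is an assumption made for illustration: a hypothetical array of words, plus a table listing which words hold absolute addresses that were computed as if the program were loaded at address zero. Real object file formats are considerably more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical loader step: add the load address to every word that the
 * relocation table marks as holding an absolute address. */
void relocate_image(uint32_t *words,
                    const size_t *reloc_offsets, size_t reloc_count,
                    uint32_t load_base)
{
    assert(words != NULL);

    for (size_t i = 0; i < reloc_count; i++)
        words[reloc_offsets[i]] += load_base;   /* patch one address slot */
}
```

Words that are not listed in the relocation table (plain data, for example) are left untouched. With virtual memory this patching step becomes unnecessary, because every program can be loaded at the same virtual address.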

14.1.4. Virtual Memory

What is most desirable and usable is a virtual memory model in which every program runs in a process that has its own private virtual address space, laid out identically to that of every other process. A virtual address space would ideally start at address zero and extend up to some very large address. Since every process appears to have the same virtual address space, programs can be loaded at the same virtual address and therefore do not need to be relocated. There are two basic virtual memory models: Segmented Virtual Memory and Paged Virtual Memory.

14.1.4.1. Segmented Virtual Memory

A simple segmented virtual address space starts at location zero and extends up to some upper limit. It is implemented in hardware using a base register and a limit (bounds) register. Every memory reference is virtual: the hardware checks that the virtual address is less than the limit register and, if so, translates it to a physical address by adding the base register. A reference that fails the check raises a fault instead of reaching memory.

14.1.4.2. Paged Virtual Memory

A paged virtual memory model splits both the virtual address space and physical memory into fixed-size pieces known as virtual pages and physical page frames respectively. The mapping of virtual pages to physical page frames is accomplished via an array of page frame numbers known as a page table. There is one page table entry for each virtual page, but only the pages that are in use have valid page table entries and therefore map to physical page frames.
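A single-level page table lookup can be sketched in C as shown below. The 4 KiB page size, the 32-bit addresses and the names `struct pte` and `page_translate` are assumptions chosen to keep the example small; they are not taken from any particular architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                  /* assumed 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Hypothetical page table entry. */
struct pte {
    bool     valid;   /* false: no physical frame is mapped */
    uint32_t frame;   /* physical page frame number */
};

/* Translate `vaddr` using a flat page table with `npages` entries.
 * Returns false where real hardware would raise a page fault. */
bool page_translate(const struct pte *table, size_t npages,
                    uint32_t vaddr, uint32_t *paddr)
{
    assert(table != NULL && paddr != NULL);

    uint32_t vpn    = vaddr >> PAGE_SHIFT;        /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);    /* byte within the page */

    if (vpn >= npages || !table[vpn].valid)
        return false;

    *paddr = (table[vpn].frame << PAGE_SHIFT) | offset;
    return true;
}
```

The offset within the page passes through unchanged; only the page number is replaced by a frame number. Entries with `valid` set to false correspond to virtual pages that are not currently in use.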

14.1.4.2.1. The virtual address space

The virtual address space of every process is typically very large and fixed in size. For example, the virtual address space of every process running on an x86_64 system with 48-bit virtual addresses is 2^48 bytes, or 256 TiB. By default, most of the virtual address space is not accessible to the program running in it. The only accessible regions are those that the program requested via system calls (mmap and sbrk) and that the kernel granted access to. The virtual address space is therefore sparsely populated with valid regions, each of which is either anonymous or file-backed. Anonymous regions are regions that are not backed by a file, for example stacks, heaps, BSS, and malloc()’d and sbrk()’d regions. File-backed regions are sections of a file that are mapped into the virtual address space. Once the valid regions have been created, physical memory pages are mapped into the virtual pages on demand, so those regions are themselves sparsely populated with physical memory.
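The following C sketch illustrates a sparsely populated anonymous region. It assumes a POSIX-like system (Linux with glibc is assumed); the 64 MiB size and the name `touch_sparse_region` are arbitrary choices for illustration. The program creates one large valid region but touches only three of its pages, so on a demand-paged system only those pages would be expected to need backing.

```c
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define REGION_BYTES ((size_t)64 << 20)     /* 64 MiB, an arbitrary size */

/* Hypothetical demo: returns 0 on success, -1 if the mapping or the
 * unmapping fails. */
int touch_sparse_region(void)
{
    unsigned char *region = mmap(NULL, REGION_BYTES, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    /* Pages that were never written read back as zero. */
    assert(region[REGION_BYTES / 2] == 0);

    /* Write to the first and last pages only; the pages in between are
     * valid virtual memory but are never written by this program. */
    region[0] = 1;
    region[REGION_BYTES - 1] = 2;
    assert(region[0] == 1 && region[REGION_BYTES - 1] == 2);

    return munmap(region, REGION_BYTES) == 0 ? 0 : -1;
}
```

The region is valid as soon as mmap() returns, yet no explicit step is needed to attach physical memory to it; that happens as a side effect of the first access to each page.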

Paged virtual memory, also known as paging, makes several highly desirable features (pros) possible in the memory management system, but it also introduces several additional challenges (cons).

14.1.4.2.2. Pros

  • Several huge virtual address spaces can coexist and together be much larger than physical memory, because physical memory can be over-committed.

  • Total isolation of each virtual address space from all others.

  • Sharing of selected regions of memory between different virtual address spaces is easy.

  • Use of memory pages to cache file system data and meta-data is possible.

  • Very large and sparse virtual address spaces become practical, because page fault logic maps physical memory on demand.

14.1.4.2.3. Cons

  • Over-committing physical memory requires sophisticated page reclaim logic.

  • Supporting multiple page sizes on different architectures complicates an already complex memory management system.

  • Very large virtual address spaces require large and complex multi-level page table designs.
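To give a feel for the multi-level page table design mentioned in the last point, the C sketch below splits a 48-bit virtual address into four 9-bit table indices and a 12-bit page offset, which is the split used by x86_64 four-level paging with 4 KiB pages. The names `struct va_parts` and `split_virtual_address` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LEVELS      4       /* four levels of page tables */
#define INDEX_BITS  9       /* 512 entries per table */
#define OFFSET_BITS 12      /* 4 KiB pages */

static_assert(OFFSET_BITS + LEVELS * INDEX_BITS == 48,
              "the fields must cover a 48-bit virtual address");

/* Hypothetical result type: one index per level plus the page offset. */
struct va_parts {
    unsigned index[LEVELS];   /* index[0] selects the top-level entry */
    unsigned offset;          /* byte offset within the page */
};

struct va_parts split_virtual_address(uint64_t vaddr)
{
    struct va_parts parts;

    parts.offset = (unsigned)(vaddr & ((1u << OFFSET_BITS) - 1));
    for (int level = 0; level < LEVELS; level++) {
        /* The top level uses the highest bits, the last level the lowest. */
        int shift = OFFSET_BITS + INDEX_BITS * (LEVELS - 1 - level);
        parts.index[level] =
            (unsigned)((vaddr >> shift) & ((1u << INDEX_BITS) - 1));
    }
    return parts;
}
```

Each index selects one entry in a table at that level, and that entry points to the table for the next level. Tables for unused parts of the address space need not exist at all, which is what keeps the page tables for a sparse address space manageable.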

The next chapter describes…