7. The Process: A virtual Computer
A process is a virtual computer that encapsulates the CPU state for an execution context, a virtual memory abstraction of a massive contiguous memory that starts at address 0x0, and the state needed to interact with other processes and to read and write state in the file system.
7.1. The interface
The key system calls in traditional Unix related to processes are:
pid = fork(void): Create a child process that is a duplicate of the parent; return 0 in the child, and the PID of the child in the parent.
exit(status): Terminate the calling process and record the status passed in for others.
pid = waitpid(cpid, *status...): Wait for the specified child process cpid to complete (or change state), fill *status with the status passed on the child's exit, and garbage collect any kernel resources; return the PID of the child process that exited (or changed state).
err = execve(program, arguments, environment): Execute the file program with the specified arguments and environment information.
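The four calls above are typically used together. The following is a minimal sketch rather than a definitive implementation: the helper name run_shell_command and the choice of /bin/sh as the program to execute are assumptions made for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ; /* environment of the calling process */

/* Hypothetical helper: run cmd with /bin/sh in a child process and
 * return the child's exit status, or -1 on error. */
int run_shell_command(const char *cmd) {
    pid_t cpid = fork();
    if (cpid == -1) {
        perror("fork");
        return -1;
    }
    if (cpid == 0) {
        /* Child: fork returned 0. Replace this process with /bin/sh. */
        char *const argv[] = {"sh", "-c", (char *)cmd, NULL};
        execve("/bin/sh", argv, environ);
        /* execve returns only if it failed. */
        perror("execve");
        _exit(127);
    }
    /* Parent: fork returned the PID of the child. */
    int status;
    if (waitpid(cpid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    /* Report the status the child passed to exit, if it exited normally. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_shell_command("exit 7") should return 7, since the shell passes 7 to exit and the parent recovers it from the status filled in by waitpid.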
The fork system call duplicates the calling process (referred to as the parent) into a new child process; the only difference that lets the parent and child tell themselves apart is the return value. You can think of this logically as creating a copy of all the process memory, copying the CPU state, and copying the file descriptor table (while incrementing reference counts on all the files pointed to by the file descriptor table).
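The copy semantics can be observed directly. In this small sketch (the function name is made up for illustration), the child overwrites a variable after the fork, and the parent checks that its own copy is untouched.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's view of value after the
 * child has modified its own copy, or -1 on error. */
int parent_value_after_child_write(void) {
    int value = 1;
    pid_t cpid = fork();
    if (cpid == -1)
        return -1;
    if (cpid == 0) {
        value = 2; /* changes only the child's copy of memory */
        _exit(0);
    }
    if (waitpid(cpid, NULL, 0) == -1)
        return -1;
    return value; /* the parent's copy was never modified */
}
```

The parent and child take different branches purely because of the return value of fork; everything else about them starts out identical.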
A process has a large amount of state associated with it. Linux implements fork in a library on top of the lower-level clone call, which lets the calling program control which parts of the state are copied to the child. As we will see, this enables the same system call to be used both to create processes and to create threads within those processes.
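To make the difference concrete, here is a Linux-specific sketch using the glibc clone wrapper. It assumes glibc (hence _GNU_SOURCE) and uses only the CLONE_VM flag, which shares the address space instead of copying it; real thread libraries pass many more flags. The names child_fn and value_after_clone are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

static volatile int shared_value;

/* Runs in the child created by clone. */
static int child_fn(void *arg) {
    (void)arg;
    shared_value = 42;
    return 0;
}

/* Hypothetical helper: create a child with the given clone flags and
 * return the parent's view of shared_value afterwards (-1 on error). */
int value_after_clone(int flags) {
    const size_t stack_size = 1024 * 1024;
    char *stack = malloc(stack_size);
    if (stack == NULL)
        return -1;
    shared_value = 0;
    /* The stack grows downward on common architectures, so pass its top.
     * SIGCHLD lets the parent wait for the child with waitpid. */
    pid_t cpid = clone(child_fn, stack + stack_size, flags | SIGCHLD, NULL);
    if (cpid == -1) {
        free(stack);
        return -1;
    }
    int ok = (waitpid(cpid, NULL, 0) == cpid);
    free(stack);
    return ok ? shared_value : -1;
}
```

With flags set to 0 the child gets a copy of memory, as with fork, so the parent still sees 0. With CLONE_VM the child shares the parent's memory, as a thread would, so the parent sees 42.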
Fork is usually quickly followed by a call to exec to execute a new program. One of the often-claimed elegant design decisions of Unix is that after a fork, the child has fine-grained control to change file descriptor state before calling exec; that is, the same functionality that is used for normal access to files is used by the shell between fork and exec to set up the child.
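As an illustration of that window between fork and exec, the sketch below does roughly what a shell might do for a command such as cmd > path: the child opens the file, makes it its standard output, and only then calls exec. The helper name and the use of /bin/sh are assumptions for the example, not something the text prescribes.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run cmd via /bin/sh with stdout redirected to
 * path. Returns the child's exit status, or -1 on error. */
int run_with_stdout_to(const char *cmd, const char *path) {
    pid_t cpid = fork();
    if (cpid == -1)
        return -1;
    if (cpid == 0) {
        /* Child: ordinary file calls rearrange the descriptor table. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1)
            _exit(126);
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) == -1)
                _exit(126);
            close(fd);
        }
        /* The new program inherits the modified descriptor table. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127); /* reached only if exec failed */
    }
    int status;
    if (waitpid(cpid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The program being launched needs no knowledge of the redirection; it simply writes to file descriptor 1 as usual.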
Some of the authors have argued that fork was a clever hack for machines and programs of the 1970s that has long outlived its usefulness and is now a liability [BAKR19]. Many operating systems provide powerful primitives to control the launching of new programs without an intervening fork, and fork imposes design constraints and overheads that these alternatives don't have. Even on Linux, for many use cases, posix_spawn combines the functionality of fork and exec. However, on Linux posix_spawn is implemented on top of clone, so it doesn't actually alleviate the authors' concerns.
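For comparison, here is a minimal sketch of launching a program with posix_spawn instead of an explicit fork and exec. The helper name is hypothetical, and passing NULL for the file actions and attributes means the child simply inherits the parent's defaults.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ; /* environment of the calling process */

/* Hypothetical helper: run cmd via /bin/sh using posix_spawn and
 * return the child's exit status, or -1 on error. */
int spawn_shell_command(const char *cmd) {
    pid_t cpid;
    char *const argv[] = {"sh", "-c", (char *)cmd, NULL};
    /* posix_spawn returns 0 on success and an error number on failure. */
    int err = posix_spawn(&cpid, "/bin/sh", NULL, NULL, argv, environ);
    if (err != 0)
        return -1;
    int status;
    if (waitpid(cpid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Any setup that a shell would do between fork and exec has to be expressed through the file actions and attributes arguments instead of ordinary code running in the child.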
There are actually many more interfaces related to process management supported by systems like Linux. You can, for example, send a signal to a process (e.g., to stop or kill it) using:
kill(pid_t pid, int sig): Send a signal to a process or group of processes.
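A short sketch of kill in use: the parent creates a child that just waits, sends it a signal, and then uses waitpid to find out how the child died. The function name is made up, and the sketch is only meant for signals whose default action terminates the process (such as SIGTERM or SIGKILL).

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sleeps forever, send it sig,
 * and return the signal that terminated it (-1 on error). */
int terminate_child_with(int sig) {
    pid_t cpid = fork();
    if (cpid == -1)
        return -1;
    if (cpid == 0) {
        for (;;)
            pause(); /* child: sleep until a signal arrives */
    }
    if (kill(cpid, sig) == -1)
        return -1;
    int status;
    if (waitpid(cpid, &status, 0) == -1)
        return -1;
    /* WIFSIGNALED is true when the child was terminated by a signal. */
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```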
Some of these interfaces are actual system calls and are documented in section 2 of the man pages; others are implemented in a library. You can, for example, type man -k exec to find out which man pages relate to exec. If you type man 3 exec you will find a whole series of higher-level functions built on top of execve.
7.2. Process versus Program
It is important to understand the difference between a program and a process. A program is an executable file, generated by a compiler and linker, that can be executed using exec. The exec system call populates the memory regions of the process with information pulled out of that program. There can be many processes, started by many users, running the same program.
To understand how the portions of the executable file are used by the process, consider the following trivial program that runs an infinite loop so we can examine it while it is running.
int main() {
  while(1);
}
If we compile and run this program, we can use the synthetic file /proc/PID/maps to ask Linux what is mapped in the different parts of the process's address space. Note that in bash the variable $! is the PID of the most recently started background process.
Note
Fields when looking at /proc/PID/maps:
address - This is the starting and ending address of the region in the process's address space.
permissions - This describes how pages in the region can be accessed. There are four different permissions: read, write, execute, and shared. If a region is not shared, it is private, so a p will appear instead of an s.
offset - If the region was mapped from a file (using mmap), this is the offset in the file where the mapping begins. If the memory was not mapped from a file, it’s just 0.
device - If the region was mapped from a file, this is the major and minor device number (in hex) where the file lives.
inode - If the region was mapped from a file, this is the inode number of that file.
pathname - If the region was mapped from a file, this is the name of the file. Otherwise, for special regions like the heap, stack… this field identifies the use.
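The fields described in this note can be pulled apart with a single sscanf call. The following is a sketch under the assumption that each line has the layout described above; the struct and function names are hypothetical, and the pathname is limited to 255 characters for simplicity.

```c
#include <stdio.h>

/* One parsed line of /proc/PID/maps (names chosen for illustration). */
struct map_entry {
    unsigned long start, end;          /* address range of the region */
    char perms[5];                     /* e.g. "r-xp" */
    unsigned long offset;              /* offset into the mapped file */
    unsigned int dev_major, dev_minor; /* device, printed in hex */
    unsigned long inode;               /* inode number, 0 if no file */
    char pathname[256];                /* file name or e.g. "[heap]" */
};

/* Returns 1 if the line was parsed, 0 otherwise. The pathname is
 * optional and left empty when the line does not have one. */
int parse_maps_line(const char *line, struct map_entry *e) {
    e->pathname[0] = '\0';
    int n = sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %255[^\n]",
                   &e->start, &e->end, e->perms, &e->offset,
                   &e->dev_major, &e->dev_minor, &e->inode, e->pathname);
    return n >= 7;
}

/* Counts regions whose fourth permission character is 'p' (private).
 * Returns -1 if the file cannot be opened. */
int count_private_mappings(const char *maps_path) {
    FILE *f = fopen(maps_path, "r");
    if (f == NULL)
        return -1;
    char line[512];
    struct map_entry e;
    int count = 0;
    while (fgets(line, sizeof line, f) != NULL)
        if (parse_maps_line(line, &e) && e.perms[3] == 'p')
            count++;
    fclose(f);
    return count;
}
```

On Linux, count_private_mappings("/proc/self/maps") would inspect the calling process itself, since /proc/self refers to whichever process opens it.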
We can see that the first five regions of the process come directly from the file. We can use the objdump utility to print the different sections of the executable file, as shown below, where the second column, Name, describes that part of the file, and the fourth column, VMA, is the virtual memory address at which that section should be loaded. If you scroll through that information, you can see that the executable parts of the program (.init, .plt and .text) are loaded into memory with execute permission. The .rodata section is mapped into memory right after that, and so on. All these regions have the p, or private, flag set in the permissions on the map, meaning that the operating system gives each process its own copy-on-write copy of the information from the file: many processes can run the same program, and any writes they make to the corresponding memory will not be visible to other processes executing the same program.
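The effect of the private flag can be reproduced with mmap, which is how a region with p in its permissions is created. This sketch (the function name and the temporary file template are made up for the example) maps a one-byte file with MAP_PRIVATE, writes through the mapping, and then reads the file again to check whether the write reached it.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write through a private mapping
 * changed the underlying file, 0 if it did not, and -1 on error. */
int private_write_reaches_file(void) {
    char path[] = "/tmp/private_map_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path); /* the file disappears once fd is closed */
    if (write(fd, "A", 1) != 1) {
        close(fd);
        return -1;
    }
    /* MAP_PRIVATE: writes go to a private copy-on-write page. */
    char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    p[0] = 'B'; /* modify only this process's copy */
    char on_disk = 0;
    ssize_t n = pread(fd, &on_disk, 1, 0);
    munmap(p, 1);
    close(fd);
    if (n != 1)
        return -1;
    return on_disk == 'B';
}
```

The function should return 0: the file still contains the original byte, just as a write to a private region of a program does not alter the executable or affect other processes running it. With MAP_SHARED instead, the write would be carried through to the file.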