---
title: Processes
subtitle: Lecture 1, course in operating systems
tag: EDA093
...

- We run several processes at the same time 
	- false, it feels like we run several processes at the same time
- One CPU can only run one program at the same time
	- false, each CPU core can only run one process at the same time
    
# Context Switching

Is about saving information about the current process, and then
loading new information about the new process in order to
switch from executing / running one to the other.

We save the information about the processes into something called
the process context block (PCB). This is what we load from and to.
There are two primary states a process can be in, which is either
executing or idleing. The scheduler (process that determines the
process schedule, when what process should run or idle) can only pick
idle proccess, not processes that are already executing.

## Process Concept

A process is a program in execution, process execution progresses in
a sequential fashion. A program is something passive stored on disk,
it is in some way the blueprint for what the process should be like.
The blueprint (program) can create multiple processes, not just one.

## Process Memory Layout

![Process Memory Layout](/postfiles/tirK8/img1.svg)

The memory of the process consists of multiple parts.

- Text Section --- Where the machine instructions are stored.
- Data Section --- Where the global variables are stored.
- Heap --- Where dynamically allocated memory is stored.
- Stack --- Where temporary data is stored like function parameters, return addresses etc.

## Diagram of Process State

![Process State Transition Diagram](/postfiles/tirK8/img2.svg)

When a process is executing, depending on what instructions it executes,
it may change the state its in from the viewpoint of the operating system.
The possible states it can be in are the following.

- New --- The process is in the process of being created.
- Ready --- The process has been created and is wating to be assigned to a processor to start executing.
- Running --- The instructions which the program specified for the process are being executed.
- Waiting --- The process executed some blocking instruction, like an I/O operation, thus it is waiting for that event to occur / complete.
- Terminated --- The process stopped executing.

While you wait for things like I/O operations, the operating system may run
other things in the meantime, this is to save time instead of just doing nothing.

## Context Switching and the Process Context Block

When a CPU switches between executing different processes, the operating system
must in some way save the state the process was in at the instant it was switched away
from in order to properly resume it as if it just kept going. The context is
saved in the so called PCB. Context-switching is a time costly procedure so the operating
system should want to minimize the number of said switches. Context-switching
speed may also vary a lot on hardware, where some hardware provides multiple
set of registers on the CPU in order to accomodate multiple context switches
at once.

One very important example of information that the PCB stores is the
program counter (PC) of the process. This to know the next instruction
the process should execute.

## Process Scheduling

The datastructures you pick has a lot to do with the time and space
complexity of the algorithm that is using the datastructure. Usually,
you want to optimize the datastructure in accordance with how the algorithm
accesses the data. The question then becomes, what is the datastructure of the PCB?

You could try to have one list of all the PCBs and perform linear search
on the list to find PCBs that state that the process is ready and then
execute them. However, in practice this would be very inefficient and slow.

What you do instead is that you have multiple "lists" which are essentially
queues.

## Ready Queue and Various I/O Device Queues

Having a queue removes the need to search, you just pick what's up
next in the queue. You also use multiple organized lists / queues
that are for things like waiting lines to different I/O devices etc.

Note: There can be multiple PCBs reffering to the same process
present in multiple queues at once. However, the smallest granularity
is for one thread. That is, a thread, may not be in multiple queues
at the same time.

## The Scheduler

The scheduler is a process that belongs to the operating system. There
are multiples types of schedulers.

- Short-term scheduler / CPU scheduler
	- Selects which processes should be executed next and allocates
    CPU for said processes
    - Invoked very frequently (milliseconds), therefore must be very efficient / fast
- Long-term scheduler / job scheduler
	- Invoked pretty infrequently (seconds), does not need to be very optimized
    - Controls the degree of multiprogramming
    
# Questions

**The instruction to be run next when process
X is scheduled is maintained in?**

1. The text section
2. The data section
3. The process control block

Well, you could say that its maintained in the PCB because the PCB is
what stores the program counter which tells you which instruction to
execute. However, the text section is what stores the actual instruction
itself.

**Which of the following state switch is not
valid?**

1. Ready $\rightarrow$ Running
2. Running $\rightarrow$ Ready
3. Running $\rightarrow$ Waiting
4. Ready $\rightarrow$ Waiting

Number 4 is not valid since to be Ready means that the process was just
created and has not entered a state of Running / Executing yet. Therefore it
could not have executed an instruction which would have transitioned into
a Waiting state.

**During a context switch, what is saved by
the OS (to be later re-loaded) is?**

1. The PCB of the process
2. The PCB and the memory of the process
3. The memory of the process

Number 1, the PCB, it stores all the necessary data in order to
perform a context switch.

## Operations on Processes

The operating system must provide basic operations such as

- Process creation
- Process termination
- ...

These by themselves are simple concepts, the complexity comes from the
fact that you need a process to run in the first place (operating system)
to create a new process, but how does the switch happen? How do you
hand execution to another process and then switch back to the operating system
in order to switch to another process.

### Process Creation

Processes are organized in a tree strucuture, that is, there exists
a root process from which everything else branches from. Parent processes
create child processes, which in turn, become parents to their children.
A process is identified by its PID (process id). These parent and child processes
have resources, but they are different processes, so one must decide on some
schema for how the resources (memory, file descriptors etc) are shared. The options
you could pick between are

- Parent and child share all resources
- Children share a subset of the parents resources
- Parent and child share no resources at all

There are options you could pick between regarding how parent and child
execute their instructions.

- Parent and children execute concurrently
- Parent wait (blocks) until child terminates.

As said before, processes are organized in a tree structure. If you were
to run the command `ps -el` or `pstree -el` you would get something like this.

```txt
XXXXXXXXXX:~$ pstree -pl
init(1)-+-NetworkManager(1215)-+-dhclient(3386)
| |-dnsmasq(3390)
| |-{NetworkManager}(1224)
| `-{NetworkManager}(2164)
|-accounts-daemon(1527)---{accounts-daemon}(1530)
|-acpid(1360)
|-apache2(16861)-+-apache2(2458)
| |-apache2(2460)
| |-apache2(2461)
| |-apache2(2462)
| |-apache2(2463)
| |-apache2(2464)
| |-apache2(4079)
| |-apache2(4082)
| |-apache2(5464)
| |-apache2(5465)
| `-apache2(5466)
|-at-spi-bus-laun(3570)-+-dbus-daemon(3576)
...
```

which prints out the process hierarchy.

#### Forking

The `fork()` system call creates new proccess by copying / duplicating the proccess
which called `fork()`. The copy and original process are identical in almost every
single way, but they are still seperate processes and treated as completely different
by the operating system. Forking will make the process making the system
call go into waiting, this because it needs to wait for the operating system
to duplicate itself before it can continue executing.

Once done, the process is copied and the original process executes
as if nothing happened.

```c
#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>

int main() {
	pid_t pid;
   	
    /* fork the execution of the process here */
    pid = fork();
   	
    if(pid < 0) {
        /* pid is zero or a positive integer, if its less than one,
           something went wrong
         */
    	printf("fork failed\n");
        return EXIT_FAILURE;
    } else if(pid == 0) {
   		/* the copy of the process which has the value of 0 for the
           pid variable is the child process, we check if we are the
           child, if we are, execute something
        */
    	execlp("/bin/ls", "ls", NULL);
    } else {
    	/* if the pid is not zero, then we are the parent, in which
           case we simply in this case wait for the child do complete
           whatever thing it wanted to do
        */
    	wait(NULL);
        printf("child complete\n");
    }
    
    return 0;
}
```

Just as a note, the process id for the child is not actually `0` from
the perspective of the operating system or the parent process. The value for
`pid` from the perspective of the parent is the actual `pid` of the child. However,
from the perspective of the child its `0`, this is just so that the child can know
that it is the child.

### Process Termination

The operating system essentially just decides that the process should
no longer execute instructions and therefore cleans up all the data
related to the process and stop executing it. Some operating systems
do not allow that child processes of a terminated parent process to 
keep executing. This is called cascading termination, terminating the
parent consequently terminates all processes it has spawned. The parent
process may use the `wait()` system call in order to wait for all child
processes to terminate before terminating. This system call returns status
information and the pid of the terminated process.

#### Zombies

A child can become a "zombie" when the parent does not `wait()`
to reap the child process.

#### Orphans

An orphan process is when the parent process has terminated
before the child process and the child process keeps on executing.

# Multiprocess Architecture

Many web-browsers like Chrome use a multiprocess architecture where
each tab in the browser is its own independent process. This is
somewhat of a perfect usecase of `fork()` because each tab is indentical
when created but become different as time goes on. The big advantage
to multiprocess architecture is seen in the browser, that is, if one
tab were to crash / freeze then none of the other tabs would be affected.
If they were all running under one process, then one tab could mess all of the
other ones upp. It can also potentially introduce security issues.

# Interprocess Communication (IPC)

There are two main types of interprocess communication.

- Message Passing (via the operating system)
	- More reliable, managed by the operating system (assumed to be reliable)
    - Takes more time, is slower, requires system calls
- Shared Memory
	- Faster, does not require system calls
    - Less secure, a lot depends on the programmer, synchronization issues could occur etc.
    
## Pipes

One very common type of IPC is the pipe.

```txt
ls | grep
```

Here we create a pipe from `ls` to `grep` by putting `|` inbetween.
The stdout (standard output) from `ls` is `pipe`ed into the stdin (standard input)
of the `grep` command.

## Example : Ordinary Pipes

```c
int main(void) {
	char write_msg[BUFFER_SIZE] = "Greetings";
    char read_msg[BUFFER_SIZE];
    
    pid_t pid;
    
    int fd[2];
     
    if(pipe(fd) == -1) {
    	printf("pipe could not be created\n");
        return EXIT_FAILURE;
    }
    
    pid = fork();
    
    if(pid < 0) {
    	printf("fork failed");
        return EXIT_FAILURE;
    } else if(pid > 0) {
    	close(fd[READ_END]);
        write(fd[WRITE_END], write_msg, strlen(write_msg)+1);
        close(fd[WRITE_END]);
    } else {
    	close(fd[WRITE_END]);
        read(fd[READ_END], read_msg, BUFFER_SIZE);
        printf("read: %s\n", read_msg);
        close(fd[READ_END]);
    }
    
    return 0;
}
```

## Questions

```c
pid = fork();

if(pid < 0) {
	// instruction 1
} else if(pid == 0) {
	// instruction 2
} else {
	// instruction 3
}
```


**Which of the following affirmation is false?**

1. Instruction 1 can be executed by the parent and the child process
2. Instruction 2 can be executed by the parent process
3. Instruction 3 can be executed by the child process

Option 1 is false because you enter into `if(pid < 0)` it means that
the fork failed, so there is no child process. Option 2 is false because
`pid == 0` would only be true if the child was executing the code, not the parent.
Option 3 is true.

**Which type of pipes requires a parent-child
relation between two processes?**

Ordinary, named pipes are for processes that do not "inherently know of eachother".

```c
#include <stdio.h>
#include <unistd.h>

int main() {
    for(int i = 0; i < 4; i++) {
    	fork();
    }
    
    return 0;
}
```

**Including the initial parent process, how many processes are created by this code?**

16 processes are created, since each fork will also start forking.
