PROCESS AND THREADS


Page 1: PROCESS AND THREADS. Approaches to Concurrency  Processes  Hard to share resources: Easy to avoid unintended sharing  High overhead in adding/removing

PROCESS AND THREADS

Approaches to Concurrency

Processes
  Hard to share resources: easy to avoid unintended sharing
  High overhead in adding/removing clients

Threads
  Easy to share resources: perhaps too easy
  Medium overhead
  Not much control over scheduling policies
  Difficult to debug: event orderings not repeatable

I/O Multiplexing
  Tedious and low level
  Total control over scheduling
  Very low overhead
  Cannot create as fine-grained a level of concurrency
  Does not make use of multi-core


But first,

What is a process? Some review of the functions we learned (e.g., fork, waitpid)

What is a thread?

Processes

Definition: a process is an instance of a running program.
  One of the most profound ideas in computer science
  Not the same as “program” or “processor”

A process provides each program with two key abstractions:
  Logical control flow
    Each program seems to have exclusive use of the CPU
  Private virtual address space
    Each program seems to have exclusive use of main memory

How are these illusions maintained?
  Process executions are interleaved (multitasking) or run on separate cores
  Address spaces are managed by the virtual memory system

[Figure: process address space, from high to low addresses. Kernel virtual memory (code, data, heap, stack), which is memory invisible to user code; user stack (created at runtime), with %esp as the stack pointer; memory-mapped region for shared libraries; run-time heap (created by malloc), with brk marking its top; read/write segment (.data, .bss) and read-only segment (.init, .text, .rodata), both loaded from the executable file and starting at 0x08048000 (32-bit) or 0x00400000 (64-bit); address 0 at the bottom.]

Concurrent Processes

Two processes run concurrently (are concurrent) if their flows overlap in time
Otherwise, they are sequential
Examples (running on a single core):
  Concurrent: A & B, A & C
  Sequential: B & C

[Figure: control flows of Process A, Process B, and Process C along a time axis]

User View of Concurrent Processes

Control flows for concurrent processes are physically disjoint in time
However, we can think of concurrent processes as running in parallel with each other

[Figure: control flows of Process A, Process B, and Process C along a time axis]

A View of a Process

Process = process context + code, data, and stack

Process context
  Program context: data registers, stack pointer (SP), program counter (PC)
  Kernel context: VM structures, descriptor table, brk pointer

Code, data, and stack (from high to low addresses)
  stack (top marked by SP)
  shared libraries
  run-time heap (top marked by brk)
  read/write data
  read-only code/data (PC points here)
  0

Context Switching

Processes are managed by a shared chunk of OS code called the kernel
  Important: the kernel is not a separate process, but rather runs as part of some user process
Control flow passes from one process to another via a context switch

[Figure: Process A and Process B along a time axis, alternating segments of user code and kernel code; each kernel code segment is labeled as a context switch]

fork: Creating New Processes

pid_t fork(void)
  creates a new process (child process) that is identical to the calling process (parent process)
  returns 0 to the child process
  returns the child’s pid to the parent process
  returns -1 to the caller, and creates no child, on failure

fork is interesting (and often confusing) because it is called once but returns twice

  pid_t pid = fork();
  if (pid == 0) {
      printf("hello from child\n");
  } else {
      printf("hello from parent\n");
  }

Understanding fork

[Figure: animation built from repeated copies of the code below. Process n calls fork, which creates child process m running the same code. In the parent, fork returns pid = m; in the child, it returns pid = 0. The parent prints "hello from parent" and the child prints "hello from child".]

  pid_t pid = fork();
  if (pid == 0) {
      printf("hello from child\n");
  } else {
      printf("hello from parent\n");
  }

Which one is first?

Fork Example #1

  void fork1()
  {
      int x = 1;
      pid_t pid = fork();
      if (pid == 0) {
          printf("Child has x = %d\n", ++x);
      } else {
          printf("Parent has x = %d\n", --x);
      }
      printf("Bye from process %d with x = %d\n", getpid(), x);
  }

Parent and child both run the same code
  Distinguish parent from child by the return value from fork
Start with the same state, but each has a private copy
  Including shared output file descriptor
  Relative ordering of their print statements is undefined

Fork Example #2

  void fork2()
  {
      printf("L0\n");
      fork();
      printf("L1\n");
      fork();
      printf("Bye\n");
  }

Both parent and child can continue forking

[Figure: process graph. L0 is printed once, L1 twice, and Bye four times.]

Fork Example #2

  void fork4()
  {
      printf("L0\n");
      if (fork() != 0) {
          printf("L1\n");
          if (fork() != 0) {
              printf("L2\n");
              fork();
          }
      }
      printf("Bye\n");
  }

Both parent and child can continue forking

[Figure: process graph. L0, L1, and L2 are each printed once, and Bye four times.]

exit: Ending a process

void exit(int status)
  exits a process
    Normally return with status 0
  atexit() registers functions to be executed upon exit

  void cleanup(void)
  {
      printf("cleaning up\n");
  }

  void fork6()
  {
      atexit(cleanup);
      fork();
      exit(0);
  }

File Descriptors and Descriptor Table

[Figure: three kernel data structures. The descriptor table (one table per process) has entries fd 0 (stdin), fd 1 (stdout), fd 2 (stderr), fd 3, and fd 4. Descriptors point to entries in the open file table (shared by all processes); the entries for File A and File B each hold a file position and refcnt=1. Each open file table entry points to an entry in the v-node table (shared by all processes) holding file access, file size, and file type.]

After fork.

[Figure: descriptor tables after fork. The parent's table and the child's table each have entries fd 0 through fd 4, and both tables point to the same open file table entries (shared by all processes) for File A and File B, each with a file position and refcnt=2. The open file table entries point to v-node table entries (shared by all processes) holding file access, file size, and file type.]

File sharing: e.g., opening the same file twice.

[Figure: two descriptors in the descriptor table (one table per process) point to separate open file table entries, File A and File B, each with its own file position and refcnt=1. Both open file table entries refer to a single v-node table entry (file access, file size, file type).]

Zombies

Idea
  When a process terminates, it still consumes system resources
    Various tables maintained by the OS
  Called a “zombie”
    Living corpse, half alive and half dead

Reaping
  Performed by the parent on a terminated child
  Parent is given exit status information
  Kernel discards the process

What if the parent doesn’t reap?
  If any parent terminates without reaping a child, then the child will be reaped by the init process
  So, explicit reaping is needed only in long-running processes
    e.g., shells and servers

Zombie Example

  void fork7()
  {
      if (fork() == 0) {
          /* Child */
          printf("Terminating Child, PID = %d\n", getpid());
          exit(0);
      } else {
          printf("Running Parent, PID = %d\n", getpid());
          while (1)
              ; /* Infinite loop */
      }
  }

  linux> ./forks 7 &
  [1] 6639
  Running Parent, PID = 6639
  Terminating Child, PID = 6640
  linux> ps
    PID TTY          TIME CMD
   6585 ttyp9    00:00:00 tcsh
   6639 ttyp9    00:00:03 forks
   6640 ttyp9    00:00:00 forks <defunct>
   6641 ttyp9    00:00:00 ps
  linux> kill 6639
  [1]    Terminated
  linux> ps
    PID TTY          TIME CMD
   6585 ttyp9    00:00:00 tcsh
   6642 ttyp9    00:00:00 ps

ps shows the child process as “defunct”
Killing the parent allows the child to be reaped by init

wait: Synchronizing with Children

pid_t wait(int *child_status)
  suspends the current process until one of its children terminates
  return value is the pid of the child process that terminated
  if child_status != NULL, then the object it points to will be set to a status indicating why the child process terminated

wait: Synchronizing with Children

  void fork9()
  {
      int child_status;

      if (fork() == 0) {
          printf("HC: hello from child\n");
      } else {
          printf("HP: hello from parent\n");
          wait(&child_status);
          printf("CT: child has terminated\n");
      }
      printf("Bye\n");
      exit(0);
  }

[Figure: process graph. The parent prints HP, waits for the child, then prints CT and Bye; the child prints HC and Bye.]

wait() Example

If multiple children have completed, wait will take them in arbitrary order
Can use macros WIFEXITED and WEXITSTATUS to get information about exit status

  void fork10()
  {
      pid_t pid[N];
      int i;
      int child_status;

      for (i = 0; i < N; i++)
          if ((pid[i] = fork()) == 0)
              exit(100+i); /* Child */

      for (i = 0; i < N; i++) {
          pid_t wpid = wait(&child_status);
          if (WIFEXITED(child_status))
              printf("Child %d terminated with exit status %d\n",
                     wpid, WEXITSTATUS(child_status));
          else
              printf("Child %d terminated abnormally\n", wpid);
      }
  }

waitpid(): Waiting for a Specific Process

waitpid(pid, &status, options)
  suspends the current process until a specific process terminates

  void fork11()
  {
      pid_t pid[N];
      int i;
      int child_status;

      for (i = 0; i < N; i++)
          if ((pid[i] = fork()) == 0)
              exit(100+i); /* Child */

      for (i = N-1; i >= 0; i--) {
          pid_t wpid = waitpid(pid[i], &child_status, 0);
          if (WIFEXITED(child_status))
              printf("Child %d terminated with exit status %d\n",
                     wpid, WEXITSTATUS(child_status));
          else
              printf("Child %d terminated abnormally\n", wpid);
      }
  }

Summary

Processes
  At any given time, the system has multiple active processes
  Only one can execute at a time on a single core, though
  Each process appears to have total control of the processor + a private memory space

Summary (cont.)

Spawning processes
  Call fork
  One call, two returns

Process completion
  Call exit
  One call, no return

Reaping and waiting for processes
  Call wait or waitpid

Approaches to Concurrency

Processes
  Hard to share resources: easy to avoid unintended sharing
  High overhead in adding/removing clients

Threads
  Easy to share resources: perhaps too easy
  Medium overhead
  Not much control over scheduling policies
  Difficult to debug: event orderings not repeatable

I/O Multiplexing
  Tedious and low level
  Total control over scheduling
  Very low overhead
  Cannot create as fine-grained a level of concurrency
  Does not make use of multi-core

Approach #3: Multiple Threads

Very similar to approach #1 (multiple processes), but with threads instead of processes

Traditional View of a Process

Process = process context + code, data, and stack

Process context
  Program context: data registers, condition codes, stack pointer (SP), program counter (PC)
  Kernel context: VM structures, descriptor table, brk pointer

Code, data, and stack (from high to low addresses)
  stack (top marked by SP)
  shared libraries
  run-time heap (top marked by brk)
  read/write data
  read-only code/data (PC points here)
  0

Alternate View of a Process

Process = thread + code, data, and kernel context

Thread (main thread)
  stack (top marked by SP)
  Thread context: data registers, condition codes, stack pointer (SP), program counter (PC)

Code and data
  shared libraries
  run-time heap (top marked by brk)
  read/write data
  read-only code/data (PC points here)
  0

Kernel context: VM structures, descriptor table, brk pointer

A Process With Multiple Threads

Multiple threads can be associated with a process
  Each thread has its own logical control flow
  Each thread shares the same code, data, and kernel context
    Share common virtual address space (inc. stacks)
  Each thread has its own thread id (TID)

Thread 1 (main thread)
  stack 1
  Thread 1 context: data registers, condition codes, SP1, PC1

Thread 2 (peer thread)
  stack 2
  Thread 2 context: data registers, condition codes, SP2, PC2

Shared code and data
  shared libraries
  run-time heap
  read/write data
  read-only code/data
  0

Kernel context: VM structures, descriptor table, brk pointer

Logical View of Threads

Threads associated with a process form a pool of peers
  Unlike processes, which form a tree hierarchy

[Figure: two panels. "Threads associated with process foo" shows T1 through T5 around shared code, data, and kernel context. "Process hierarchy" shows a tree containing P0, P1, sh, sh, sh, foo, and bar.]

Thread Execution

Single-core processor
  Simulates concurrency by time slicing

Multi-core processor
  Can have true concurrency

[Figure: two timelines of Thread A, Thread B, and Thread C, one per case; the multi-core panel runs 3 threads on 2 cores]

Logical Concurrency

Two threads are (logically) concurrent if their flows overlap in time
Otherwise, they are sequential

Examples:
  Concurrent: A & B, A & C
  Sequential: B & C

[Figure: control flows of Thread A, Thread B, and Thread C along a time axis]

Threads vs. Processes

How threads and processes are similar
  Each has its own logical control flow
  Each can run concurrently with others (possibly on different cores)
  Each is context switched

How threads and processes are different
  Threads share code and some data
    Processes (typically) do not
  Threads are somewhat less expensive than processes
    Process control (creating and reaping) is twice as expensive as thread control
    Linux numbers:
      ~20K cycles to create and reap a process
      ~10K cycles (or less) to create and reap a thread

POSIX Threads (Pthreads) Interface

Pthreads: standard interface for ~60 functions that manipulate threads from C programs
  Creating and reaping threads
    pthread_create()
    pthread_join()
  Determining your thread ID
    pthread_self()
  Terminating threads
    pthread_cancel()
    pthread_exit()
    exit() [terminates all threads], RET [terminates current thread]
  Synchronizing access to shared variables
    pthread_mutex_init
    pthread_mutex_[un]lock
    pthread_cond_init
    pthread_cond_[timed]wait

The Pthreads "hello, world" Program

  /*
   * hello.c - Pthreads "hello, world" program
   */
  #include "csapp.h"

  void *thread(void *vargp);

  int main()
  {
      pthread_t tid;

      Pthread_create(&tid, NULL, thread, NULL);
      Pthread_join(tid, NULL);
      exit(0);
  }

  /* thread routine */
  void *thread(void *vargp)
  {
      printf("Hello, world!\n");
      return NULL;
  }

Annotations
  Pthread_create, second argument: thread attributes (usually NULL)
  Pthread_create, fourth argument: thread arguments (void *p)
  Pthread_join, second argument: return value (void **p)

Execution of Threaded “hello, world”

[Figure: timeline of the main thread and the peer thread. Main thread: call Pthread_create(); Pthread_create() returns; call Pthread_join(); main thread waits for peer thread to terminate; Pthread_join() returns; exit() terminates the main thread and any peer threads. Peer thread: printf(); return NULL; (peer thread terminates).]

Thread-Based Concurrent Echo Server

  int main(int argc, char **argv)
  {
      int port = atoi(argv[1]);
      struct sockaddr_in clientaddr;
      int clientlen = sizeof(clientaddr);
      pthread_t tid;

      int listenfd = Open_listenfd(port);
      while (1) {
          int *connfdp = Malloc(sizeof(int));
          *connfdp = Accept(listenfd, (SA *) &clientaddr, &clientlen);
          Pthread_create(&tid, NULL, echo_thread, connfdp);
      }
  }

Spawn a new thread for each client
  Pass it a copy of the connection file descriptor
  Note use of Malloc()!
    Without corresponding Free()

Thread-Based Concurrent Server (cont)

  /* thread routine */
  void *echo_thread(void *vargp)
  {
      int connfd = *((int *)vargp);
      Pthread_detach(pthread_self());
      Free(vargp);
      echo(connfd);
      Close(connfd);
      return NULL;
  }

Run thread in “detached” mode
  Runs independently of other threads
  Reaped when it terminates
Free storage allocated to hold clientfd
  “Producer-Consumer” model

Threaded Execution Model

Multiple threads within a single process
Some state shared between them
  File descriptors

[Figure: a listening server thread receives connection requests; the Client 1 server thread exchanges Client 1 data and the Client 2 server thread exchanges Client 2 data]

Potential Form of Unintended Sharing

  while (1) {
      int connfd = Accept(listenfd, (SA *) &clientaddr, &clientlen);
      Pthread_create(&tid, NULL, echo_thread, (void *) &connfd);
  }

[Figure: the main thread stack holds connfd; vargp on the Peer1 stack and vargp on the Peer2 stack both point to it. The main thread sets connfd = connfd1 and later connfd = connfd2, while peer1 and peer2 each execute connfd = *vargp. Race!]

Why would both copies of vargp point to the same location?

Could this race occur?

Main

  int i;
  for (i = 0; i < 100; i++) {
      Pthread_create(&tid, NULL, thread, &i);
  }

Thread

  void *thread(void *vargp)
  {
      int i = *((int *)vargp);
      Pthread_detach(pthread_self());
      save_value(i);
      return NULL;
  }

Race Test
  If no race, then each thread would get a different value of i
  Set of saved values would consist of one copy each of 0 through 99

Experimental Results

[Figure: three histograms, labeled No Race, Multicore server, and Single core laptop, each with a horizontal axis running from 0 to 99]

The race can really happen!

Issues With Thread-Based Servers

Must run “detached” to avoid memory leak.
  At any point in time, a thread is either joinable or detached.
  Joinable thread can be reaped and killed by other threads.
    Must be reaped (with pthread_join) to free memory resources.
  Detached thread cannot be reaped or killed by other threads.
    Resources are automatically reaped on termination.
  Default state is joinable.
    Use pthread_detach(pthread_self()) to make detached.

Must be careful to avoid unintended sharing.
  For example, passing a pointer to the main thread’s stack:
    Pthread_create(&tid, NULL, thread, (void *)&connfd);

All functions called by a thread must be thread-safe.
  Stay tuned

Pros and Cons of Thread-Based Designs

+ Easy to share data structures between threads
    e.g., logging information, file cache.
+ Threads are more efficient than processes.
– Unintentional sharing can introduce subtle and hard-to-reproduce errors!
    The ease with which data can be shared is both the greatest strength and the greatest weakness of threads.
    Hard to know which data is shared and which is private
    Hard to detect by testing
      Probability of bad race outcome very low
      But nonzero!
    Future lectures