Synchronization and Mutual Exclusion
Atomic Operations: operations that either run to completion or have no effect at all; they can never be observed half-done.
Many instructions are not atomic: for example, a double-precision floating-point store is often not atomic.
Synchronization: using atomic operations to ensure cooperation between threads.
Mutual Exclusion: ensuring that only one thread does a particular thing at a time.
Race Condition: two threads access the same data concurrently and at least one of the accesses is a write, so the outcome depends on timing (see the sketch below).
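A minimal illustration in C, assuming a POSIX threads environment (the counter and iteration count are arbitrary): two threads increment a shared variable without synchronization, and updates are lost because counter++ is a non-atomic read-modify-write.

```c
#include <pthread.h>
#include <stdio.h>

long counter = 0;                        /* shared data, no synchronization */

void *worker(void *arg) {
    for (int i = 0; i < 1000000; i++)
        counter++;                       /* non-atomic read-modify-write */
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter);  /* expected 2000000, usually less */
    return 0;
}
```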
Process-level Approaches
The two algorithms introduced in this section do not rely on programming-language or operating-system support.
Dekker's Algorithm
Process 0 checks the intention (flag) of the other process, process 1.
If 1 also wants to enter the critical section:
0 checks the turn variable. If it is 1's turn, then 0 withdraws its claim (clears its own flag) and waits (spins) until the turn changes. If it is not 1's turn, then 1 will withdraw its own request to enter the critical section, allowing 0 to proceed.
Once inside the critical section, the process will eventually exit, and the turn is given to the other process, ensuring fairness.
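A sketch of Dekker's algorithm corresponding to the description above; the names want, turn, and enter_region/leave_region are illustrative, and on modern hardware these plain accesses would additionally need memory fences or C11 atomics.

```c
#include <stdbool.h>

/* Dekker's algorithm for two processes, i = 0 or 1. */
volatile bool want[2] = { false, false };  /* want[i]: process i wants to enter   */
volatile int  turn    = 0;                 /* who is let in when both want access */

void enter_region(int i) {
    int other = 1 - i;
    want[i] = true;                        /* declare our intention                */
    while (want[other]) {                  /* the other process also wants in      */
        if (turn == other) {
            want[i] = false;               /* withdraw our claim ...               */
            while (turn == other)
                ;                          /* ... and spin until the turn changes  */
            want[i] = true;                /* re-assert our claim                  */
        }
    }
}

void leave_region(int i) {
    turn    = 1 - i;                       /* give the turn to the other process   */
    want[i] = false;
}
```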
Peterson's Algorithm
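Peterson's algorithm simplifies Dekker's idea: each process declares interest and then gives the turn to the other, waiting only while the other process is interested and holds the turn. A minimal sketch with the usual textbook names flag and turn (again, real hardware would need memory fences or atomics):

```c
#include <stdbool.h>

/* Peterson's algorithm for two processes, i = 0 or 1. */
volatile bool flag[2] = { false, false };  /* flag[i]: process i wants to enter */
volatile int  turn    = 0;                 /* whose turn it is to yield         */

void enter_region(int i) {
    int other = 1 - i;
    flag[i] = true;                        /* declare interest                  */
    turn    = other;                       /* politely let the other go first   */
    while (flag[other] && turn == other)
        ;                                  /* spin only while the other is
                                              interested and has the turn       */
}

void leave_region(int i) {
    flag[i] = false;                       /* withdraw interest                 */
}
```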
Hardware-level Approaches
Two commonly implemented instructions:
Test and Set Instruction
Exchange Instruction
We use the C11 atomic_flag_test_and_set function here to implement a spin lock. The test&set operation atomically reads the value at a memory location (bolt in this case), sets it to 1, and returns the old value.
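A minimal sketch of such a spin lock using C11's atomic_flag; the variable name bolt follows the text, while the lock/unlock wrappers are illustrative:

```c
#include <stdatomic.h>

static atomic_flag bolt = ATOMIC_FLAG_INIT;   /* clear = unlocked */

void lock(void) {
    /* atomic_flag_test_and_set atomically sets the flag and returns the
       previous value: false means it was clear and we now own the lock. */
    while (atomic_flag_test_and_set(&bolt))
        ;                                     /* busy wait (spin) */
}

void unlock(void) {
    atomic_flag_clear(&bolt);                 /* release the lock */
}
```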
In x86, the test&set operation can be implemented using XCHG, which is an atomic instruction that performs an exchange.
Semaphore
A semaphore is a variable that holds an integer value, with three operations defined on it:
Init: sets the semaphore to a non-negative initial value.
Down (also called P or Wait): an atomic operation that waits for the semaphore to become positive, then decrements it by 1.
Up (also called V, Signal, or Post): an atomic operation that increments the semaphore by 1, waking up a waiting P, if any.
In COMP130110.03, binary semaphores are treated as a special type of semaphore. When we refer to semaphores, we mean counting semaphores rather than binary semaphores.
Strong semaphore: the process that has been blocked the longest is released from the queue first.
Weak semaphore: doesn't specify the order in which processes are removed from the queue.
The encapsulated variable val of a semaphore:
val >= 0: the number of processes that can execute wait without blocking.
val < 0: the magnitude of val is the number of processes blocked in the queue.
Producer/Consumer Problem
The producer/consumer problem with a finite buffer can be solved with two counting semaphores that track the empty and filled slots (plus a binary semaphore protecting the buffer when there are multiple producers or consumers).
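For concreteness, a sketch of this solution assuming the common textbook layout: counting semaphores empty and full plus a binary semaphore mutex, written here with POSIX semaphores (the buffer size N and all function names are illustrative). The labels (a)–(d) are referenced in the notice below.

```c
#include <semaphore.h>

#define N 16                      /* buffer capacity */
int buffer[N];
int in = 0, out = 0;

sem_t empty;                      /* counts free slots,   initialized to N */
sem_t full;                       /* counts filled slots, initialized to 0 */
sem_t mutex;                      /* protects the buffer, initialized to 1 */

void init(void) {
    sem_init(&empty, 0, N);
    sem_init(&full, 0, 0);
    sem_init(&mutex, 0, 1);
}

void produce(int item) {
    sem_wait(&empty);             /* (a) wait for a free slot      */
    sem_wait(&mutex);             /* (b) then lock the buffer      */
    buffer[in] = item;
    in = (in + 1) % N;
    sem_post(&mutex);             /* (c) unlock the buffer         */
    sem_post(&full);              /* (d) announce the new item     */
}

int consume(void) {
    sem_wait(&full);              /* wait for an item              */
    sem_wait(&mutex);
    int item = buffer[out];
    out = (out + 1) % N;
    sem_post(&mutex);
    sem_post(&empty);             /* announce the freed slot       */
    return item;
}
```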
NOTICE (COMP130110Final@FDU)
It's OK to switch the two signal (post) lines, e.g. (c) and (d) in the sketch above.
It will result in a deadlock if we switch the two wait lines, i.e. taking the mutex (b) before waiting on the counting semaphore (a): a producer could then block on empty while holding the mutex, so no consumer could ever run and free a slot.
Solving with Condition Variables
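A sketch of the bounded buffer using a mutex and two condition variables, assuming a pthreads-style interface (the names not_full and not_empty are illustrative); the discussion below refers to this structure.

```c
#include <pthread.h>

#define N 16
static int buffer[N];
static int count = 0, in = 0, out = 0;

static pthread_mutex_t lock      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

void produce(int item) {
    pthread_mutex_lock(&lock);
    while (count == N)                       /* while, not if: re-check on wakeup */
        pthread_cond_wait(&not_full, &lock);
    buffer[in] = item;
    in = (in + 1) % N;
    count++;
    pthread_cond_signal(&not_empty);         /* wake one waiting consumer */
    pthread_mutex_unlock(&lock);
}

int consume(void) {
    pthread_mutex_lock(&lock);
    while (count == 0)
        pthread_cond_wait(&not_empty, &lock);
    int item = buffer[out];
    out = (out + 1) % N;
    count--;
    pthread_cond_signal(&not_full);          /* wake one waiting producer */
    pthread_mutex_unlock(&lock);
    return item;
}
```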
Discussions:
Why do we need two condition variables? With multiple producers and consumers, using a single condition variable can result in a consumer waking up another consumer, which is not what we want.
Why do we use while loops instead of ifs? The signal function only wakes a thread up and puts it in the ready queue; it is entirely possible that the world has changed between the thread being woken and actually running. Using while lets the thread re-check the condition before proceeding. It also guards against spurious wakeups.
Barber Shop Problem
A customer will not enter the shop if it is already filled to capacity with other customers.
Once inside, the customer takes a seat on the sofa or stands if the sofa is filled.
When a barber is free, the customer that has been on the sofa the longest is served.
When a customer's haircut is finished, payment is accepted for one customer at a time.
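The full solution with the sofa, standing room, and payment is fairly long; the sketch below shows only the core synchronization for a simplified single-barber shop with a bounded waiting area, using POSIX semaphores (all names are illustrative). Serving the longest-waiting customer first additionally assumes a strong semaphore.

```c
#include <semaphore.h>

#define CAPACITY 5                /* maximum number of waiting customers        */

sem_t customers;                  /* waiting customers,             init 0      */
sem_t barber_ready;               /* barber ready for the next cut, init 0      */
sem_t mutex;                      /* protects 'waiting',            init 1      */
int waiting = 0;

void *customer(void *arg) {
    sem_wait(&mutex);
    if (waiting == CAPACITY) {    /* shop is full: leave without a haircut      */
        sem_post(&mutex);
        return NULL;
    }
    waiting++;
    sem_post(&customers);         /* tell the barber someone is waiting         */
    sem_post(&mutex);
    sem_wait(&barber_ready);      /* wait to be called to the barber chair      */
    /* ... get haircut, then pay ... */
    return NULL;
}

void *barber(void *arg) {
    for (;;) {
        sem_wait(&customers);     /* sleep until a customer arrives             */
        sem_wait(&mutex);
        waiting--;
        sem_post(&barber_ready);  /* call the next (longest-waiting) customer   */
        sem_post(&mutex);
        /* ... cut hair ... */
    }
}
```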
Readers/Writers Problem
Problem: readers have priority here. Readers can enter the critical section without blocking even if a large number of writers are waiting on wsem. Writers are therefore subject to starvation.
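For reference, a sketch of the readers-have-priority solution being described; wsem follows the text, while x and readcount are the usual textbook names:

```c
#include <semaphore.h>

sem_t wsem;                       /* access to the shared data, init 1 */
sem_t x;                          /* protects readcount,        init 1 */
int readcount = 0;

void reader(void) {
    sem_wait(&x);
    readcount++;
    if (readcount == 1)           /* first reader locks out the writers */
        sem_wait(&wsem);
    sem_post(&x);

    /* ... read ... */

    sem_wait(&x);
    readcount--;
    if (readcount == 0)           /* last reader lets the writers in    */
        sem_post(&wsem);
    sem_post(&x);
}

void writer(void) {
    sem_wait(&wsem);              /* writers wait here; as long as readers keep
                                     arriving, wsem is never released: starvation */
    /* ... write ... */
    sem_post(&wsem);
}
```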
Optimized Solution: add a writecount on the writer side; the first waiting writer then blocks newly arriving readers.
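A sketch of this writer-priority variant, assuming the classic textbook (Courtois-style) solution; writecount follows the text, while rsem, x, y, and z are the usual auxiliary names:

```c
#include <semaphore.h>

sem_t wsem, rsem;                 /* init 1; rsem blocks new readers once a writer waits */
sem_t x, y;                       /* init 1; protect readcount / writecount              */
sem_t z;                          /* init 1; at most one reader queues on rsem           */
int readcount = 0, writecount = 0;

void reader(void) {
    sem_wait(&z);
    sem_wait(&rsem);
    sem_wait(&x);
    readcount++;
    if (readcount == 1) sem_wait(&wsem);
    sem_post(&x);
    sem_post(&rsem);
    sem_post(&z);

    /* ... read ... */

    sem_wait(&x);
    readcount--;
    if (readcount == 0) sem_post(&wsem);
    sem_post(&x);
}

void writer(void) {
    sem_wait(&y);
    writecount++;
    if (writecount == 1) sem_wait(&rsem);   /* first waiting writer blocks new readers */
    sem_post(&y);

    sem_wait(&wsem);
    /* ... write ... */
    sem_post(&wsem);

    sem_wait(&y);
    writecount--;
    if (writecount == 0) sem_post(&rsem);   /* last writer lets readers back in */
    sem_post(&y);
}
```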
This solves the issue that writers may starve. However, readers may now starve instead. Nevertheless, it is still preferable to the initial solution.
Implementation of Semaphores
Disabling Interrupts: feasible on a uniprocessor system. On a multiprocessor system, it is impractical to disable interrupts on every core, and disabling them on one core does not stop the other cores from touching shared data.
Software algorithms: Dekker and Peterson
Semaphore Implementation by test&set instruction: the semaphore now includes a new integer component s.flag, which controls access to the semaphore.
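A sketch of how s.flag might be used: the flag is a busy-wait guard that makes the semaphore's own bookkeeping atomic, while count and a queue of blocked processes implement blocking. test_and_set, enqueue_and_block, dequeue_and_wake, and struct proc_queue are hypothetical primitives.

```c
struct proc_queue;                            /* hypothetical queue of blocked processes   */
int  test_and_set(int *flag);                 /* hypothetical: atomically set *flag = 1
                                                 and return the old value                  */
void enqueue_and_block(struct proc_queue *q, int *flag);  /* block caller, releasing *flag */
void dequeue_and_wake(struct proc_queue *q);  /* move one blocked process to ready queue   */

typedef struct {
    int flag;                     /* test&set guard for the semaphore itself       */
    int count;                    /* semaphore value; negative = number of waiters */
    struct proc_queue *queue;
} ts_sem;

void sem_wait_ts(ts_sem *s) {
    while (test_and_set(&s->flag))
        ;                                     /* short busy wait on the guard      */
    s->count--;
    if (s->count < 0)
        enqueue_and_block(s->queue, &s->flag);  /* block; guard released while blocking */
    else
        s->flag = 0;                          /* release the guard                 */
}

void sem_signal_ts(ts_sem *s) {
    while (test_and_set(&s->flag))
        ;
    s->count++;
    if (s->count <= 0)
        dequeue_and_wake(s->queue);           /* wake one blocked process          */
    s->flag = 0;
}
```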
Locks
A lock prevents other threads from entering a protected region while it is held. Threads should wait if a region is locked, and should sleep rather than spin if the wait will be long.
Difference between locks and semaphores:
Count: A lock (mutex) is binary; it is either locked or unlocked. On the other hand, a semaphore maintains a count and allows more than one thread to access a resource, up to a limit specified by the count.
Ownership: A lock must be released by the thread that acquired it, while a semaphore can be posted by any thread.
Use cases: Locks are typically used when a piece of code or resource should only be accessed by one thread at a time (critical section). Semaphores are used when a number of identical resources are available, or to control access to a pool of resources.
Implementation of locks
The hardware is responsible for implementing these atomic read-modify-write instructions correctly. This approach is effective on both uniprocessors and multiprocessors.
Implementing Locks with test&set: We can implement a lock without needing to enter kernel mode (disabling interrupts) using test&set.
Even though this is busy waiting, we don't need to enter kernel mode, and it works well on multiprocessors.
Negatives:
This is very inefficient, as the waiting thread consumes CPU cycles doing nothing.
The waiting thread may take cycles away from the thread holding the lock (no one wins).
Priority Inversion: if the busy-waiting thread has a higher priority than the thread holding the lock, it keeps spinning, and the lower-priority holder never gets to run and release the lock.
The waiting thread may wait for an arbitrarily long time.
Cache coherency overhead: in a multiprocessor system, the flag variable is likely to be cached by each processor. If one processor writes the flag, all other caches must invalidate their copies (even if the original value is 1 and we write 1 back). This constant invalidation and refetching of cache lines causes significant overhead, further hampering the system's performance.
An Improvement for the Ping-pong Issue: test&test&set
We wait until the lock might be free.
Compared with the implementation above, where every call to test&set forces the cache line to be invalidated, here we only read while waiting, so the lock variable stays in the cache.
We try to grab the lock with test&set
We repeat if we fail to actually get the lock
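A sketch of this test&test&set loop using C11 atomics (the names locked, ttas_lock, and ttas_unlock are illustrative):

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool locked = false;

void ttas_lock(void) {
    for (;;) {
        while (atomic_load(&locked))          /* test: read-only spin, stays in cache */
            ;
        if (!atomic_exchange(&locked, true))  /* test&set: only this step writes      */
            return;                           /* old value was false: we own the lock */
        /* someone else grabbed it first; go back to the read-only spin */
    }
}

void ttas_unlock(void) {
    atomic_store(&locked, false);
}
```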