
From "Introduction To Algorithms"

Author: Thomas H Cormen, Charles E Leiserson, Ronald L Rivest, Clifford Stein
Publisher: MIT Press
Year: 2001
Category: Computers

Part VII: Selected Topics
Key Insight 1 from this part

Dynamic Multithreaded Algorithms and Parallel Performance Analysis

Key Insight

The text introduces dynamic multithreaded algorithms as an elegant model for parallel computing on multiprocessor systems, extending traditional serial algorithms. Unlike static threading, which struggles with dynamic work partitioning and load balancing, dynamic multithreading lets programmers specify only the logical parallelism, via concurrency keywords such as 'spawn' and 'sync', while the concurrency platform's scheduler automatically handles resource allocation and load balancing. The model supports nested parallelism, in which a spawned subroutine can run concurrently with its parent, and parallel loops, whose iterations may execute simultaneously. The serialization of a multithreaded algorithm is the equivalent serial algorithm obtained by simply deleting the concurrency keywords, ensuring a direct mapping back to traditional serial computation.
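The spawn/sync style can be sketched in Python, with threading.Thread standing in for 'spawn' and join() for 'sync'. This is only an analogy: a real concurrency platform uses a work-stealing scheduler, and CPython threads do not deliver true parallel speedup for CPU-bound work.

```python
import threading

def p_fib(n):
    """Fibonacci in the spawn/sync style (toy analogy, not a real
    concurrency platform)."""
    if n < 2:
        return n
    box = {}
    def child():
        box["x"] = p_fib(n - 1)   # the spawned call may run concurrently
    t = threading.Thread(target=child)
    t.start()                      # 'spawn': child proceeds alongside parent
    y = p_fib(n - 2)               # parent keeps working meanwhile
    t.join()                       # 'sync': wait for the spawned child
    return box["x"] + y

print(p_fib(10))  # 55
```

Deleting the thread machinery (the serialization) leaves the ordinary serial recursive Fibonacci, illustrating the direct mapping back to a serial algorithm.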

Multithreaded computations are modeled as directed acyclic graphs (DAGs) whose vertices are strands (maximal sequences of instructions with no parallel control) and whose edges represent dependencies; these DAGs are embedded in a tree of procedure instances that records the flow of execution and parallel control. The key performance measures are work (T1), the total running time on a single processor, i.e., the sum of all strand execution times, and span (T∞), the length of the longest path through the computation DAG, i.e., the running time on an ideal machine with unlimited processors. These measures yield lower bounds on the running time TP on P processors: TP ≥ T1 / P (the work law) and TP ≥ T∞ (the span law).
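A minimal sketch of computing work and span from a computation DAG; the strand costs and edges below are made up for illustration, not taken from the book.

```python
# Toy computation DAG: vertices are strands, edges are dependencies
# (illustrative values only).
edges = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
cost = {"a": 1, "b": 3, "c": 2, "d": 1}   # execution time of each strand

# Work T1: total time on one processor = sum of all strand costs.
T1 = sum(cost.values())

def span(v):
    # Span T_inf: longest (critical-path) cost from v down to a sink.
    return cost[v] + max((span(w) for w in edges[v]), default=0)

Tinf = span("a")
print(T1, Tinf)          # 7 5  (critical path a -> b -> d: 1 + 3 + 1)

# Work law and span law give a lower bound on TP for P processors.
P = 2
lower_bound = max(T1 / P, Tinf)
print(lower_bound)       # 5
```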

Further performance insights come from parallelism, the ratio T1 / T∞, which gives the average amount of work per step along the critical path and the maximum possible speedup on any number of processors. Slackness, (T1 / T∞) / P, measures how much the computation's parallelism exceeds the number of available processors. A greedy scheduler, which in each time step assigns as many ready strands to processors as possible, is proven to execute a multithreaded computation in time TP ≤ T1 / P + T∞; this bound shows that a greedy scheduler is always within a factor of 2 of optimal. Crucially, when the slackness is high (say, at least 10, so that P ≪ T1 / T∞), a greedy scheduler achieves near-perfect linear speedup, meaning TP ≈ T1 / P. The P-FIB(n) algorithm illustrates this: its work is Θ(φ^n) (where φ is the golden ratio) and its span is Θ(n), giving parallelism Θ(φ^n / n), which grows dramatically with n.
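The greedy-scheduler bound and the effect of slackness can be worked through numerically; the T1 and T∞ values below are hypothetical, chosen only to show how high slackness pushes the guarantee toward perfect linear speedup.

```python
def greedy_bound(T1, Tinf, P):
    # Greedy scheduler guarantee: TP <= T1/P + Tinf
    return T1 / P + Tinf

T1, Tinf = 1024.0, 8.0          # hypothetical work and span
parallelism = T1 / Tinf          # maximum possible speedup: 128.0
for P in (4, 32, 128):
    slackness = parallelism / P
    TP = greedy_bound(T1, Tinf, P)
    speedup = T1 / TP
    print(P, slackness, TP, round(speedup, 2))
# 4 32.0 264.0 3.88     high slackness: speedup ~ P (near-linear)
# 32 4.0 40.0 25.6
# 128 1.0 16.0 64.0     slackness 1: only half the ideal speedup
```

With slackness 32 the guaranteed speedup (3.88 on 4 processors) is nearly linear, while at slackness 1 the T∞ term dominates and the guarantee drops to half of P, matching the factor-of-2 bound.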
