Randomized Algorithms
Algorithms that use random choices during execution, analyzed using probability theory. Rather than viewing randomness as noise, these algorithms harness it as a powerful computational resource—often yielding simpler, faster, or more elegant solutions than deterministic approaches. Includes both Las Vegas algorithms (always correct, random runtime) and Monte Carlo algorithms (fast, probably correct).
Intermediate to Advanced
- Basic concepts (randomized quicksort, hashing): Intermediate
- Probabilistic analysis (expectations, tail bounds): Intermediate to Advanced
- Advanced techniques (Markov chains, derandomization): Advanced
- Research frontiers (streaming, sketching): Advanced
Randomness as algorithmic power: Randomization isn’t a weakness—it’s a fundamental computational resource. Often, the simplest solution to a problem involves making random choices. This counterintuitive insight transforms how we approach algorithm design: instead of fighting uncertainty, we embrace it.
War Story - Karger’s Min-Cut: The global minimum cut problem asks: what’s the smallest set of edges whose removal disconnects a graph? For decades, this required complex deterministic algorithms. Then Karger discovered something remarkable: randomly contract edges, and a single run finds the min-cut with probability at least 2/(n(n−1)). Repeat O(n² log n) times and take the best answer, and the failure probability becomes vanishingly small. The algorithm is elegant, simple to implement, and rivals sophisticated deterministic approaches.
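The contraction idea fits in a few lines. A minimal sketch (the function names and the union-find bookkeeping are my own; edges are given as a list of vertex pairs over vertices 0..n−1):

```python
import random

def karger_cut(edges, n, rng):
    """One run of random contraction; returns the size of the cut it finds."""
    parent = list(range(n))             # union-find over "supervertices"

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    remaining = n
    while remaining > 2:
        u, v = rng.choice(edges)        # uniform over surviving multigraph edges
        ru, rv = find(u), find(v)
        if ru != rv:                    # skip self-loops; contract otherwise
            parent[ru] = rv
            remaining -= 1
    # Original edges crossing the two final supervertices form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))

def min_cut(edges, n, runs, seed=0):
    """Best cut found over many independent runs."""
    rng = random.Random(seed)
    return min(karger_cut(edges, n, rng) for _ in range(runs))
```

Each run returns some valid cut; only repetition makes the minimum likely. On a cycle every run happens to return the true answer, while graphs with a bridge genuinely need the repetitions.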
War Story - Universal Hashing: Early hash tables suffered from adversarial inputs that caused worst-case collisions. Universal hashing solved this permanently: randomly choose your hash function from a family at runtime. No adversary can craft bad inputs because they don’t know which function you picked. This single idea made hash tables provably fast, transforming them from a practical heuristic to a rigorous data structure. Every modern hash table implementation uses this principle.
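The classic Carter-Wegman construction is just one multiplication and two mods. A sketch (the helper name and the choice of the Mersenne prime p = 2^61 − 1 are illustrative; any prime larger than the key universe works):

```python
import random

def make_universal_hash(m, rng, p=(1 << 61) - 1):
    """Draw h(x) = ((a*x + b) mod p) mod m at random from the Carter-Wegman
    family. For any fixed keys x != y (both below p), the collision
    probability over the random choice of (a, b) is at most 1/m."""
    a = rng.randrange(1, p)             # a must be nonzero
    b = rng.randrange(0, p)
    return lambda x: ((a * x + b) % p) % m
```

An adversary who must commit to their inputs before (a, b) is drawn cannot engineer collisions, which is exactly the guarantee described above.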
- Randomized Quicksort - Expected O(n log n) via random pivot selection
- Universal Hashing - Randomized hash functions with provable guarantees
- Miller-Rabin Primality Testing - Fast probabilistic primality checking
- Karger’s Min-Cut - Elegant randomized contraction algorithm
- Randomized Selection - Expected linear-time median finding
- Schwartz-Zippel Lemma - Polynomial identity testing via random evaluation
- Power of Two Choices - Load balancing with exponential improvement
- Randomized Rounding for LP - Approximate optimization via probabilistic rounding
- Skip Lists - Randomized alternative to balanced trees
- Treaps - Randomized binary search trees
- Bloom Filters - Space-efficient probabilistic set membership
Required:
- Probability theory (expectations, independence, conditional probability)
- Basic algorithms (sorting, searching, graph traversal)
- Big-O notation and asymptotic analysis
Helpful:
- Tail bounds (Markov, Chebyshev, Chernoff)
- Random variables and distributions
- Basic understanding of hash tables and BSTs
Setup: You’re hiring for a position. Candidates arrive one at a time, and you must decide immediately whether to hire each one (firing the current employee if needed). You want to hire the best candidate but minimize the number of hirings (each costs money). How do you decide?
Naive approach: Hire everyone better than your current employee. But if candidates happen to arrive in increasing quality order, you end up hiring all n of them.
Randomized insight: What if candidates arrive in random order? Suddenly the analysis becomes tractable: the k-th candidate is the best so far with probability 1/k, so the expected number of hirings is the harmonic number H_n = 1 + 1/2 + ... + 1/n ≈ ln n, i.e., only O(log n).
The revelation: By assuming randomness in the input or by randomly permuting the input ourselves, we transform an adversarial worst-case analysis into a clean expected-case analysis.
Generalization: This teaches the fundamental technique of randomized algorithms—assume random order or randomly permute, then analyze expected behavior.
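The H_n claim is easy to check empirically. A small simulation (helper names are mine; candidate quality is just the integer value):

```python
import random

def hirings(n, rng):
    """Count how often the 'best so far' changes in a random candidate order."""
    order = list(range(n))
    rng.shuffle(order)
    best, count = -1, 0
    for quality in order:
        if quality > best:              # new best candidate: hire them
            best, count = quality, count + 1
    return count

def avg_hirings(n, trials, seed=0):
    """Average over trials; approaches H_n = 1 + 1/2 + ... + 1/n ~ ln n."""
    rng = random.Random(seed)
    return sum(hirings(n, rng) for _ in range(trials)) / trials
```

For n = 1000 the average lands near H_1000 ≈ 7.49, a far cry from the worst-case 1000 hirings.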
Setup: In a room of 23 people, there’s a better-than-even chance (about 50.7%) that two share a birthday. This counterintuitive result reveals deep truths about collisions in hash tables.
Naive thinking: With 365 days, you’d need hundreds of people for collisions to be likely.
Probabilistic insight: The number of pairs grows quadratically (23 people = 253 pairs). Even with a large space, collisions occur surprisingly quickly.
Application to hashing: With n items and n buckets, the birthday paradox makes collisions all but certain: they already become likely once roughly √n items are inserted. Understanding this leads to:
- Chaining to handle collisions
- Universal hashing to prevent adversarial inputs
- Load factor analysis
- The power of two choices (exponential improvement!)
The lesson: Probabilistic reasoning reveals non-obvious truths about algorithm behavior, enabling both better design and provable guarantees.
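The exact probability behind the paradox is a short product. A sketch (function name is mine; birthdays assumed uniform and independent):

```python
def collision_prob(n, days=365):
    """Exact probability that at least two of n people share a birthday."""
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (days - i) / days   # i-th person avoids all earlier ones
    return 1.0 - p_distinct
```

The crossover happens between 22 people (about 47.6%) and 23 people (about 50.7%).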
Setup: Testing if a large number (thousands of digits) is prime is crucial for cryptography. Deterministic tests are slow. How can we do better?
Deterministic approach: Trial division or AKS algorithm—provably correct but slow in practice.
Randomized approach: Pick random “witnesses” and run the strong probable-prime test (a refinement of Fermat’s condition that also catches Carmichael numbers). If any witness proves compositeness, reject. Otherwise, probably prime.
The trade-off: Run Miller-Rabin with k independent witnesses. The probability of error is at most 1/4^k. With k=50, that is below 2^-100, effectively zero. The algorithm is blazingly fast.
The philosophical shift: We trade absolute certainty for overwhelming confidence plus enormous speed gains. In practice, cosmic rays are more likely to corrupt your computer’s memory than Miller-Rabin is to give a wrong answer.
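A compact version of the test (this is the standard formulation; the small-prime pre-check and the default k are my own choices):

```python
import random

def miller_rabin(n, k=50, rng=None):
    """Strong probable-prime test; a composite n slips through with
    probability at most 4**-k over the random witnesses."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):              # handle tiny inputs and easy factors
        if n % p == 0:
            return n == p
    rng = rng or random.Random()
    d, s = n - 1, 0
    while d % 2 == 0:                   # write n-1 = d * 2^s with d odd
        d //= 2
        s += 1
    for _ in range(k):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                # a is a witness: n is composite
    return True                         # probably prime
```

Note that compositeness verdicts are always correct; only the "probably prime" answer carries the (astronomically small) error probability.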
Optimal stopping and the secretary problem: You’re interviewing candidates for a job. You must accept or reject each one immediately after the interview, with no callbacks. How do you maximize your chance of hiring the best?
The optimal strategy: reject the first ~37% of candidates (n/e of them), then hire the first person better than all previous ones. This surprisingly simple rule succeeds with probability 1/e ≈ 37%, provably the best possible.
Application: This models real-life decisions—apartment hunting, choosing a life partner (per Algorithms to Live By), or even parking spot selection. The 37% rule appears repeatedly in optimal stopping problems.
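The 37% rule is easy to simulate. A sketch (names are mine; candidate qualities are simply 0..n−1, with n−1 the best):

```python
import math
import random

def secretary_success_rate(n, trials, seed=0):
    """Fraction of trials in which the 37% rule hires the single best candidate."""
    rng = random.Random(seed)
    cutoff = round(n / math.e)          # observe (and reject) the first ~n/e
    wins = 0
    for _ in range(trials):
        order = list(range(n))
        rng.shuffle(order)
        threshold = max(order[:cutoff])
        # Hire the first later candidate beating everyone observed so far;
        # if none appears, we are stuck with the last candidate.
        hired = next((q for q in order[cutoff:] if q > threshold), order[-1])
        wins += (hired == n - 1)
    return wins / trials
```

For n = 100 the empirical success rate hovers around 0.37, matching the theory.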
Load balancing in daily life: The “power of two choices” shows that checking just two options (instead of one random choice) exponentially reduces maximum load. This applies to:
- Choosing checkout lines at the grocery store (glance at two, pick shorter)
- Selecting parking spots (try two areas, pick less crowded)
- Distributing tasks among team members (ask two people, assign to less busy)
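A balls-and-bins simulation makes the gap vivid (names and parameters are illustrative; with one choice the max load grows like ln n / ln ln n, with two choices like ln ln n):

```python
import random

def max_load(n_balls, n_bins, choices, rng):
    """Each ball probes `choices` random bins and lands in the least loaded.
    Returns the maximum load over all bins."""
    bins = [0] * n_bins
    for _ in range(n_balls):
        probes = [rng.randrange(n_bins) for _ in range(choices)]
        bins[min(probes, key=lambda b: bins[b])] += 1
    return max(bins)

rng = random.Random(0)
one = max_load(10_000, 10_000, 1, rng)  # typically 6-7 at this scale
two = max_load(10_000, 10_000, 2, rng)  # typically 3-4: the "exponential" drop
```

The second probe is nearly free, yet it collapses the maximum load; this is why the trick is ubiquitous in load balancers and hash tables.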
The linearity of expectation is the randomized algorithm analyst’s best friend:
- E[X + Y] = E[X] + E[Y] (even if X and Y are dependent!)
- Enables analysis by decomposing a complicated quantity into simple indicator variables; no independence among them is required
Why students care: Many seemingly intractable analyses become trivial using linearity of expectation.
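A classic illustration: the expected number of fixed points of a random permutation is exactly 1, even though the indicator variables are thoroughly dependent. A quick empirical check (names are mine):

```python
import random

def avg_fixed_points(n, trials, seed=0):
    """Average count of positions i with perm[i] == i over random shuffles.
    Each indicator X_i has E[X_i] = 1/n, so by linearity E[sum X_i] = 1,
    regardless of the dependence between the X_i."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        total += sum(1 for i, v in enumerate(perm) if i == v)
    return total / trials
```

Computing this expectation directly from the distribution of fixed points would be painful; linearity makes it a one-liner.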
Moving beyond expectations to probability of deviation:
Markov’s Inequality: P(X ≥ a) ≤ E[X]/a (for non-negative X)
- Crude but universally applicable
Chebyshev’s Inequality: P(|X - E[X]| ≥ a) ≤ Var(X)/a²
- Uses variance for tighter bounds
Chernoff Bounds: For sums of independent random variables
- Exponential tail decay: P(X ≥ (1+δ)μ) ≤ e^(-μδ²/3) for 0 < δ ≤ 1
- The gold standard for concentration
Why these matter: Tail bounds let us say “with high probability” rigorously, moving from expected behavior to guaranteed behavior with overwhelming likelihood.
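To see a Chernoff bound at work, compare the stated bound against the empirical tail of a fair-coin sum (names and parameters are mine; the form e^(-μδ²/3) applies to sums of independent 0/1 variables with 0 < δ ≤ 1):

```python
import math
import random

def tail_estimate(n, delta, trials, seed=0):
    """Empirical P(X >= (1+delta)*mu), where X counts heads in n fair flips."""
    rng = random.Random(seed)
    cutoff = (1 + delta) * (n / 2)
    hits = sum(
        sum(rng.randrange(2) for _ in range(n)) >= cutoff
        for _ in range(trials)
    )
    return hits / trials

n, delta = 1000, 0.2
empirical = tail_estimate(n, delta, trials=500)
chernoff = math.exp(-(n / 2) * delta ** 2 / 3)   # e^(-mu * delta^2 / 3)
# Deviating 20% above the mean of 1000 fair flips is a ~6-sigma event,
# so the empirical tail is essentially zero and sits below the bound.
```

The bound is loose here, but it is the exponential decay, not the constant, that powers "with high probability" arguments.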
Prove existence by showing positive probability:
- If a random object has property P with probability > 0, then such objects exist
- Often easier than explicit construction
- Pioneered by Paul Erdős in combinatorics
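A concrete instance: every graph with m edges has a cut of size at least m/2, because a uniformly random cut crosses each edge with probability 1/2, so the expected cut size is m/2. A sketch that also turns the existence proof into an algorithm by resampling (names are mine):

```python
import random

def random_cut_size(edges, n, rng):
    """Put each vertex on a random side; count edges crossing the cut.
    Each edge crosses with probability 1/2, so E[cut size] = m/2 -- hence
    a cut of size >= m/2 must exist in every graph."""
    side = [rng.randrange(2) for _ in range(n)]
    return sum(1 for u, v in edges if side[u] != side[v])

def find_half_cut(edges, n, seed=0):
    """Resample until the cut promised by the expectation argument appears."""
    rng = random.Random(seed)
    while True:
        size = random_cut_size(edges, n, rng)
        if 2 * size >= len(edges):
            return size
```

On the complete graph K5 (10 edges) this quickly produces a cut of size 6, well above the m/2 = 5 guarantee.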
- Motwani & Raghavan “Randomized Algorithms” - The definitive reference
- Mitzenmacher & Upfal “Probability and Computing” - Excellent pedagogical approach
- Erickson Chapter 7 - Clear, intuitive exposition
- CLRS Ch 5 - Probabilistic analysis foundations
- Kleinberg & Tardos §13 - Problem-motivated introduction
- MIT 6.856J/18.416J - Dedicated randomized algorithms course
- Stanford CS265/CME309 - Flipped classroom approach (Dan Spielman)
- Harvard CS 223 - Michael Mitzenmacher’s course (author of textbook)
- CMU 15-859(M) - Advanced topics in randomization
- Karger “Minimum Cuts in Near-Linear Time” (JACM 2000)
- Rabin “Probabilistic Algorithm for Testing Primality” (1980)
- Carter & Wegman “Universal Classes of Hash Functions” (1979)
- Divide-and-Conquer: Quicksort uses randomization for expected performance
- Graph Algorithms: Randomized min-cut, random walk methods
- Streaming Algorithms: Sketching techniques rely fundamentally on randomization
- Approximation Algorithms: Randomized rounding for LP relaxations
- Cryptography: Primality testing, key generation, zero-knowledge proofs
- Data Structures: Hash tables, skip lists, treaps, Bloom filters
- Machine Learning: Stochastic gradient descent, random projections
- Derandomization: Using randomness to design algorithms, then removing it
Randomized Algorithm Visualizer
- Visualize quicksort pivot selection and performance
- Animate Karger’s min-cut contraction
- Show collision patterns in different hash functions
- Compare randomized vs deterministic algorithm performance
Hash Function Battle Royale
- Implement various hash functions (simple mod, universal, cryptographic)
- Test against adversarial inputs
- Measure collision rates, distribution quality
- Demonstrate why universal hashing defeats adversarial inputs
Probabilistic Prime Generator
- Implement Miller-Rabin primality testing
- Generate large random primes for RSA
- Compare speed vs deterministic methods
- Analyze error probability vs number of iterations
Bloom Filter Applications
- Build spell-checker using Bloom filters
- Implement web cache with probabilistic membership
- Measure space-accuracy tradeoffs
- Compare to exact data structures
Load Balancing Simulator
- Simulate balls-and-bins with different strategies
- Visualize “power of two choices” exponential improvement
- Apply to real-world scenarios (server load, task distribution)
- Measure maximum load distributions
Secretary Problem Interactive
- Build interactive secretary problem simulator
- Let users try different strategies
- Show optimal 37% rule performance
- Extend to variants (known distributions, multiple positions)
2-4 lectures depending on depth
- 2 lectures: Core concepts, quicksort, hashing, min-cut, probabilistic analysis
- 3 lectures: + Primality testing, randomized rounding, Chernoff bounds
- 4 lectures: + Advanced topics (Markov chains, derandomization, streaming)
“Randomized algorithms are unreliable”
- False! With proper analysis, they can be more reliable than deterministic algorithms on adversarial inputs. Error probabilities can be made astronomically small (2^-100).
“Random means no guarantees”
- False! Randomized algorithms come with rigorous probability guarantees. “With high probability” is a precise mathematical statement.
“Randomized algorithms are always faster”
- False! Randomization is a tool, not magic. Sometimes deterministic algorithms are faster or simpler.
“Expected case = average case”
- Subtle distinction! Expected case is over the algorithm’s random choices (we control). Average case is over input distribution (external assumption).
“You need true randomness”
- False practically! Pseudorandom generators suffice for most applications. True randomness mainly matters for cryptography.
“Randomized algorithms are a modern invention”
- False! Monte Carlo methods date to the 1940s. Randomized primality testing emerged in the 1970s. The field has deep roots.
Traditional (banking model):
“Here’s the definition of Las Vegas vs Monte Carlo algorithms. Here’s the probability theory you’ll need. Memorize these bounds. Now here’s randomized quicksort.”
Problem-posing approach:
“You need to sort a million numbers, but someone might give you already-sorted input that makes your algorithm slow. How can you protect against that? What if you randomly shuffled first? What if you picked a random pivot? Let’s analyze what happens… Hey, now the worst case is gone! We’ve traded deterministic worst-case for probabilistic expected-case.”
The difference: Students discover why randomization helps instead of being told that it does. They see randomness emerge as a solution to adversarial inputs, not as an abstract technique.
Curriculum placement:
- Week 4-6 for graduate courses (after divide-and-conquer, before or after dynamic programming)
- Can be integrated throughout (randomized quicksort early, advanced topics later)
Prerequisites check: Students need solid probability background. If weak, need probability refresher first—covering expectations, independence, basic distributions.
Why mid-course: Builds on foundational techniques (divide-and-conquer, basic analysis) while introducing probabilistic thinking needed for advanced topics (streaming, approximation).
Streaming and Sketching: Randomized algorithms have revolutionized big data processing. Sketching techniques (Count-Min Sketch, HyperLogLog) enable processing data streams using sublinear memory—crucial for modern systems processing billions of events.
Derandomization: Can we remove randomness while preserving efficiency? This deep question connects to circuit complexity and pseudorandom generators. Progress here would have profound implications for complexity theory.
Quantum Algorithms: Quantum computation is fundamentally probabilistic. Understanding randomized algorithms provides intuition for quantum algorithms and their analysis.
Machine Learning: Stochastic gradient descent, random projections, and randomized dimensionality reduction power modern ML. The “unreasonable effectiveness” of randomization in deep learning remains an active research area.
Ready for students? ✓ Yes - comprehensive coverage from basics to research frontiers, strong motivation via real-world applications, multiple entry points for different backgrounds.
