Greedy Algorithms
Algorithms that construct solutions by making locally optimal choices at each step, hoping (and sometimes proving) they lead to globally optimal solutions. The essence of greedy algorithms is myopic optimization—making the best immediate choice without reconsidering past decisions. The art lies in knowing when this simple strategy actually works.
Introductory to Advanced
- Basic greedy algorithms (activity selection, Huffman): Introductory
- Greedy correctness proofs (exchange arguments): Intermediate
- Matroid theory: Advanced
- Greedy approximation algorithms: Intermediate to Advanced
The strategy that powers data compression: greedy algorithms embody a fundamental decision-making pattern, making the best choice now and never looking back. This pattern maps to:
- Real-time systems (decisions must be made immediately)
- Heuristic design (when optimality is too expensive, greedy often works well)
- Human intuition (people naturally think greedily first)
War Story 1 - Huffman Coding Saved the Internet: Huffman coding (1952) uses a greedy strategy to build optimal prefix-free codes for data compression. This single algorithm:
- Underpins ZIP, GZIP, and 7-Zip compression
- Powers JPEG image compression (combined with DCT)
- Enables MP3 audio compression
- Saves billions of gigabytes daily in data transmission
The insight: always combine the two least-frequent symbols. This simple greedy choice provably yields optimal compression. Without Huffman coding, the internet would require vastly more bandwidth—potentially slowing the entire digital revolution.
War Story 2 - Scheduling Saves Lives: Hospitals use greedy algorithms for operating room scheduling. The interval scheduling problem—maximize number of non-overlapping surgeries—has a simple greedy solution: always schedule the surgery that ends earliest.
This algorithm:
- Maximizes utilization of expensive OR time
- Minimizes patient wait times
- Scales to thousands of appointments
- Provably optimal (not just a heuristic!)
- Activity Selection / Interval Scheduling - Maximize non-overlapping intervals
- Huffman Coding - Optimal prefix-free codes for data compression
- Minimum Spanning Tree (MST) - Kruskal’s and Prim’s algorithms
- Dijkstra’s Shortest Path - Single-source shortest paths in graphs with non-negative edge weights
- Fractional Knapsack - Greedy works when items are divisible
- Coin Change (special cases) - Greedy works for certain coin systems (US coins)
- Job Sequencing with Deadlines - Maximize profit with time constraints
- Set Cover Greedy Approximation - O(log n) approximation for NP-hard problem
- Vertex Cover 2-Approximation - Simple greedy matching-based approach
- Matroid Greedy Algorithm - General framework explaining when greedy works
- Submodular Function Maximization - (1 - 1/e) approximation guarantee
- Online Algorithms - Competitive analysis of greedy online strategies
- Metric TSP Approximation - MST-based greedy heuristic
Required:
- Basic algorithm analysis (Big-O notation)
- Understanding of optimization problems
- Graph theory basics (for MST, shortest paths)
- Proof techniques (especially contradiction and exchange arguments)
Helpful:
- Dynamic programming (for contrast—when greedy fails, DP often succeeds)
- Priority queues / heaps (implementation of greedy algorithms)
- Basic probability (for randomized greedy variants)
Setup: You manage a conference room. Ten groups want to book it:
- Group A: 9am-10am
- Group B: 9:30am-11am
- Group C: 10am-11:30am
- Group D: 10:30am-12pm
- …and so on
Goal: Maximize the number of groups that can use the room (no overlaps allowed).
Naive approaches students try:
- “Choose shortest meetings first” — Counterexample: a 1-minute meeting straddling noon conflicts with both a 9am-noon meeting and a noon-3pm meeting, so picking the short one costs two long ones
- “Choose earliest starting times” — Counterexample: 8am-5pm blocks everything
- “Choose meetings with fewest conflicts” — Counterexample: there are instances where the meeting with fewest conflicts appears in no optimal schedule; counting conflicts is also more expensive to compute
Guided discovery:
- “What if we commit to one meeting—which should it be?”
- “If we pick a meeting, what remains is a smaller instance of the same problem”
- “Which choice leaves the most ‘room’ for future meetings?”
The breakthrough: “Always pick the meeting that ends earliest!”
- Intuition: Finishing early leaves maximum time for subsequent meetings
- Can prove optimality via exchange argument (if optimal solution differs, we can swap to our greedy choice without losing optimality)
The algorithm:
GreedySchedule(meetings):
Sort by end time
Select first meeting
Remove all conflicting meetings
Recurse on remaining meetings
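As a sketch, the recursion above flattens into a single pass once the meetings are sorted by end time (function name and the sample instance are illustrative, not from the text):

```python
def greedy_schedule(meetings):
    """Return a maximum set of non-overlapping (start, end) meetings,
    chosen earliest-end-first."""
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(meetings, key=lambda m: m[1]):  # sort by end time
        if start >= last_end:  # no conflict with the last chosen meeting
            chosen.append((start, end))
            last_end = end
    return chosen

# The conference-room instance from the setup (times in hours, 9 = 9am):
meetings = [(9, 10), (9.5, 11), (10, 11.5), (10.5, 12)]
print(greedy_schedule(meetings))  # [(9, 10), (10, 11.5)]
```

After sorting, "remove all conflicting meetings" reduces to a single comparison against the end time of the last selected meeting, giving O(n log n) overall.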
Why this matters: This pattern—greedy choice + optimal substructure—characterizes all greedy algorithms. The proof technique (exchange argument) is the key skill.
Setup: You’re transmitting text over a slow connection. Frequent letters should use short codes, rare letters can use long codes. How do you design optimal codes?
Constraints:
- Codes must be prefix-free (no code is prefix of another)
- Need to decode unambiguously
Discovery path:
- “Frequent letters should get short codes”
- “How do we ensure prefix-free property?”
- “Binary tree: left=0, right=1, leaves=letters”
- “What if we build tree bottom-up, combining rarest letters first?”
The greedy insight:
- Count letter frequencies
- Build tree by repeatedly combining two least-frequent symbols
- This minimizes expected code length!
The revelation: Huffman’s algorithm is greedy—always combine the two rarest symbols. Optimality follows from an exchange argument.
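The bottom-up construction can be sketched with a priority queue; this is an illustrative implementation (the tie-breaking counter and function name are my own choices, not part of Huffman’s original formulation):

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return {symbol: bitstring} built by repeatedly merging the two
    least-frequent subtrees; left edge = 0, right edge = 1."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:  # edge case: a single distinct symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tiebreak, subtree); a subtree is a symbol or a pair.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # least frequent
        f2, _, right = heapq.heappop(heap)  # second least frequent
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    codes = {}
    def assign(node, prefix):
        if isinstance(node, tuple):  # internal node: recurse into both children
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                        # leaf: record the accumulated bitstring
            codes[node] = prefix
    _, _, root = heap[0]
    assign(root, "")
    return codes
```

Running `huffman_codes("abracadabra")` assigns the frequent `a` a 1-bit code and the rare `c` and `d` 3-bit codes, and because symbols sit only at leaves, no code is a prefix of another.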
Real-world impact:
- English text: ~8 bits/char (ASCII) → ~4.5 bits/char (Huffman)
- 40-50% compression on typical text
- Combined with other techniques: ZIP achieves 70-90% compression
Why students care: They’ve used ZIP files and wondered how compression works. Now they know: elegantly simple greedy algorithm with deep mathematical justification.
Setup: You need to connect n cities with fiber optic cable. Each possible connection has a cost. Goal: connect all cities with minimum total cost.
Key insight: Need spanning tree (all cities connected, no cycles).
Greedy strategies that work:
- Kruskal’s: Sort edges by weight, add cheapest edge that doesn’t create cycle
- Prim’s: Start with arbitrary city, always add cheapest edge to a new city
Why do these work?
- Cut property: minimum weight edge crossing any cut is in some MST
- Greedy choice is always safe
The counterintuitive part: Locally optimal (cheapest edge) actually guarantees globally optimal (minimum total weight)!
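Kruskal’s strategy can be sketched with a minimal Union-Find to detect cycles (a teaching sketch; the find-by-path-halving helper and the sample graph are illustrative):

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) with vertices 0..n-1.
    Returns the MST as a list of (weight, u, v) edges."""
    parent = list(range(n))

    def find(x):  # Union-Find root lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):  # greedy choice: cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:               # skip edges that would create a cycle
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

# Four "cities" with fiber costs; the cheapest spanning tree costs 1 + 2 + 4 = 7.
edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3), (5, 1, 3)]
tree = kruskal(4, edges)
print(sum(w for w, _, _ in tree))  # 7
```

Each accepted edge is the minimum-weight edge crossing the cut between its two components, so the cut property guarantees every greedy choice is safe.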
Real applications:
- Network design (internet backbone)
- VLSI circuit design
- Cluster analysis (single-linkage clustering is MST-based)
- Approximation algorithms (TSP, Steiner tree)
When greedy thinking works in life:
- “Earliest deadline first” for task prioritization (often optimal!)
- “Always take the closest parking spot” (optimal for minimizing walking)
- “Merge lanes by taking turns” (zipper merge is greedy and optimal)
When greedy thinking fails:
- Career planning: greedy salary maximization might miss better long-term paths
- Route planning: locally shortest turns might yield longer overall route
- Investment: always chasing highest returns leads to excessive risk
The lesson: Recognizing when to think greedily vs. when to plan ahead (dynamic programming) is a fundamental life skill. Algorithm design formalizes this intuition.
- Kleinberg & Tardos §4 - Excellent greedy algorithm design patterns and proof techniques
- Skiena §10.1-10.3 - War stories: when greedy works and when it fails spectacularly
- CLRS Ch 16 - Comprehensive treatment including matroid theory
- Erickson Chapter 6 - Practical approach with emphasis on exchange arguments
- DPV §5 - Concise treatment focusing on key examples
- Korte & Vygen “Combinatorial Optimization” - Matroid theory in depth
- Vazirani “Approximation Algorithms” §3 - Greedy approximations including Set Cover
- Tim Roughgarden “Algorithms Illuminated” - Part 3 includes greedy algorithms with detailed proofs
- MIT 6.046J - Greedy algorithms and exchange arguments
- Stanford CS161 - Problem-motivated greedy design
- Huffman (1952) - Original paper introducing Huffman coding
- Edmonds (1971) - Matroid optimization (fundamental theory)
- Nemhauser, Wolsey, Fisher (1978) - Submodular function maximization
- Dynamic Programming: When greedy fails due to lack of optimal substructure, DP often succeeds
- Approximation Algorithms: Greedy provides constant-factor approximations for many NP-hard problems (Set Cover, Vertex Cover, Submodular Maximization)
- Online Algorithms: Many online algorithms are inherently greedy (must decide immediately)
- Matroid Theory: Provides the mathematical framework explaining when greedy algorithms work
- Graph Algorithms: MST and shortest paths are fundamental greedy algorithms
- Machine Learning: Greedy feature selection, decision tree construction
Compression Toolkit
- Implement Huffman coding from scratch
- Compare compression ratios on different file types (text, code, random data)
- Visualize Huffman tree construction
- Explore adaptive Huffman (updating codes as data streams)
Scheduling Optimizer
- Implement interval scheduling and variants (weighted intervals, resource constraints)
- Build visualization showing greedy choices
- Compare greedy to optimal DP solution for weighted case
- Apply to real calendar data
Network Design Simulator
- Implement Kruskal’s and Prim’s MST algorithms
- Visualize algorithm execution on random graphs
- Compare performance on different graph structures
- Extend to Steiner tree approximation
Greedy vs. Optimal Explorer
- Build interactive tool showing problems where greedy works vs. fails
- Coin change: US coins (greedy works) vs. custom systems (greedy fails)
- Shortest path: non-negative (Dijkstra works) vs. negative weights (Dijkstra fails)
- Knapsack: fractional (greedy works) vs. 0/1 (greedy fails)
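The coin-change contrast in the explorer can be demonstrated in a few lines; this sketch compares largest-coin-first greedy against a simple dynamic-programming optimum (function names and the {1, 3, 4} system are illustrative):

```python
def greedy_change(coins, amount):
    """Largest-coin-first; returns a coin count that may be suboptimal."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += amount // c
        amount %= c
    return count if amount == 0 else None  # None if greedy gets stuck

def optimal_change(coins, amount):
    """Exact minimum coin count via dynamic programming."""
    best = [0] + [None] * amount
    for a in range(1, amount + 1):
        options = [best[a - c] for c in coins if c <= a and best[a - c] is not None]
        best[a] = 1 + min(options) if options else None
    return best[amount]

# US coins: greedy matches the optimum (6 coins for 63 cents).
print(greedy_change([1, 5, 10, 25], 63), optimal_change([1, 5, 10, 25], 63))  # 6 6
# Custom system {1, 3, 4}: greedy takes 4+1+1 (3 coins); the optimum is 3+3 (2 coins).
print(greedy_change([1, 3, 4], 6), optimal_change([1, 3, 4], 6))  # 3 2
```

The same two functions make a natural back end for the interactive tool: any denomination system where the two outputs disagree is a greedy-fails exhibit.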
Set Cover Approximation
- Implement greedy Set Cover algorithm
- Measure approximation ratio on random instances
- Compare to integer programming solution
- Apply to real-world covering problems (test suite selection, sensor placement)
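As a starting point for the Set Cover project, the greedy rule—always pick the subset covering the most uncovered elements—fits in a few lines (function name and the sample instance are illustrative):

```python
def greedy_set_cover(universe, subsets):
    """Repeatedly choose the subset covering the most uncovered elements.
    Returns indices into `subsets`; achieves an H_n = O(log n) approximation."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Greedy choice: maximum overlap with what is still uncovered.
        i = max(range(len(subsets)), key=lambda j: len(subsets[j] & uncovered))
        if not subsets[i] & uncovered:
            raise ValueError("universe cannot be covered by the given subsets")
        uncovered -= subsets[i]
        chosen.append(i)
    return chosen

universe = {1, 2, 3, 4, 5}
subsets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(greedy_set_cover(universe, subsets))  # [0, 3]
```

Measuring the approximation ratio then amounts to comparing `len(chosen)` against an exact cover found by exhaustive search or integer programming on small instances.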
2-3 lectures depending on depth
- 1.5 lectures: Core greedy algorithms (activity selection, Huffman, MST, Dijkstra)
- 2 lectures: + Exchange argument proofs, greedy approximation algorithms
- 3 lectures: + Matroid theory, advanced greedy techniques, submodular optimization
“Greedy algorithms are just heuristics”
- False! Many greedy algorithms are provably optimal (MST, Dijkstra, Huffman). Others provide approximation guarantees.
“If greedy is optimal for one formulation, it’s always optimal”
- False! Fractional knapsack: greedy optimal. 0/1 knapsack: greedy fails. Coin change: depends on coin system.
“Greedy is always faster than dynamic programming”
- Usually true, but not always. Both can be O(n log n) depending on implementation. Advantage of greedy is simplicity, not just speed.
“Exchange arguments are the only way to prove greedy correctness”
- False! Can also use induction, cut property (for graphs), or matroid theory. Exchange arguments are most common because they’re intuitive.
“Greedy algorithms don’t need fancy data structures”
- False! Efficient implementation often requires priority queues (Dijkstra, Prim’s), Union-Find (Kruskal’s), etc.
Traditional (banking model):
“A greedy algorithm makes the locally optimal choice at each stage. Here’s the activity selection algorithm: sort by end time, select non-conflicting meetings. Memorize this. Here’s the proof by exchange argument. Apply to homework problems.”
Problem-posing approach:
“You manage a conference room with conflicting meeting requests. How do you maximize usage? What strategy feels right? Let’s try ’earliest start time’—here’s a counterexample. What about ‘shortest duration’—another counterexample. What if we think about which choice leaves the most flexibility? Ah! ‘Earliest end time’ makes sense—let’s see if we can prove it works…”
The difference: Students discover greedy strategies by exploring what works and what fails. They develop intuition for when to try greedy approaches and how to prove correctness. The exchange argument emerges as a natural way to formalize the intuition that “if optimal solution differs, we can swap to our greedy choice.”
Curriculum placement:
- Week 2-3 for graduate courses (after divide-and-conquer, alongside or before DP)
- Week 3-4 for advanced undergraduate
Prerequisites check: Students should understand:
- Basic algorithm analysis
- Proof by contradiction and induction
- Graph terminology (for MST and shortest paths)
Why early: Greedy algorithms are conceptually simple and provide excellent practice for proof techniques (especially exchange arguments). They also motivate dynamic programming—when students see greedy fail (0/1 knapsack, weighted interval scheduling), they ask “what technique handles this?” and DP emerges as the answer.
Pedagogical note: Start with activity selection (clean example with visual intuition). Then Huffman (more complex but compelling application). Then MST (connects to graphs, has multiple valid greedy strategies). Prove at least one algorithm rigorously with exchange argument so students see the technique.
Ready for students? ✓ Yes - strong motivation through real-world applications, clear progression from simple to complex, emphasis on proof techniques, contrasts with DP to clarify when each approach applies.
