Greedy Algorithms
Algorithms that construct solutions by making locally optimal choices at each step, hoping (and sometimes proving) they lead to globally optimal solutions. The essence of greedy algorithms is myopic optimization—making the best immediate choice without reconsidering past decisions. The art lies in knowing when this simple strategy actually works.
Introductory to Advanced
- Basic greedy algorithms (activity selection, Huffman): Introductory
- Greedy correctness proofs (exchange arguments): Intermediate
- Matroid theory: Advanced
- Greedy approximation algorithms: Intermediate to Advanced
The strategy that powers data compression: greedy algorithms embody a fundamental decision-making pattern, making the best choice now and never looking back. This pattern maps to:
- Real-time systems (decisions must be made immediately)
- Heuristic design (when optimality is too expensive, greedy often works well)
- Human intuition (people naturally think greedily first)
War Story 1 - Huffman Coding Saved the Internet: Huffman coding (1952) uses a greedy strategy to build optimal prefix-free codes for data compression. This single algorithm:
- Underpins ZIP, GZIP, and 7-Zip compression
- Powers JPEG image compression (combined with DCT)
- Enables MP3 audio compression
- Saves billions of gigabytes daily in data transmission
The insight: always combine the two least-frequent symbols. This simple greedy choice provably yields optimal compression. Without Huffman coding, the internet would require vastly more bandwidth—potentially slowing the entire digital revolution.
War Story 2 - Scheduling Saves Lives: Hospitals use greedy algorithms for operating room scheduling. The interval scheduling problem—maximize number of non-overlapping surgeries—has a simple greedy solution: always schedule the surgery that ends earliest.
This algorithm:
- Maximizes utilization of expensive OR time
- Minimizes patient wait times
- Scales to thousands of appointments
- Provably optimal (not just a heuristic!)
- Activity Selection / Interval Scheduling - Maximize non-overlapping intervals
- Huffman Coding - Optimal prefix-free codes for data compression
- Minimum Spanning Tree (MST) - Kruskal’s and Prim’s algorithms
- Dijkstra’s Shortest Path - Single-source shortest paths in graphs with non-negative edge weights
- Fractional Knapsack - Greedy works when items are divisible
- Coin Change (special cases) - Greedy works for certain coin systems (US coins)
- Job Sequencing with Deadlines - Maximize profit with time constraints
- Set Cover Greedy Approximation - O(log n) approximation for NP-hard problem
- Vertex Cover 2-Approximation - Simple greedy matching-based approach
- Matroid Greedy Algorithm - General framework explaining when greedy works
- Submodular Function Maximization - (1 - 1/e) approximation guarantee
- Online Algorithms - Competitive analysis of greedy online strategies
- Metric TSP Approximation - MST-based greedy heuristic
Required:
- Basic algorithm analysis (Big-O notation)
- Understanding of optimization problems
- Graph theory basics (for MST, shortest paths)
- Proof techniques (especially contradiction and exchange arguments)
Helpful:
- Dynamic programming (for contrast—when greedy fails, DP often succeeds)
- Priority queues / heaps (implementation of greedy algorithms)
- Basic probability (for randomized greedy variants)
Setup: You manage a conference room. Ten groups want to book it:
- Group A: 9am-10am
- Group B: 9:30am-11am
- Group C: 10am-11:30am
- Group D: 10:30am-12pm
- …and so on
Goal: Maximize the number of groups that can use the room (no overlaps allowed).
Naive approaches students try:
- “Choose shortest meetings first” — Counterexample: a 1-minute meeting straddling noon conflicts with both a 9am-noon meeting and a noon-3pm meeting, so picking the short one costs two long ones
- “Choose earliest starting times” — Counterexample: 8am-5pm blocks everything
- “Choose meetings with fewest conflicts” — Counterexample: there are instances where the meeting with fewest conflicts appears in no optimal schedule; counting conflicts is also more expensive to compute
Guided discovery:
- “What if we commit to one meeting—which should it be?”
- “If we pick a meeting, what remains is a smaller instance of the same problem”
- “Which choice leaves the most ‘room’ for future meetings?”
The breakthrough: “Always pick the meeting that ends earliest!”
- Intuition: Finishing early leaves maximum time for subsequent meetings
- Can prove optimality via exchange argument (if optimal solution differs, we can swap to our greedy choice without losing optimality)
The algorithm:
GreedySchedule(meetings):
Sort by end time
Select first meeting
Remove all conflicting meetings
Recurse on remaining meetings
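As a sketch, the recursion above flattens into a single pass once the meetings are sorted by end time (function name and the sample instance are illustrative, not from the text):

```python
def greedy_schedule(meetings):
    """Return a maximum set of non-overlapping (start, end) meetings,
    chosen earliest-end-first."""
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(meetings, key=lambda m: m[1]):  # sort by end time
        if start >= last_end:  # no conflict with the last chosen meeting
            chosen.append((start, end))
            last_end = end
    return chosen

# The conference-room instance from the setup (times in hours, 9 = 9am):
meetings = [(9, 10), (9.5, 11), (10, 11.5), (10.5, 12)]
print(greedy_schedule(meetings))  # [(9, 10), (10, 11.5)]
```

After sorting, "remove all conflicting meetings" reduces to a single comparison against the end time of the last selected meeting, giving O(n log n) overall.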
Why this matters: This pattern—greedy choice + optimal substructure—characterizes all greedy algorithms. The proof technique (exchange argument) is the key skill.
Setup: You’re transmitting text over a slow connection. Frequent letters should use short codes, rare letters can use long codes. How do you design optimal codes?
Constraints:
- Codes must be prefix-free (no code is prefix of another)
- Need to decode unambiguously
Discovery path:
- “Frequent letters should get short codes”
- “How do we ensure prefix-free property?”
- “Binary tree: left=0, right=1, leaves=letters”
- “What if we build tree bottom-up, combining rarest letters first?”
The greedy insight:
- Count letter frequencies
- Build tree by repeatedly combining two least-frequent symbols
- This minimizes expected code length!
The revelation: Huffman’s algorithm is greedy—always combine the two rarest symbols. Optimality follows from an exchange argument.
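The bottom-up construction can be sketched with a priority queue; this is an illustrative implementation (the tie-breaking counter and function name are my own choices, not part of Huffman’s original formulation):

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return {symbol: bitstring} built by repeatedly merging the two
    least-frequent subtrees; left edge = 0, right edge = 1."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:  # edge case: a single distinct symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tiebreak, subtree); a subtree is a symbol or a pair.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # least frequent
        f2, _, right = heapq.heappop(heap)  # second least frequent
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    codes = {}
    def assign(node, prefix):
        if isinstance(node, tuple):  # internal node: recurse into both children
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                        # leaf: record the accumulated bitstring
            codes[node] = prefix
    _, _, root = heap[0]
    assign(root, "")
    return codes
```

Running `huffman_codes("abracadabra")` assigns the frequent `a` a 1-bit code and the rare `c` and `d` 3-bit codes, and because symbols sit only at leaves, no code is a prefix of another.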
Real-world impact:
- English text: ~8 bits/char (ASCII) → ~4.5 bits/char (Huffman)
- 40-50% compression on typical text
- Combined with other techniques: ZIP achieves 70-90% compression
Why students care: They’ve used ZIP files and wondered how compression works. Now they know: elegantly simple greedy algorithm with deep mathematical justification.
Setup: You need to connect n cities with fiber optic cable. Each possible connection has a cost. Goal: connect all cities with minimum total cost.
Key insight: Need spanning tree (all cities connected, no cycles).
Greedy strategies that work:
- Kruskal’s: Sort edges by weight, add cheapest edge that doesn’t create cycle
- Prim’s: Start with arbitrary city, always add cheapest edge to a new city
Why do these work?
- Cut property: minimum weight edge crossing any cut is in some MST
- Greedy choice is always safe
The counterintuitive part: Locally optimal (cheapest edge) actually guarantees globally optimal (minimum total weight)!
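Kruskal’s strategy can be sketched with a minimal Union-Find to detect cycles (a teaching sketch; the find-by-path-halving helper and the sample graph are illustrative):

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) with vertices 0..n-1.
    Returns the MST as a list of (weight, u, v) edges."""
    parent = list(range(n))

    def find(x):  # Union-Find root lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):  # greedy choice: cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:               # skip edges that would create a cycle
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

# Four "cities" with fiber costs; the cheapest spanning tree costs 1 + 2 + 4 = 7.
edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3), (5, 1, 3)]
tree = kruskal(4, edges)
print(sum(w for w, _, _ in tree))  # 7
```

Each accepted edge is the minimum-weight edge crossing the cut between its two components, so the cut property guarantees every greedy choice is safe.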
Real applications:
- Network design (internet backbone)
- VLSI circuit design
- Cluster analysis (single-linkage clustering is MST-based)
- Approximation algorithms (TSP, Steiner tree)
When greedy thinking works in life:
- “Earliest deadline first” for task prioritization (often optimal!)
- “Always take the closest parking spot” (optimal for minimizing walking)
- “Merge lanes by taking turns” (zipper merge is greedy and optimal)
When greedy thinking fails:
- Career planning: greedy salary maximization might miss better long-term paths
- Route planning: locally shortest turns might yield longer overall route
- Investment: always chasing highest returns leads to excessive risk
The lesson: Recognizing when to think greedily vs. when to plan ahead (dynamic programming) is a fundamental life skill. Algorithm design formalizes this intuition.
- Kleinberg & Tardos §4 - Excellent greedy algorithm design patterns and proof techniques
- Skiena §10.1-10.3 - War stories: when greedy works and when it fails spectacularly
- CLRS Ch 16 - Comprehensive treatment including matroid theory
- Erickson Chapter 6 - Practical approach with emphasis on exchange arguments
- DPV §5 - Concise treatment focusing on key examples
- Korte & Vygen “Combinatorial Optimization” - Matroid theory in depth
- Vazirani “Approximation Algorithms” §3 - Greedy approximations including Set Cover
- Tim Roughgarden “Algorithms Illuminated” - Part 3 includes greedy algorithms with detailed proofs
- MIT 6.046J - Greedy algorithms and exchange arguments
- Stanford CS161 - Problem-motivated greedy design
- Huffman (1952) - Original paper introducing Huffman coding
- Edmonds (1971) - Matroid optimization (fundamental theory)
- Nemhauser, Wolsey, Fisher (1978) - Submodular function maximization
- Dynamic Programming: When greedy fails due to lack of optimal substructure, DP often succeeds
- Approximation Algorithms: Greedy provides constant-factor approximations for many NP-hard problems (Set Cover, Vertex Cover, Submodular Maximization)
- Online Algorithms: Many online algorithms are inherently greedy (must decide immediately)
- Matroid Theory: Provides the mathematical framework explaining when greedy algorithms work
- Graph Algorithms: MST and shortest paths are fundamental greedy algorithms
- Machine Learning: Greedy feature selection, decision tree construction
Compression Toolkit
- Implement Huffman coding from scratch
- Compare compression ratios on different file types (text, code, random data)
- Visualize Huffman tree construction
- Explore adaptive Huffman (updating codes as data streams)
Scheduling Optimizer
- Implement interval scheduling and variants (weighted intervals, resource constraints)
- Build visualization showing greedy choices
- Compare greedy to optimal DP solution for weighted case
- Apply to real calendar data
Network Design Simulator
- Implement Kruskal’s and Prim’s MST algorithms
- Visualize algorithm execution on random graphs
- Compare performance on different graph structures
- Extend to Steiner tree approximation
Greedy vs. Optimal Explorer
- Build interactive tool showing problems where greedy works vs. fails
- Coin change: US coins (greedy works) vs. custom systems (greedy fails)
- Shortest path: non-negative (Dijkstra works) vs. negative weights (Dijkstra fails)
- Knapsack: fractional (greedy works) vs. 0/1 (greedy fails)
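The coin-change contrast in the explorer can be demonstrated in a few lines; this sketch compares largest-coin-first greedy against a simple dynamic-programming optimum (function names and the {1, 3, 4} system are illustrative):

```python
def greedy_change(coins, amount):
    """Largest-coin-first; returns a coin count that may be suboptimal."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += amount // c
        amount %= c
    return count if amount == 0 else None  # None if greedy gets stuck

def optimal_change(coins, amount):
    """Exact minimum coin count via dynamic programming."""
    best = [0] + [None] * amount
    for a in range(1, amount + 1):
        options = [best[a - c] for c in coins if c <= a and best[a - c] is not None]
        best[a] = 1 + min(options) if options else None
    return best[amount]

# US coins: greedy matches the optimum (6 coins for 63 cents).
print(greedy_change([1, 5, 10, 25], 63), optimal_change([1, 5, 10, 25], 63))  # 6 6
# Custom system {1, 3, 4}: greedy takes 4+1+1 (3 coins); the optimum is 3+3 (2 coins).
print(greedy_change([1, 3, 4], 6), optimal_change([1, 3, 4], 6))  # 3 2
```

The same two functions make a natural back end for the interactive tool: any denomination system where the two outputs disagree is a greedy-fails exhibit.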
Set Cover Approximation
- Implement greedy Set Cover algorithm
- Measure approximation ratio on random instances
- Compare to integer programming solution
- Apply to real-world covering problems (test suite selection, sensor placement)
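As a starting point for the Set Cover project, the greedy rule—always pick the subset covering the most uncovered elements—fits in a few lines (function name and the sample instance are illustrative):

```python
def greedy_set_cover(universe, subsets):
    """Repeatedly choose the subset covering the most uncovered elements.
    Returns indices into `subsets`; achieves an H_n = O(log n) approximation."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Greedy choice: maximum overlap with what is still uncovered.
        i = max(range(len(subsets)), key=lambda j: len(subsets[j] & uncovered))
        if not subsets[i] & uncovered:
            raise ValueError("universe cannot be covered by the given subsets")
        uncovered -= subsets[i]
        chosen.append(i)
    return chosen

universe = {1, 2, 3, 4, 5}
subsets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(greedy_set_cover(universe, subsets))  # [0, 3]
```

Measuring the approximation ratio then amounts to comparing `len(chosen)` against an exact cover found by exhaustive search or integer programming on small instances.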
2-3 lectures depending on depth
- 1.5 lectures: Core greedy algorithms (activity selection, Huffman, MST, Dijkstra)
- 2 lectures: + Exchange argument proofs, greedy approximation algorithms
- 3 lectures: + Matroid theory, advanced greedy techniques, submodular optimization
“Greedy algorithms are just heuristics”
- False! Many greedy algorithms are provably optimal (MST, Dijkstra, Huffman). Others provide approximation guarantees.
“If greedy is optimal for one formulation, it’s always optimal”
- False! Fractional knapsack: greedy optimal. 0/1 knapsack: greedy fails. Coin change: depends on coin system.
“Greedy is always faster than dynamic programming”
- Usually true, but not always. Both can be O(n log n) depending on implementation. Advantage of greedy is simplicity, not just speed.
“Exchange arguments are the only way to prove greedy correctness”
- False! Can also use induction, cut property (for graphs), or matroid theory. Exchange arguments are most common because they’re intuitive.
“Greedy algorithms don’t need fancy data structures”
- False! Efficient implementation often requires priority queues (Dijkstra, Prim’s), Union-Find (Kruskal’s), etc.
Traditional (banking model):
“A greedy algorithm makes the locally optimal choice at each stage. Here’s the activity selection algorithm: sort by end time, select non-conflicting meetings. Memorize this. Here’s the proof by exchange argument. Apply to homework problems.”
Problem-posing approach:
“You manage a conference room with conflicting meeting requests. How do you maximize usage? What strategy feels right? Let’s try ’earliest start time’—here’s a counterexample. What about ‘shortest duration’—another counterexample. What if we think about which choice leaves the most flexibility? Ah! ‘Earliest end time’ makes sense—let’s see if we can prove it works…”
The difference: Students discover greedy strategies by exploring what works and what fails. They develop intuition for when to try greedy approaches and how to prove correctness. The exchange argument emerges as a natural way to formalize the intuition that “if optimal solution differs, we can swap to our greedy choice.”
Curriculum placement:
- Week 2-3 for graduate courses (after divide-and-conquer, alongside or before DP)
- Week 3-4 for advanced undergraduate
Prerequisites check: Students should understand:
- Basic algorithm analysis
- Proof by contradiction and induction
- Graph terminology (for MST and shortest paths)
Why early: Greedy algorithms are conceptually simple and provide excellent practice for proof techniques (especially exchange arguments). They also motivate dynamic programming—when students see greedy fail (0/1 knapsack, weighted interval scheduling), they ask “what technique handles this?” and DP emerges as the answer.
Pedagogical note: Start with activity selection (clean example with visual intuition). Then Huffman (more complex but compelling application). Then MST (connects to graphs, has multiple valid greedy strategies). Prove at least one algorithm rigorously with exchange argument so students see the technique.
Ready for students? ✓ Yes - strong motivation through real-world applications, clear progression from simple to complex, emphasis on proof techniques, contrasts with DP to clarify when each approach applies.
