Shortest Paths (Dijkstra, Bellman-Ford, Floyd-Warshall)
Algorithms for finding minimum-cost paths in weighted graphs—one of the most practically important problems in computer science. Dijkstra’s algorithm handles non-negative weights efficiently, Bellman-Ford handles negative weights and detects negative cycles, and Floyd-Warshall computes all-pairs shortest paths. These algorithms power navigation systems, network routing, and optimization across countless domains.
Introductory to Advanced
- Dijkstra’s algorithm: Introductory to Intermediate
- Bellman-Ford: Intermediate
- Floyd-Warshall: Intermediate
- Johnson’s algorithm (combining techniques): Advanced
- Recent breakthroughs (negative-weight SSSP): Research-level
The algorithms that get you home: Shortest path algorithms aren’t theoretical—they’re embedded in everyday life:
- GPS navigation (Google Maps, Waze, Apple Maps)
- Network packet routing (BGP, OSPF)
- Robot motion planning
- Resource optimization in operations research
War Story 1 - Google Maps Changed the World: Every time you use Google Maps, Dijkstra’s algorithm (or a variant) runs on a graph with:
- Hundreds of millions of vertices (road intersections)
- Billions of edges (road segments)
- Dynamic weights (current traffic conditions)
The algorithm computes routes in milliseconds. Google optimized this so aggressively that they:
- Precompute distance tables for major landmarks
- Use hierarchical road networks (highways vs. local roads)
- Cache common queries
- Update weights in real-time based on traffic data
Impact: Saves billions of person-hours annually in reduced travel time. Entire industries (ride-sharing, delivery) depend on real-time shortest path computation.
War Story 2 - The Negative Cycle Arbitrage: In finance, currency exchange rates form a weighted directed graph:
- Vertices = currencies (USD, EUR, GBP, JPY)
- Edge (u,v) = exchange rate from u to v
- Edge weight = log(exchange_rate) (transforms multiplicative to additive)
Key insight: A negative cycle in this graph = arbitrage opportunity!
- USD → EUR → GBP → USD with net gain
- Bellman-Ford detects such cycles
- High-frequency trading firms use variants to identify arbitrage
Reality check: In practice, transaction costs and speed limitations make most arbitrage impossible. But the conceptual connection is beautiful: negative cycle detection has direct financial applications.
War Story 3 - The Internet Routing Crisis: In 1997, a misconfigured router caused a BGP routing crisis affecting much of the internet. Problem: routers use shortest path algorithms to direct packets, but incorrect distance information propagated, creating routing loops and black holes.
Solution required:
- Detecting inconsistencies (negative cycle-like behavior)
- Recomputing paths with correct weights
- Protocols to prevent such misconfigurations
Shortest path algorithms literally keep the internet functioning.
- Dijkstra’s Algorithm - Single-source shortest paths with non-negative weights (O(E + V log V) with binary heap)
- Bellman-Ford Algorithm - Single-source shortest paths with negative weights, detects negative cycles (O(VE))
- Floyd-Warshall Algorithm - All-pairs shortest paths via dynamic programming (O(V³))
- BFS for Unweighted Graphs - Special case: O(V + E) shortest paths when all edges have weight 1
- DAG Shortest Paths - O(V + E) using topological sort (handles negative weights in DAGs)
- Johnson’s Algorithm - All-pairs shortest paths faster than Floyd-Warshall for sparse graphs (O(VE + V² log V))
- Bidirectional Dijkstra - Search from both source and target simultaneously
- A Search* - Dijkstra with heuristics for faster goal-directed search
- Contraction Hierarchies - Preprocessing for ultra-fast queries in road networks
- Recent Breakthrough: Bernstein-Nanongkai-Wulff-Nilsen - Negative-weight single-source shortest paths in O(nm^(3/4) log^(O(1)) n) (2022)
Required:
- Graph traversal (BFS, DFS)
- Priority queues / heaps (for Dijkstra)
- Basic dynamic programming (for Floyd-Warshall)
- Graph representations (adjacency lists vs. matrices)
- Big-O notation
Helpful:
- Understanding of greedy algorithms (Dijkstra is greedy)
- Relaxation concept (central to all shortest path algorithms)
- Proof by contradiction (for correctness arguments)
Setup: You’re planning a road trip across the US. You have:
- Cities (vertices)
- Roads with distances (weighted edges)
- Goal: Find shortest route from New York to San Francisco
Naive approach: Try all possible paths—exponential explosion!
Guided discovery:
“What if we keep track of ‘current best distance’ to each city?” “Start at NYC (distance 0). What about NYC’s neighbors?” “If we can reach Philadelphia via NYC in 100 miles, and we later find a 90-mile route, we should update!”
The key concept: Relaxation
if dist[u] + weight(u,v) < dist[v]:
dist[v] = dist[u] + weight(u,v)
parent[v] = u // for path reconstruction
Building intuition:
“Which city should we process next? The one with smallest confirmed distance!” “Why? If we know city A is 200 miles away (shortest possible), and we haven’t processed it yet, we should explore from there before looking at cities 500+ miles away.”
Dijkstra emerges:
- Maintain priority queue of unvisited cities by distance
- Always process city with minimum distance
- Relax all outgoing edges
- Repeat until destination reached
Why it works: Greedy choice (closest unvisited vertex) is safe because all weights are non-negative. Once we’ve processed a vertex, we know its shortest distance.
Real-world complexity: Google Maps handles billions of edges. Optimizations:
- A* with distance heuristic (straight-line distance)
- Bidirectional search
- Hierarchical road networks (express lanes vs. local)
Setup: You’re routing network packets. Normally, links have positive latency (cost). But what if:
- Some links have “negative latency” (caching, compression)
- A cycle of negative edges exists: A → B → C → A with net negative cost
The problem: You could loop forever, reducing cost each iteration!
This is a negative cycle. Shortest path is undefined (−∞).
Dijkstra fails here! The greedy assumption breaks down with negative weights.
Bellman-Ford to the rescue:
Relax all edges V-1 times:
for i = 1 to V-1:
for each edge (u,v):
relax(u, v)
Check for negative cycles:
for each edge (u,v):
if we can still relax:
negative cycle exists!
Why V-1 iterations? Any shortest simple path has at most V-1 edges. After V-1 iterations, all shortest distances are computed (if no negative cycles exist).
The Vth iteration: If we can still improve distances, there’s a negative cycle reachable from source.
Real applications:
- Currency arbitrage detection (as in War Story 2)
- Resource allocation with penalties/bonuses
- Networking protocols with link costs
Setup: You’re designing a routing table for a network with 1000 nodes. For any pair of nodes, you need the shortest path.
Naive approach: Run Dijkstra from each node: O(V · (E + V log V)) = O(V² log V + VE)
Better idea for dense graphs: Floyd-Warshall
The insight: Use dynamic programming on intermediate vertices.
Let dist[i][j][k] = shortest path from i to j using only vertices {1, 2, …, k} as intermediate vertices.
Recurrence:
dist[i][j][k] = min(
dist[i][j][k-1], // don't use vertex k
dist[i][k][k-1] + dist[k][j][k-1] // use vertex k
)
Space optimization: Only need 2D array (current iteration overwrites previous).
Final algorithm:
dist = adjacency matrix (edge weights)
for k in range(V):
for i in range(V):
for j in range(V):
dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
Remarkably simple code for such a powerful result!
Why this matters:
- O(V³) for all pairs vs. O(V³ log V) running Dijkstra V times
- Naturally handles negative edges (detects negative cycles if dist[i][i] < 0)
- Conceptually elegant (pure DP)
Real usage:
- Small to medium networks (routing tables, social networks)
- Dense graphs where E ≈ V²
- When all-pairs information is needed frequently
Optimal decision-making: Many life decisions involve shortest paths on implicit graphs:
- Career paths: states = jobs, edges = transitions, weights = time/effort/satisfaction
- Course planning: states = knowledge, edges = courses, weights = difficulty
- Route planning: literal application
The greedy principle: Dijkstra’s greedy strategy (always process closest vertex) mirrors human heuristics:
- In problem-solving, tackle easiest accessible subproblems first
- In learning, master prerequisites before advanced topics
Negative cycles in life: Recognizing negative cycles = recognizing unproductive patterns:
- Procrastination loops (anxiety → avoidance → more anxiety)
- Negative thought spirals
- Unprofitable business models
Bellman-Ford’s cycle detection formalizes “something’s wrong here—this pattern shouldn’t exist.”
- CLRS Ch 24-25 - Comprehensive treatment with detailed proofs
- Kleinberg & Tardos §4.4-4.6 - Excellent problem-motivated approach
- Skiena §8 - War stories: real-world shortest path applications
- Erickson Ch 9 - Clean presentation with focus on intuition
- DPV §4.4-4.7 - Concise, emphasizes key ideas
- Sedgewick & Wayne “Algorithms” Ch 4 - Practical implementations with visualizations
- Sanders & Schulz “Algorithm Engineering” - High-performance implementations for large graphs
- VisuAlgo (visualgo.net/en/sssp) - Interactive Dijkstra, Bellman-Ford animations
- Algorithm Visualizer - Step-by-step execution
- USF Algorithm Visualizations - Clear priority queue visualization for Dijkstra
- MIT 6.006 (Demaine) - Classic shortest paths lectures
- Tim Roughgarden “Algorithms Illuminated” - Detailed derivations and proofs
- William Fiset YouTube series - Clear explanations with code
- Dijkstra (1959) - Original paper introducing the algorithm
- Bernstein et al. (2022) - Breakthrough nearly-linear negative-weight SSSP
- Thorup (1999) - O(V + E) undirected SSSP (non-comparison model)
- Sanders & Schulz (2012) - Highway hierarchies for road networks
- Greedy Algorithms: Dijkstra is a greedy algorithm with correctness proof
- Dynamic Programming: Bellman-Ford and Floyd-Warshall use DP principles
- Graph Traversal: BFS is unweighted shortest paths
- Network Flows: Min-cost max-flow uses shortest path as subroutine
- Approximation Algorithms: Metric TSP uses MST/matching, relates to shortest paths
- NP-Complete Problems: Many reductions use shortest path gadgets
- Distributed Systems: Routing protocols (BGP, OSPF) implement variants
Navigation System Simulator
- Implement Dijkstra with real map data (OpenStreetMap)
- Visualize shortest path computation
- Add A* heuristic for speedup
- Compare performance: Dijkstra vs. bidirectional vs. A*
- Integrate real-time traffic (dynamic edge weights)
Currency Arbitrage Detector
- Implement Bellman-Ford for negative cycle detection
- Use real exchange rate APIs
- Transform multiplicative rates to additive (logarithms)
- Detect arbitrage opportunities
- Account for transaction costs
All-Pairs Distance Calculator
- Implement Floyd-Warshall
- Visualize DP table updates
- Compare to running Dijkstra V times
- Test on different graph densities
- Add path reconstruction
Routing Protocol Simulator
- Implement distance-vector routing (Bellman-Ford-based)
- Simulate network with dynamic link failures
- Show convergence behavior
- Detect routing loops
- Compare to link-state routing (Dijkstra-based)
Algorithm Race
- Benchmark suite comparing shortest path algorithms
- Vary: graph size, density, weight ranges, query patterns
- Measure: time, space, cache performance
- Identify crossover points (when to use which algorithm)
- Visualize performance profiles
3-4 lectures depending on depth
- 2 lectures: Dijkstra (with correctness proof), Bellman-Ford basics
- 3 lectures: + Floyd-Warshall, negative cycle detection, applications
- 4 lectures: + Johnson’s algorithm, A* search, practical optimizations (contraction hierarchies, highway hierarchies)
“Dijkstra works with negative weights if there are no negative cycles”
- False! Dijkstra can fail even without negative cycles. Greedy choice isn’t safe with negative weights.
“Bellman-Ford is always better because it handles negative weights”
- False! It’s O(VE) vs. Dijkstra’s O((E + V) log V). Use Dijkstra when all weights are non-negative.
“Floyd-Warshall is always slower than V calls to Dijkstra”
- False! For dense graphs (E ≈ V²), Floyd-Warshall is O(V³) vs. O(V³ log V) for repeated Dijkstra. Also simpler to implement.
“Shortest paths are unique”
- False! Multiple paths can have the same length. Algorithms find one shortest path, not all.
“BFS finds shortest paths in weighted graphs”
- False! BFS only works for unweighted (or uniform-weight) graphs. Weighted graphs need Dijkstra/Bellman-Ford.
“You can’t have shortest paths with negative edges”
- False! Negative edges are fine. Negative cycles make shortest paths undefined.
Traditional (banking model):
“Dijkstra’s algorithm uses a priority queue. Here’s the pseudocode. Proof of correctness by induction. Memorize the O((E + V) log V) runtime. Bellman-Ford does V-1 relaxations. Floyd-Warshall is three nested loops. Apply to homework graphs.”
Problem-posing approach:
“You need to find the fastest route from NYC to San Francisco. How would you approach this? What if you track ‘best known distance’ to each city and update as you discover better paths? Which city should you explore next—why the closest one? Let’s formalize this… Ah, we’ve discovered Dijkstra! Now what if some roads give you time back (negative weights)—does our strategy still work? Let’s find a counterexample…”
The difference: Students discover why algorithms work the way they do by exploring the problem space. They understand:
- Why relaxation is central (updating best-known distances)
- Why greedy works for non-negative weights (Dijkstra)
- Why we need more careful approaches for negative weights (Bellman-Ford)
- Why all-pairs has special structure enabling DP (Floyd-Warshall)
The design process emerges organically rather than being presented as finished products.
Curriculum placement:
- Week 4-6 for graduate courses (after graph traversal, with or after greedy/DP)
- Week 5-7 for advanced undergraduate
Prerequisites check: Students should understand:
- Graph traversal (BFS as foundation)
- Priority queues (for Dijkstra implementation)
- Dynamic programming basics (for Floyd-Warshall)
- Greedy algorithm correctness proofs (for Dijkstra)
Why this timing:
- Requires graph traversal foundation
- Dijkstra connects to greedy algorithms
- Bellman-Ford/Floyd-Warshall connect to dynamic programming
- Essential for network flows (comes later)
Pedagogical note: Start with unweighted BFS (review), then Dijkstra (weighted generalization), then Bellman-Ford (negative weights), finally Floyd-Warshall (all-pairs). Build complexity gradually. Use visualizations heavily—shortest path algorithms are highly visual.
Ready for students? ✓ Yes - strong real-world motivation, clear progression from simple to complex, multiple entry points for different backgrounds, emphasis on understanding over memorization.
