Shortest Paths (Dijkstra, Bellman-Ford, Floyd-Warshall)

Overview

Algorithms for finding minimum-cost paths in weighted graphs—one of the most practically important problems in computer science. Dijkstra’s algorithm handles non-negative weights efficiently, Bellman-Ford handles negative weights and detects negative cycles, and Floyd-Warshall computes all-pairs shortest paths. These algorithms power navigation systems, network routing, and optimization across countless domains.

Difficulty Level

Introductory to Advanced

Dijkstra’s algorithm: Introductory to Intermediate
Bellman-Ford: Intermediate
Floyd-Warshall: Intermediate
Johnson’s algorithm (combining techniques): Advanced
Recent breakthroughs (negative-weight SSSP): Research-level

Why This Matters

The algorithms that get you home: Shortest path algorithms aren’t theoretical—they’re embedded in everyday life:

GPS navigation (Google Maps, Waze, Apple Maps)
Network packet routing (BGP, OSPF)
Robot motion planning
Resource optimization in operations research

War Story 1 - Google Maps Changed the World: Every time you use Google Maps, Dijkstra’s algorithm (or a variant) runs on a graph with:

Hundreds of millions of vertices (road intersections)
Billions of edges (road segments)
Dynamic weights (current traffic conditions)

The algorithm computes routes in milliseconds. Google optimized this so aggressively that they:

Precompute distance tables for major landmarks
Use hierarchical road networks (highways vs. local roads)
Cache common queries
Update weights in real-time based on traffic data

Impact: Saves billions of person-hours annually in reduced travel time. Entire industries (ride-sharing, delivery) depend on real-time shortest path computation.

War Story 2 - The Negative Cycle Arbitrage: In finance, currency exchange rates form a weighted directed graph:

Vertices = currencies (USD, EUR, GBP, JPY)
Edge (u,v) = exchange rate from u to v
Edge weight = log(exchange_rate) (transforms multiplicative to additive)

Key insight: A negative cycle in this graph = arbitrage opportunity!

USD → EUR → GBP → USD with net gain
Bellman-Ford detects such cycles
High-frequency trading firms use variants to identify arbitrage

Reality check: In practice, transaction costs and speed limitations make most arbitrage impossible. But the conceptual connection is beautiful: negative cycle detection has direct financial applications.

War Story 3 - The Internet Routing Crisis: In 1997, a misconfigured router caused a BGP routing crisis affecting much of the internet. Problem: routers use shortest path algorithms to direct packets, but incorrect distance information propagated, creating routing loops and black holes.

Solution required:

Detecting inconsistencies (negative cycle-like behavior)
Recomputing paths with correct weights
Protocols to prevent such misconfigurations

Shortest path algorithms literally keep the internet functioning.

Core Algorithms

Essential

Dijkstra’s Algorithm - Single-source shortest paths with non-negative weights (O(E + V log V) with binary heap)
Bellman-Ford Algorithm - Single-source shortest paths with negative weights, detects negative cycles (O(VE))
Floyd-Warshall Algorithm - All-pairs shortest paths via dynamic programming (O(V³))

Intermediate

BFS for Unweighted Graphs - Special case: O(V + E) shortest paths when all edges have weight 1
DAG Shortest Paths - O(V + E) using topological sort (handles negative weights in DAGs)
Johnson’s Algorithm - All-pairs shortest paths faster than Floyd-Warshall for sparse graphs (O(VE + V² log V))

Advanced

Bidirectional Dijkstra - Search from both source and target simultaneously
A Search* - Dijkstra with heuristics for faster goal-directed search
Contraction Hierarchies - Preprocessing for ultra-fast queries in road networks
Recent Breakthrough: Bernstein-Nanongkai-Wulff-Nilsen - Negative-weight single-source shortest paths in O(nm^(3/4) log^(O(1)) n) (2022)

Prerequisites

Required:
- Graph traversal (BFS, DFS)
- Priority queues / heaps (for Dijkstra)
- Basic dynamic programming (for Floyd-Warshall)
- Graph representations (adjacency lists vs. matrices)
- Big-O notation
Helpful:
- Understanding of greedy algorithms (Dijkstra is greedy)
- Relaxation concept (central to all shortest path algorithms)
- Proof by contradiction (for correctness arguments)

Problem-Posing Approaches

Story 1: The Road Trip Optimizer

Setup: You’re planning a road trip across the US. You have:

Cities (vertices)
Roads with distances (weighted edges)
Goal: Find shortest route from New York to San Francisco

Naive approach: Try all possible paths—exponential explosion!

Guided discovery:

“What if we keep track of ‘current best distance’ to each city?” “Start at NYC (distance 0). What about NYC’s neighbors?” “If we can reach Philadelphia via NYC in 100 miles, and we later find a 90-mile route, we should update!”

The key concept: Relaxation

if dist[u] + weight(u,v) < dist[v]:
    dist[v] = dist[u] + weight(u,v)
    parent[v] = u  // for path reconstruction

Building intuition:

“Which city should we process next? The one with smallest confirmed distance!” “Why? If we know city A is 200 miles away (shortest possible), and we haven’t processed it yet, we should explore from there before looking at cities 500+ miles away.”

Dijkstra emerges:

Maintain priority queue of unvisited cities by distance
Always process city with minimum distance
Relax all outgoing edges
Repeat until destination reached

Why it works: Greedy choice (closest unvisited vertex) is safe because all weights are non-negative. Once we’ve processed a vertex, we know its shortest distance.

Real-world complexity: Google Maps handles billions of edges. Optimizations:

A* with distance heuristic (straight-line distance)
Bidirectional search
Hierarchical road networks (express lanes vs. local)

Story 2: The Time Travel Paradox

Setup: You’re routing network packets. Normally, links have positive latency (cost). But what if:

Some links have “negative latency” (caching, compression)
A cycle of negative edges exists: A → B → C → A with net negative cost

The problem: You could loop forever, reducing cost each iteration!

This is a negative cycle. Shortest path is undefined (−∞).

Dijkstra fails here! The greedy assumption breaks down with negative weights.

Bellman-Ford to the rescue:

Relax all edges V-1 times:
    for i = 1 to V-1:
        for each edge (u,v):
            relax(u, v)

Check for negative cycles:
    for each edge (u,v):
        if we can still relax:
            negative cycle exists!

Why V-1 iterations? Any shortest simple path has at most V-1 edges. After V-1 iterations, all shortest distances are computed (if no negative cycles exist).

The Vth iteration: If we can still improve distances, there’s a negative cycle reachable from source.

Real applications:

Currency arbitrage detection (as in War Story 2)
Resource allocation with penalties/bonuses
Networking protocols with link costs

Story 3: The All-Pairs Challenge

Setup: You’re designing a routing table for a network with 1000 nodes. For any pair of nodes, you need the shortest path.

Naive approach: Run Dijkstra from each node: O(V · (E + V log V)) = O(V² log V + VE)

Better idea for dense graphs: Floyd-Warshall

The insight: Use dynamic programming on intermediate vertices.

Let dist[i][j][k] = shortest path from i to j using only vertices {1, 2, …, k} as intermediate vertices.

Recurrence:

dist[i][j][k] = min(
    dist[i][j][k-1],              // don't use vertex k
    dist[i][k][k-1] + dist[k][j][k-1]  // use vertex k
)

Space optimization: Only need 2D array (current iteration overwrites previous).

Final algorithm:

dist = adjacency matrix (edge weights)
for k in range(V):
    for i in range(V):
        for j in range(V):
            dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])

Remarkably simple code for such a powerful result!

Why this matters:

O(V³) for all pairs vs. O(V³ log V) running Dijkstra V times
Naturally handles negative edges (detects negative cycles if dist[i][i] < 0)
Conceptually elegant (pure DP)

Real usage:

Small to medium networks (routing tables, social networks)
Dense graphs where E ≈ V²
When all-pairs information is needed frequently

Cognitive Applications (from Algorithms to Live By)

Optimal decision-making: Many life decisions involve shortest paths on implicit graphs:

Career paths: states = jobs, edges = transitions, weights = time/effort/satisfaction
Course planning: states = knowledge, edges = courses, weights = difficulty
Route planning: literal application

The greedy principle: Dijkstra’s greedy strategy (always process closest vertex) mirrors human heuristics:

In problem-solving, tackle easiest accessible subproblems first
In learning, master prerequisites before advanced topics

Negative cycles in life: Recognizing negative cycles = recognizing unproductive patterns:

Procrastination loops (anxiety → avoidance → more anxiety)
Negative thought spirals
Unprofitable business models

Bellman-Ford’s cycle detection formalizes “something’s wrong here—this pattern shouldn’t exist.”

Key Resources

Textbooks

CLRS Ch 24-25 - Comprehensive treatment with detailed proofs
Kleinberg & Tardos §4.4-4.6 - Excellent problem-motivated approach
Skiena §8 - War stories: real-world shortest path applications
Erickson Ch 9 - Clean presentation with focus on intuition
DPV §4.4-4.7 - Concise, emphasizes key ideas

Specialized Resources

Sedgewick & Wayne “Algorithms” Ch 4 - Practical implementations with visualizations
Sanders & Schulz “Algorithm Engineering” - High-performance implementations for large graphs

Visualizations

VisuAlgo (visualgo.net/en/sssp) - Interactive Dijkstra, Bellman-Ford animations
Algorithm Visualizer - Step-by-step execution
USF Algorithm Visualizations - Clear priority queue visualization for Dijkstra

Videos & Lectures

MIT 6.006 (Demaine) - Classic shortest paths lectures
Tim Roughgarden “Algorithms Illuminated” - Detailed derivations and proofs
William Fiset YouTube series - Clear explanations with code

Papers & Advanced

Dijkstra (1959) - Original paper introducing the algorithm
Bernstein et al. (2022) - Breakthrough nearly-linear negative-weight SSSP
Thorup (1999) - O(V + E) undirected SSSP (non-comparison model)
Sanders & Schulz (2012) - Highway hierarchies for road networks

Connections to Other Topics

Greedy Algorithms: Dijkstra is a greedy algorithm with correctness proof
Dynamic Programming: Bellman-Ford and Floyd-Warshall use DP principles
Graph Traversal: BFS is unweighted shortest paths
Network Flows: Min-cost max-flow uses shortest path as subroutine
Approximation Algorithms: Metric TSP uses MST/matching, relates to shortest paths
NP-Complete Problems: Many reductions use shortest path gadgets
Distributed Systems: Routing protocols (BGP, OSPF) implement variants

Possible Projects

Navigation System Simulator
- Implement Dijkstra with real map data (OpenStreetMap)
- Visualize shortest path computation
- Add A* heuristic for speedup
- Compare performance: Dijkstra vs. bidirectional vs. A*
- Integrate real-time traffic (dynamic edge weights)
Currency Arbitrage Detector
- Implement Bellman-Ford for negative cycle detection
- Use real exchange rate APIs
- Transform multiplicative rates to additive (logarithms)
- Detect arbitrage opportunities
- Account for transaction costs
All-Pairs Distance Calculator
- Implement Floyd-Warshall
- Visualize DP table updates
- Compare to running Dijkstra V times
- Test on different graph densities
- Add path reconstruction
Routing Protocol Simulator
- Implement distance-vector routing (Bellman-Ford-based)
- Simulate network with dynamic link failures
- Show convergence behavior
- Detect routing loops
- Compare to link-state routing (Dijkstra-based)
Algorithm Race
- Benchmark suite comparing shortest path algorithms
- Vary: graph size, density, weight ranges, query patterns
- Measure: time, space, cache performance
- Identify crossover points (when to use which algorithm)
- Visualize performance profiles

Estimated Coverage Time

3-4 lectures depending on depth

2 lectures: Dijkstra (with correctness proof), Bellman-Ford basics
3 lectures: + Floyd-Warshall, negative cycle detection, applications
4 lectures: + Johnson’s algorithm, A* search, practical optimizations (contraction hierarchies, highway hierarchies)

Common Misconceptions

“Dijkstra works with negative weights if there are no negative cycles”
- False! Dijkstra can fail even without negative cycles. Greedy choice isn’t safe with negative weights.
“Bellman-Ford is always better because it handles negative weights”
- False! It’s O(VE) vs. Dijkstra’s O((E + V) log V). Use Dijkstra when all weights are non-negative.
“Floyd-Warshall is always slower than V calls to Dijkstra”
- False! For dense graphs (E ≈ V²), Floyd-Warshall is O(V³) vs. O(V³ log V) for repeated Dijkstra. Also simpler to implement.
“Shortest paths are unique”
- False! Multiple paths can have the same length. Algorithms find one shortest path, not all.
“BFS finds shortest paths in weighted graphs”
- False! BFS only works for unweighted (or uniform-weight) graphs. Weighted graphs need Dijkstra/Bellman-Ford.
“You can’t have shortest paths with negative edges”
- False! Negative edges are fine. Negative cycles make shortest paths undefined.

What Makes This Problem-Posing?

Traditional (banking model):

“Dijkstra’s algorithm uses a priority queue. Here’s the pseudocode. Proof of correctness by induction. Memorize the O((E + V) log V) runtime. Bellman-Ford does V-1 relaxations. Floyd-Warshall is three nested loops. Apply to homework graphs.”

Problem-posing approach:

“You need to find the fastest route from NYC to San Francisco. How would you approach this? What if you track ‘best known distance’ to each city and update as you discover better paths? Which city should you explore next—why the closest one? Let’s formalize this… Ah, we’ve discovered Dijkstra! Now what if some roads give you time back (negative weights)—does our strategy still work? Let’s find a counterexample…”

The difference: Students discover why algorithms work the way they do by exploring the problem space. They understand:

Why relaxation is central (updating best-known distances)
Why greedy works for non-negative weights (Dijkstra)
Why we need more careful approaches for negative weights (Bellman-Ford)
Why all-pairs has special structure enabling DP (Floyd-Warshall)

The design process emerges organically rather than being presented as finished products.

When to Cover This

Curriculum placement:

Week 4-6 for graduate courses (after graph traversal, with or after greedy/DP)
Week 5-7 for advanced undergraduate

Prerequisites check: Students should understand:

Graph traversal (BFS as foundation)
Priority queues (for Dijkstra implementation)
Dynamic programming basics (for Floyd-Warshall)
Greedy algorithm correctness proofs (for Dijkstra)

Why this timing:

Requires graph traversal foundation
Dijkstra connects to greedy algorithms
Bellman-Ford/Floyd-Warshall connect to dynamic programming
Essential for network flows (comes later)

Pedagogical note: Start with unweighted BFS (review), then Dijkstra (weighted generalization), then Bellman-Ford (negative weights), finally Floyd-Warshall (all-pairs). Build complexity gradually. Use visualizations heavily—shortest path algorithms are highly visual.

Ready for students? ✓ Yes - strong real-world motivation, clear progression from simple to complex, multiple entry points for different backgrounds, emphasis on understanding over memorization.