Prim's Algorithm

Visualize minimum spanning tree construction in weighted graphs

What is Prim's Algorithm?

Prim's algorithm is a greedy algorithm that finds a minimum spanning tree (MST) for a weighted undirected graph. Developed by Czech mathematician Vojtěch Jarník in 1930 and later rediscovered by Robert C. Prim in 1957, it's one of the fundamental algorithms for finding MSTs.

How It Works

The algorithm starts from an arbitrary node and grows the MST by repeatedly adding the minimum-weight edge that connects a visited node to an unvisited node:

  1. Start with an arbitrary node and mark it as visited
  2. Add all edges from the visited node to unvisited nodes to a priority queue
  3. Select the edge with minimum weight that connects visited to unvisited
  4. Add this edge to the MST and mark the target node as visited
  5. Add new edges from the newly visited node to the priority queue
  6. Repeat until all nodes are visited
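The six steps above can be traced as a minimal sketch on a three-node triangle graph (the graph values are illustrative; the tuple-based adjacency list matches the implementation below):

```python
import heapq

# Triangle graph: edges A-B (4), A-C (2), B-C (1); start at A.
graph = {'A': [('B', 4), ('C', 2)], 'B': [('A', 4), ('C', 1)], 'C': [('A', 2), ('B', 1)]}

visited = {'A'}                               # step 1: start node is visited
pq = [(w, 'A', v) for v, w in graph['A']]     # step 2: edges out of the start node
heapq.heapify(pq)
mst = []
while pq and len(visited) < len(graph):
    w, u, v = heapq.heappop(pq)               # step 3: minimum-weight edge in the queue
    if v in visited:
        continue                              # stale edge: both endpoints already in tree
    visited.add(v)                            # step 4: take the edge into the MST
    mst.append((u, v, w))
    for nbr, nw in graph[v]:                  # step 5: edges out of the newly added node
        if nbr not in visited:
            heapq.heappush(pq, (nw, v, nbr))
print(mst)  # step 6: loop ends once all nodes are visited -> [('A', 'C', 2), ('C', 'B', 1)]
```

Note that the cheap B-C edge displaces the heavier A-B edge: the queue always surfaces the lightest crossing edge first.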

Implementation

Here's a Python implementation using a priority queue:

from typing import Dict, List, Tuple
import heapq

def prim(graph: Dict[str, List[Tuple[str, float]]], start: str) -> List[Tuple[str, str, float]]:
    """
    Find the Minimum Spanning Tree (MST) of a connected, undirected, weighted graph.
    
    Uses Prim's algorithm with a priority queue to greedily select the minimum-weight
    edge connecting a visited node to an unvisited node. The result is a tree that
    connects all nodes with minimum total edge weight.
    
    Args:
        graph: Adjacency list representation. Each node maps to a list of (neighbor, weight) tuples.
               Graph must be undirected (edges should appear in both directions).
        start: Starting node identifier.
    
    Returns:
        List of (from_node, to_node, weight) tuples representing MST edges.
    
    Time Complexity: O((V + E) log V) where V is vertices and E is edges.
    Space Complexity: O(V + E) for priority queue and visited set.
    """
    visited: set[str] = {start}
    mst_edges: List[Tuple[str, str, float]] = []
    pq: List[Tuple[float, str, str]] = []
    
    # Add edges from start node
    for neighbor, weight in graph[start]:
        heapq.heappush(pq, (weight, start, neighbor))
    
    while pq and len(visited) < len(graph):
        weight, from_node, to_node = heapq.heappop(pq)
        
        if to_node in visited:
            continue
        
        visited.add(to_node)
        mst_edges.append((from_node, to_node, weight))
        
        # Add edges from newly visited node
        for neighbor, weight in graph[to_node]:
            if neighbor not in visited:
                heapq.heappush(pq, (weight, to_node, neighbor))
    
    return mst_edges

# Example
graph = {'A': [('B', 4), ('C', 2)], 'B': [('A', 4), ('C', 1)],
         'C': [('A', 2), ('B', 1), ('D', 3)], 'D': [('C', 3)]}
print(prim(graph, 'A'))
# [('A', 'C', 2), ('C', 'B', 1), ('C', 'D', 3)]

Time Complexity

The time complexity is O((V + E) log V) with a binary-heap priority queue, where V is the number of vertices and E is the number of edges; since E ≥ V − 1 in a connected graph, this is often written O(E log V). For dense graphs, where E approaches V², a simpler array-based variant that skips the heap entirely runs in O(V²), and a Fibonacci heap improves the bound to O(E + V log V).
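For dense graphs, the heap can be dropped: scanning every unvisited node for its cheapest connecting edge costs O(V) per step, giving O(V²) overall. A sketch of this array-based variant (the name `prim_dense` is illustrative; the adjacency-list format mirrors the implementation above):

```python
from typing import Dict, List, Tuple
import math

def prim_dense(graph: Dict[str, List[Tuple[str, float]]], start: str) -> List[Tuple[str, str, float]]:
    nodes = list(graph)
    # best_weight[v] = cheapest known edge connecting v to the growing tree.
    best_weight = {v: math.inf for v in nodes}
    best_parent: Dict[str, str] = {}
    best_weight[start] = 0.0
    in_tree: set = set()
    mst_edges: List[Tuple[str, str, float]] = []

    for _ in range(len(nodes)):
        # O(V) scan instead of a priority queue: pick the closest outside node.
        u = min((v for v in nodes if v not in in_tree), key=lambda v: best_weight[v])
        in_tree.add(u)
        if u != start:
            mst_edges.append((best_parent[u], u, best_weight[u]))
        # Relax edges out of u: record cheaper connections for outside neighbors.
        for neighbor, weight in graph[u]:
            if neighbor not in in_tree and weight < best_weight[neighbor]:
                best_weight[neighbor] = weight
                best_parent[neighbor] = u
    return mst_edges

graph = {'A': [('B', 4), ('C', 2)], 'B': [('A', 4), ('C', 1)],
         'C': [('A', 2), ('B', 1), ('D', 3)], 'D': [('C', 3)]}
print(prim_dense(graph, 'A'))  # [('A', 'C', 2), ('C', 'B', 1), ('C', 'D', 3)]
```

On the four-node example it returns the same MST as the heap-based version; the trade-off only pays off when the graph is dense enough that E log V exceeds V².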

Real-World Applications

  • Network design: connecting cities with minimum cable length
  • Cluster analysis: grouping similar data points
  • Image segmentation: partitioning images into regions
  • Circuit design: minimizing wire length in circuit boards
  • Approximation algorithms: solving traveling salesman problem variants

Kruskal's Algorithm

Kruskal's algorithm solves the same problem (finding a minimum spanning tree) but uses a different approach. Instead of growing the tree from a single node like Prim's, Kruskal's algorithm sorts all edges by weight and adds them one by one, skipping edges that would create cycles.

Here's a Python implementation of Kruskal's algorithm using a union-find data structure:

def find(parent, node):
    # Find the set representative, compressing the path along the way.
    if parent[node] != node:
        parent[node] = find(parent, parent[node])
    return parent[node]

def union(parent, rank, x, y):
    # Merge the sets containing x and y using union by rank.
    # Returns False if they are already connected (the edge would close a cycle).
    root_x = find(parent, x)
    root_y = find(parent, y)
    
    if root_x == root_y:
        return False
    
    if rank[root_x] < rank[root_y]:
        parent[root_x] = root_y
    elif rank[root_x] > rank[root_y]:
        parent[root_y] = root_x
    else:
        parent[root_y] = root_x
        rank[root_x] += 1
    return True

def kruskal(graph):
    # Collect all edges; in an undirected adjacency list each edge appears
    # twice, but the duplicate is rejected later by union().
    edges = []
    for node in graph:
        for neighbor, weight in graph[node]:
            edges.append((weight, node, neighbor))
    
    edges.sort()
    parent = {node: node for node in graph}
    rank = {node: 0 for node in graph}
    mst_edges = []
    
    for weight, from_node, to_node in edges:
        if union(parent, rank, from_node, to_node):
            mst_edges.append((from_node, to_node, weight))
    
    return mst_edges

# Example
graph = {'A': [('B', 4), ('C', 2)], 'B': [('A', 4), ('C', 1)],
         'C': [('A', 2), ('B', 1), ('D', 3)], 'D': [('C', 3)]}
print(kruskal(graph))
# [('B', 'C', 1), ('A', 'C', 2), ('C', 'D', 3)]

Why not visualize Kruskal's here? While Kruskal's algorithm is equally important and efficient (also O(E log E)), Prim's algorithm is more visually intuitive to demonstrate. Prim's grows the tree step-by-step from a starting point, making it easier to follow the algorithm's progress. Kruskal's, on the other hand, processes edges globally and requires tracking connected components, which is less visually engaging but equally powerful in practice.
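Although the two algorithms return their edges in different orders, they minimize the same quantity. A quick check on the four-node example graph, treating edges as undirected (here the MST is unique because all edge weights are distinct):

```python
def total_weight(mst):
    # The quantity every MST algorithm minimizes.
    return sum(w for _, _, w in mst)

def as_edge_set(mst):
    # Normalize direction: ('C', 'B', 1) and ('B', 'C', 1) are the same undirected edge.
    return {(frozenset((u, v)), w) for u, v, w in mst}

# MST edge lists produced by the two algorithms on the example graph.
prim_mst = [('A', 'C', 2), ('C', 'B', 1), ('C', 'D', 3)]
kruskal_mst = [('B', 'C', 1), ('A', 'C', 2), ('C', 'D', 3)]

print(total_weight(prim_mst) == total_weight(kruskal_mst))  # True
print(as_edge_set(prim_mst) == as_edge_set(kruskal_mst))    # True
```

When a graph has repeated edge weights, the two algorithms may legitimately pick different edges, but the total weight is always identical.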