Minimum Spanning Tree

Imagine you need to lay cables to connect five office buildings. You have many possible routes between them, each with a different cost. You want every building to be reachable from every other, but you want to spend as little as possible — and you certainly do not want to lay the same route twice. This is exactly the problem a Minimum Spanning Tree (MST) solves.

An MST connects all vertices in a weighted, undirected graph using a subset of edges whose total weight is as small as possible. MSTs appear throughout software engineering — from network design and clustering algorithms to approximation algorithms for computationally hard problems.

What Is a Spanning Tree?#

A spanning tree of a connected graph is a subgraph that:

Includes all V vertices
Is connected and acyclic
Has exactly V−1 edges

These three properties are interlinked: a subgraph that includes all V vertices and is both connected and acyclic must have exactly V−1 edges. Add one more edge and you create a cycle; remove one edge and the graph becomes disconnected. Before finding the minimum spanning tree, it helps to see that a graph can have many different spanning trees.

Original Graph

Rendering diagram...

Spanning Tree 1

Rendering diagram...

Spanning Tree 2

Rendering diagram...

Both spanning trees above use all four vertices and exactly three edges (V−1 = 3). Since multiple spanning trees exist for the same graph, the challenge is identifying the one with the lowest total weight.

What Is a Minimum Spanning Tree?#

The Minimum Spanning Tree is the spanning tree with the smallest total edge weight. In the example above, Spanning Tree 1 has total weight 7 (4+2+1), while Spanning Tree 2 has total weight 6 (2+3+1). Spanning Tree 2 is the MST.

Original Graph

Rendering diagram...

Minimum Spanning Tree

Rendering diagram...

Real-World Applications#

Application	Vertices	Edges	Goal
Network Cables	Buildings	Possible cable routes + costs	Minimize cable length
Road Networks	Cities	Possible roads + construction costs	Connect all with min cost
Electrical Grid	Towns	Power line routes + costs	Minimize infrastructure
Clustering	Data points	Distances between points	Group similar items
Image Segmentation	Pixels	Pixel similarities	Segment image regions

MST Properties#

These properties are not just theoretical — they directly explain why Prim's and Kruskal's algorithms produce correct results, and they give you a mental framework for reasoning about whether a given spanning tree is truly minimal.

Property	Description
V−1 Edges	A spanning tree of V vertices always has exactly V−1 edges — enough to keep the graph connected without forming any cycle
Unique if Weights Distinct	When all edge weights are different, only one MST exists; equal weights can produce multiple MSTs with the same total cost
Cut Property	For any partition of vertices into two non-empty sets, the minimum-weight edge crossing that partition belongs to at least one MST (and to every MST if that edge is unique)
Cycle Property	For any cycle in the graph, the maximum-weight edge in that cycle cannot belong to any MST — a cheaper tree always exists by removing it
No Cycles	Adding any non-tree edge to a spanning tree creates exactly one cycle; removing the heaviest edge in that cycle restores a spanning tree

The V−1 Edges Property#

A spanning tree must connect V vertices without any cycles. A tree with V vertices has exactly V−1 edges — just enough to maintain connectivity without forming any cycle. If you have fewer edges, the graph is disconnected; if you have more, a cycle exists somewhere. This is why both MST algorithms stop as soon as V−1 edges have been added to the result.

The Cut Property#

A cut partitions all vertices into two disjoint sets S and V−S. Think of it as drawing a line through the graph — any edge that crosses the line (one endpoint in S, the other in V−S) is called a crossing edge. The cut property states:

If the minimum-weight crossing edge of a cut is unique (strictly lighter than all other crossing edges), it appears in every MST. If ties exist, at least one of the tied minimum-weight crossing edges appears in every MST.

Why? Suppose no minimum-weight crossing edge is in some MST T. Pick any such edge e and add it to T — this creates exactly one cycle. That cycle must contain at least one other crossing edge e' (to return from V−S back to S). Since e is a minimum-weight crossing edge, weight(e) ≤ weight(e'). If weight(e) < weight(e'), swapping e' for e produces a lighter spanning tree, contradicting T being an MST. If weight(e) = weight(e'), the swap produces another MST that includes e — contradicting the assumption that no minimum-weight crossing edge is in any MST.

This property is the foundation of both algorithms: Prim's always picks the lightest edge crossing the boundary between the current tree and the rest of the graph, and Kruskal's always picks the globally lightest edge between two different components. Both strategies are correct because the cut property guarantees these choices lead to the MST.

The Cycle Property#

For any cycle in the graph, the maximum-weight edge in that cycle does not belong to any MST.

Why? If the heaviest edge e in a cycle were in the MST, you could remove it — the two subtrees it was connecting remain reachable via the other edges in that cycle. Reconnecting them with any of those cheaper edges produces a spanning tree with strictly lower total weight. Therefore e cannot be part of any MST.

Kruskal's algorithm implicitly applies this property: it skips an edge whenever adding it would close a cycle. The skipped edge is always the heaviest in that cycle, so discarding it is always safe.

Two Classic Algorithms#

Both Prim's and Kruskal's are greedy algorithms — they make the locally optimal choice at each step. Thanks to the cut and cycle properties above, these local choices are guaranteed to produce the globally optimal MST.

Algorithm	Approach	Time Complexity	Best For
Prim's	Grow tree from single vertex	O((V+E) log V)	Dense graphs
Kruskal's	Add edges by weight, avoid cycles	O(E log E)	Sparse graphs

Prim's Algorithm#

Prim's algorithm grows the MST incrementally from a single starting vertex. At each step it selects the minimum-weight edge that connects the current tree to an unvisited vertex, then absorbs that vertex into the tree. The process continues until all vertices have been included.

The key data structure is a min-heap (priority queue): it efficiently tracks all candidate edges — those with one endpoint inside the current tree and the other outside — and always surfaces the cheapest one next. This is why the algorithm is sometimes described as Dijkstra's algorithm adapted for MSTs.

Prim's Algorithm Steps

Rendering diagram...

Example: Starting from A

Rendering diagram...

Prim's Algorithm

Build MST by growing from a single vertex using priority queue

Time:O((V + E) log V)Space:O(V + E)Best:O((V + E) log V)Worst:O((V + E) log V)

Kruskal's Algorithm#

Kruskal's algorithm takes a global view: it sorts all edges by weight and considers them in ascending order, accepting an edge only when adding it would not create a cycle — that is, when its two endpoints are not already connected.

To check connectivity efficiently, Kruskal's relies on a Union-Find (also called Disjoint Set Union) data structure. Union-Find maintains a collection of disjoint sets — one per connected component — and supports two operations: find (which component does a vertex belong to?) and union (merge two components into one). With path compression and union by rank, both operations run in nearly O(1) amortized time, making cycle detection virtually free.

Kruskal's Algorithm Steps

Rendering diagram...

Example

Rendering diagram...

Kruskal's Algorithm

Build MST by adding edges in weight order, using Union-Find to avoid cycles

Time:O(E log E)Space:O(V)Best:O(E log E)Worst:O(E log E)

Prim's vs Kruskal's#

Aspect	Prim's	Kruskal's
Approach	Grow from single vertex	Process edges globally
Data Structure	Priority queue	Union-Find + sorted edges
Time Complexity	O((V+E) log V)	O(E log E)
Better For	Dense graphs (E ≈ V²)	Sparse graphs (E ≈ V)
Starting Point	Needs a starting vertex	No starting vertex needed
Edge Processing	Considers edges from MST	Considers all edges globally
Parallelization	Harder to parallelize	Easier (independent edge checks)

Dense vs Sparse — a quick rule of thumb:

A dense graph has many edges relative to its vertex count (E ≈ V²). Think of a network where nearly every node is directly connected to every other node.
A sparse graph has few edges relative to its vertex count (E much less than V²). Think of a road map where most cities connect to only a handful of neighbors.

For dense graphs, Prim's avoids the cost of sorting all edges upfront, making adjacency-list traversal more efficient. For sparse graphs — or when you already have a flat edge list — Kruskal's simpler structure often wins.

Decision Flowchart#

Rendering diagram...

When the input data is already in edge-list format (e.g., (u, v, weight) tuples), Kruskal's is usually the natural choice — sorting is straightforward and no adjacency structure needs to be built.

Applications#

MST Applications

Practical uses of Minimum Spanning Trees

Given cities and connection costs, find the minimum total cost to connect all cities

Reviewing AI-Generated Code#

MST implementations have a small set of well-known failure patterns. Recognizing them lets you quickly validate — or correct — AI-generated solutions before they cause bugs in production.

What to Verify#

Issue	What to Check	Common AI Mistake
Edge Count	MST has exactly V-1 edges	Not checking if graph is connected
Cycle Prevention	Kruskal's needs Union-Find, Prim's uses visited set	Adding edge that creates cycle
Graph Type	MST is for undirected weighted graphs	Applying to directed graph
Weight Handling	Compare weights correctly (min, not max)	Using max-heap instead of min-heap in Prim's
Union-Find Ops	Check find() before union()	Union without checking same component

Common AI Pitfalls#

MST Algorithm Bugs

Common mistakes AI makes in MST implementations

Correct Prim's with proper vertex tracking

Questions to Ask AI#

When AI generates MST code, probe it with targeted questions:

How does this guarantee the result is a valid spanning tree with exactly V−1 edges?
What does this return if the graph is disconnected?
Is this Prim's or Kruskal's, and what drove that choice?
What is the time complexity, and which step dominates it?

Summary#

Concept	Key Takeaway
Spanning Tree	Connects all V vertices with V-1 edges, no cycles
MST	Spanning tree with minimum total edge weight
Prim's	Grow from vertex, use priority queue, O((V+E) log V)
Kruskal's	Process edges by weight, use Union-Find, O(E log E)
Cut Property	The lightest edge crossing any cut is in the MST
Union-Find	Efficiently track connected components for Kruskal's
Applications	Network design, clustering, maze generation, approximation algorithms

Both algorithms are greedy and provably correct — their correctness follows directly from the cut and cycle properties covered in this chapter. In practice, the choice is usually clear from the input format and graph structure: reach for Kruskal's when you have an edge list, and Prim's when you have a dense adjacency list. When using AI tools to generate either algorithm, verify that the result has exactly V−1 edges, that no cycles are present, and that the implementation uses a min-heap (not max-heap) for Prim's or a proper Union-Find for Kruskal's.

PreviousTopological Sort

NextMaximum Flow