Minimum Spanning Tree

Imagine you need to lay cables to connect five office buildings. You have many possible routes between them, each with a different cost. You want every building to be reachable from every other, but you want to spend as little as possible — and you certainly do not want to lay the same route twice. This is exactly the problem a Minimum Spanning Tree (MST) solves.

An MST connects all vertices in a weighted, undirected graph using a subset of edges whose total weight is as small as possible. MSTs appear throughout software engineering — from network design and clustering algorithms to approximation algorithms for computationally hard problems.

What Is a Spanning Tree?#

A spanning tree of a connected graph is a subgraph that:

  • Includes all V vertices
  • Is connected and acyclic
  • Has exactly V−1 edges

These three properties are interlinked: a subgraph that includes all V vertices and is both connected and acyclic must have exactly V−1 edges. Add one more edge and you create a cycle; remove one edge and the graph becomes disconnected. Before finding the minimum spanning tree, it helps to see that a graph can have many different spanning trees.

Original Graph
Rendering diagram...
Spanning Tree 1
Rendering diagram...
Spanning Tree 2
Rendering diagram...

Both spanning trees above use all four vertices and exactly three edges (V−1 = 3). Since multiple spanning trees exist for the same graph, the challenge is identifying the one with the lowest total weight.

What Is a Minimum Spanning Tree?#

The Minimum Spanning Tree is the spanning tree with the smallest total edge weight. In the example above, Spanning Tree 1 has total weight 7 (4+2+1), while Spanning Tree 2 has total weight 6 (2+3+1). Spanning Tree 2 is the MST.

Original Graph
Rendering diagram...
Minimum Spanning Tree
Rendering diagram...

Real-World Applications#

ApplicationVerticesEdgesGoal
Network CablesBuildingsPossible cable routes + costsMinimize cable length
Road NetworksCitiesPossible roads + construction costsConnect all with min cost
Electrical GridTownsPower line routes + costsMinimize infrastructure
ClusteringData pointsDistances between pointsGroup similar items
Image SegmentationPixelsPixel similaritiesSegment image regions

MST Properties#

These properties are not just theoretical — they directly explain why Prim's and Kruskal's algorithms produce correct results, and they give you a mental framework for reasoning about whether a given spanning tree is truly minimal.

PropertyDescription
V−1 EdgesA spanning tree of V vertices always has exactly V−1 edges — enough to keep the graph connected without forming any cycle
Unique if Weights DistinctWhen all edge weights are different, only one MST exists; equal weights can produce multiple MSTs with the same total cost
Cut PropertyFor any partition of vertices into two non-empty sets, the minimum-weight edge crossing that partition belongs to at least one MST (and to every MST if that edge is unique)
Cycle PropertyFor any cycle in the graph, the maximum-weight edge in that cycle cannot belong to any MST — a cheaper tree always exists by removing it
No CyclesAdding any non-tree edge to a spanning tree creates exactly one cycle; removing the heaviest edge in that cycle restores a spanning tree

The V−1 Edges Property#

A spanning tree must connect V vertices without any cycles. A tree with V vertices has exactly V−1 edges — just enough to maintain connectivity without forming any cycle. If you have fewer edges, the graph is disconnected; if you have more, a cycle exists somewhere. This is why both MST algorithms stop as soon as V−1 edges have been added to the result.

The Cut Property#

A cut partitions all vertices into two disjoint sets S and V−S. Think of it as drawing a line through the graph — any edge that crosses the line (one endpoint in S, the other in V−S) is called a crossing edge. The cut property states:

If the minimum-weight crossing edge of a cut is unique (strictly lighter than all other crossing edges), it appears in every MST. If ties exist, at least one of the tied minimum-weight crossing edges appears in every MST.

Why? Suppose no minimum-weight crossing edge is in some MST T. Pick any such edge e and add it to T — this creates exactly one cycle. That cycle must contain at least one other crossing edge e' (to return from V−S back to S). Since e is a minimum-weight crossing edge, weight(e) ≤ weight(e'). If weight(e) < weight(e'), swapping e' for e produces a lighter spanning tree, contradicting T being an MST. If weight(e) = weight(e'), the swap produces another MST that includes e — contradicting the assumption that no minimum-weight crossing edge is in any MST.

This property is the foundation of both algorithms: Prim's always picks the lightest edge crossing the boundary between the current tree and the rest of the graph, and Kruskal's always picks the globally lightest edge between two different components. Both strategies are correct because the cut property guarantees these choices lead to the MST.

The Cycle Property#

For any cycle in the graph, the maximum-weight edge in that cycle does not belong to any MST.

Why? If the heaviest edge e in a cycle were in the MST, you could remove it — the two subtrees it was connecting remain reachable via the other edges in that cycle. Reconnecting them with any of those cheaper edges produces a spanning tree with strictly lower total weight. Therefore e cannot be part of any MST.

Kruskal's algorithm implicitly applies this property: it skips an edge whenever adding it would close a cycle. The skipped edge is always the heaviest in that cycle, so discarding it is always safe.

Two Classic Algorithms#

Both Prim's and Kruskal's are greedy algorithms — they make the locally optimal choice at each step. Thanks to the cut and cycle properties above, these local choices are guaranteed to produce the globally optimal MST.

AlgorithmApproachTime ComplexityBest For
Prim'sGrow tree from single vertexO((V+E) log V)Dense graphs
Kruskal'sAdd edges by weight, avoid cyclesO(E log E)Sparse graphs

Prim's Algorithm#

Prim's algorithm grows the MST incrementally from a single starting vertex. At each step it selects the minimum-weight edge that connects the current tree to an unvisited vertex, then absorbs that vertex into the tree. The process continues until all vertices have been included.

The key data structure is a min-heap (priority queue): it efficiently tracks all candidate edges — those with one endpoint inside the current tree and the other outside — and always surfaces the cheapest one next. This is why the algorithm is sometimes described as Dijkstra's algorithm adapted for MSTs.

Prim's Algorithm Steps
Rendering diagram...
Example: Starting from A
Rendering diagram...

Prim's Algorithm

Build MST by growing from a single vertex using priority queue

Time:O((V + E) log V)Space:O(V + E)Best:O((V + E) log V)Worst:O((V + E) log V)

Kruskal's Algorithm#

Kruskal's algorithm takes a global view: it sorts all edges by weight and considers them in ascending order, accepting an edge only when adding it would not create a cycle — that is, when its two endpoints are not already connected.

To check connectivity efficiently, Kruskal's relies on a Union-Find (also called Disjoint Set Union) data structure. Union-Find maintains a collection of disjoint sets — one per connected component — and supports two operations: find (which component does a vertex belong to?) and union (merge two components into one). With path compression and union by rank, both operations run in nearly O(1) amortized time, making cycle detection virtually free.

Kruskal's Algorithm Steps
Rendering diagram...
Example
Rendering diagram...

Kruskal's Algorithm

Build MST by adding edges in weight order, using Union-Find to avoid cycles

Time:O(E log E)Space:O(V)Best:O(E log E)Worst:O(E log E)

Prim's vs Kruskal's#

AspectPrim'sKruskal's
ApproachGrow from single vertexProcess edges globally
Data StructurePriority queueUnion-Find + sorted edges
Time ComplexityO((V+E) log V)O(E log E)
Better ForDense graphs (E ≈ V²)Sparse graphs (E ≈ V)
Starting PointNeeds a starting vertexNo starting vertex needed
Edge ProcessingConsiders edges from MSTConsiders all edges globally
ParallelizationHarder to parallelizeEasier (independent edge checks)

Dense vs Sparse — a quick rule of thumb:

  • A dense graph has many edges relative to its vertex count (E ≈ V²). Think of a network where nearly every node is directly connected to every other node.
  • A sparse graph has few edges relative to its vertex count (E much less than V²). Think of a road map where most cities connect to only a handful of neighbors.

For dense graphs, Prim's avoids the cost of sorting all edges upfront, making adjacency-list traversal more efficient. For sparse graphs — or when you already have a flat edge list — Kruskal's simpler structure often wins.

Decision Flowchart#

Rendering diagram...

When the input data is already in edge-list format (e.g., (u, v, weight) tuples), Kruskal's is usually the natural choice — sorting is straightforward and no adjacency structure needs to be built.

Applications#

MST Applications

Practical uses of Minimum Spanning Trees

Given cities and connection costs, find the minimum total cost to connect all cities

Reviewing AI-Generated Code#

MST implementations have a small set of well-known failure patterns. Recognizing them lets you quickly validate — or correct — AI-generated solutions before they cause bugs in production.

What to Verify#

IssueWhat to CheckCommon AI Mistake
Edge CountMST has exactly V-1 edgesNot checking if graph is connected
Cycle PreventionKruskal's needs Union-Find, Prim's uses visited setAdding edge that creates cycle
Graph TypeMST is for undirected weighted graphsApplying to directed graph
Weight HandlingCompare weights correctly (min, not max)Using max-heap instead of min-heap in Prim's
Union-Find OpsCheck find() before union()Union without checking same component

Common AI Pitfalls#

MST Algorithm Bugs

Common mistakes AI makes in MST implementations

Correct Prim's with proper vertex tracking

Questions to Ask AI#

When AI generates MST code, probe it with targeted questions:

  • How does this guarantee the result is a valid spanning tree with exactly V−1 edges?
  • What does this return if the graph is disconnected?
  • Is this Prim's or Kruskal's, and what drove that choice?
  • What is the time complexity, and which step dominates it?

Summary#

ConceptKey Takeaway
Spanning TreeConnects all V vertices with V-1 edges, no cycles
MSTSpanning tree with minimum total edge weight
Prim'sGrow from vertex, use priority queue, O((V+E) log V)
Kruskal'sProcess edges by weight, use Union-Find, O(E log E)
Cut PropertyThe lightest edge crossing any cut is in the MST
Union-FindEfficiently track connected components for Kruskal's
ApplicationsNetwork design, clustering, maze generation, approximation algorithms

Both algorithms are greedy and provably correct — their correctness follows directly from the cut and cycle properties covered in this chapter. In practice, the choice is usually clear from the input format and graph structure: reach for Kruskal's when you have an edge list, and Prim's when you have a dense adjacency list. When using AI tools to generate either algorithm, verify that the result has exactly V−1 edges, that no cycles are present, and that the implementation uses a min-heap (not max-heap) for Prim's or a proper Union-Find for Kruskal's.