Minimum Spanning Tree Algorithms

In graph theory, a Minimum Spanning Tree (MST) connects all vertices of a weighted, undirected graph with the lowest possible total edge weight, without forming cycles. MSTs are fundamental in network design, clustering, and many optimization problems.

What Is a Minimum Spanning Tree?

Given a connected, weighted, undirected graph G = (V, E), an MST is a subgraph T = (V, E_T) where E_T ⊆ E, T is a tree (i.e., it has |V|‑1 edges and no cycles), and the sum of its edge weights ∑_{e∈E_T} w(e) is minimal among all possible spanning trees.

Core MST Algorithms

Three classic algorithms compute an MST efficiently. Each follows a distinct strategy and has different performance characteristics.

Kruskal’s Algorithm

Kruskal’s algorithm builds the MST by sorting all edges by weight and adding them one by one, provided they do not create a cycle. It uses a disjoint‑set (union‑find) data structure to detect cycles quickly.

  • Sort all edges in non‑decreasing order of weight.
  • Initialize a forest where each vertex is a separate tree.
  • Iterate through the sorted edges:
      • If the edge connects two different trees, add it to the MST and union the trees.
      • Otherwise, discard the edge (it would form a cycle).
  • Stop when the MST contains |V|‑1 edges.
# Python implementation of Kruskal's algorithm
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    def union(self, x, y):
        xr, yr = self.find(x), self.find(y)
        if xr == yr:
            return False
        if self.rank[xr] < self.rank[yr]:
            self.parent[xr] = yr
        elif self.rank[xr] > self.rank[yr]:
            self.parent[yr] = xr
        else:
            self.parent[yr] = xr
            self.rank[xr] += 1
        return True

def kruskal(edges, n):
    # edges: list of (weight, u, v)
    ds = DisjointSet(n)
    mst = []
    total = 0
    for w, u, v in sorted(edges):
        if ds.union(u, v):
            mst.append((u, v, w))
            total += w
            if len(mst) == n - 1:
                break
    return mst, total
// Java implementation of Kruskal's algorithm
import java.util.*;

class Edge implements Comparable<Edge> {
    int u, v, w;
    Edge(int u, int v, int w) { this.u = u; this.v = v; this.w = w; }
    public int compareTo(Edge o) { return Integer.compare(this.w, o.w); }
}

class DSU {
    int[] parent, rank;
    DSU(int n) { parent = new int[n]; rank = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i; }
    int find(int x) { return parent[x] == x ? x : (parent[x] = find(parent[x])); }
    boolean union(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return false;
        if (rank[ra] < rank[rb]) parent[ra] = rb;
        else if (rank[ra] > rank[rb]) parent[rb] = ra;
        else { parent[rb] = ra; rank[ra]++; }
        return true;
    }
}

public class KruskalMST {
    public static List<Edge> kruskal(List<Edge> edges, int n) {
        Collections.sort(edges);
        DSU dsu = new DSU(n);
        List<Edge> mst = new ArrayList<>();
        for (Edge e : edges) {
            if (dsu.union(e.u, e.v)) {
                mst.add(e);
                if (mst.size() == n - 1) break;
            }
        }
        return mst;
    }
}
// C++ implementation of Kruskal's algorithm
#include <bits/stdc++.h>
using namespace std;

struct Edge {int u, v, w;};

struct DSU {
    vector<int> p, r;
    DSU(int n) : p(n), r(n,0) { iota(p.begin(), p.end(), 0); }
    int find(int x){ return p[x]==x?x:p[x]=find(p[x]); }
    bool unite(int a,int b){
        a=find(a); b=find(b);
        if(a==b) return false;
        if(r[a]<r[b]) swap(a,b);
        p[b]=a; if(r[a]==r[b]) r[a]++;
        return true;
    }
};

vector<Edge> kruskal(vector<Edge>& edges, int n){
    sort(edges.begin(), edges.end(), [](const Edge& a,const Edge& b){return a.w<b.w;});
    DSU dsu(n);
    vector<Edge> mst; int total=0;
    for(const auto& e: edges){
        if(dsu.unite(e.u, e.v)){
            mst.push_back(e);
            total+=e.w;
            if((int)mst.size() == n - 1) break;
        }
    }
    // total weight is stored in 'total' if needed
    return mst;
}
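As a quick sanity check of the approach, the following standalone Python snippet runs Kruskal's algorithm on a small four-vertex graph. `kruskal_demo` is an illustrative name, and a compact union-find is restated inline so the snippet runs on its own, independent of the classes above:

```python
# Quick demo: MST of a 4-vertex graph, using the same (weight, u, v)
# edge convention as the kruskal() function above.

def kruskal_demo(edges, n):
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:               # edge joins two components: keep it
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return mst, total

edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
mst, total = kruskal_demo(edges, 4)
print(mst)    # [(0, 1, 1), (1, 3, 2), (1, 2, 3)]
print(total)  # 6
```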

Prim’s Algorithm

Prim’s algorithm grows the MST from an arbitrary starting vertex by repeatedly adding the cheapest edge that connects the growing tree to a vertex outside the tree. It is often implemented with a min‑heap (priority queue) for efficiency.

  • Pick an arbitrary start vertex s and add it to the MST set.
  • Initialize a priority queue with all edges incident to s (key = edge weight).
  • While the MST set does not contain all vertices:
      • Extract the edge (u, v) with the smallest weight from the queue.
      • If v is not yet in the MST set, add v and the edge to the MST.
      • Insert all edges incident to v whose opposite endpoint is not yet in the MST set.
# Python implementation of Prim's algorithm using heapq
import heapq

def prim(graph, start=0):
    # graph: adjacency list {u: [(v, weight), ...], ...}
    visited = set([start])
    edges = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(edges)
    mst = []
    total = 0
    while edges and len(visited) < len(graph):
        w, u, v = heapq.heappop(edges)
        if v in visited:
            continue
        visited.add(v)
        mst.append((u, v, w))
        total += w
        for to, weight in graph[v]:
            if to not in visited:
                heapq.heappush(edges, (weight, v, to))
    return mst, total
// Java implementation of Prim's algorithm using a priority queue
import java.util.*;

class PrimMST {
    static class Edge {
        int to, weight;
        Edge(int to, int weight) { this.to = to; this.weight = weight; }
    }
    public static List<int[]> prim(List<List<Edge>> adj, int start) {
        int n = adj.size();
        boolean[] inMST = new boolean[n];
        PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt(a -> a[2]));
        List<int[]> mst = new ArrayList<>();
        inMST[start] = true;
        for (Edge e : adj.get(start)) pq.offer(new int[]{start, e.to, e.weight});
        while (!pq.isEmpty() && mst.size() < n-1) {
            int[] cur = pq.poll();
            int u = cur[0], v = cur[1], w = cur[2];
            if (inMST[v]) continue;
            inMST[v] = true;
            mst.add(cur);
            for (Edge e : adj.get(v)) if (!inMST[e.to]) pq.offer(new int[]{v, e.to, e.weight});
        }
        return mst;
    }
}

Borůvka’s Algorithm

Borůvka’s algorithm (also known as Sollin’s algorithm) repeatedly connects each component to its cheapest outgoing edge. It works in O(E log V) time and is especially useful in parallel environments.

  • Initialize each vertex as a separate component.
  • Repeat until only one component remains:
      • For every component, find the minimum‑weight edge that connects it to a different component.
      • Add all these selected edges to the MST (with consistent tie‑breaking, they cannot create cycles).
      • Merge the components joined by the newly added edges.
📝 Note: Borůvka’s algorithm is naturally parallelizable because the cheapest edge for each component can be found independently.
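The steps above can be sketched in Python. `boruvka` is an illustrative name; edges follow the same (weight, u, v) convention as the Kruskal example, and the component bookkeeping reuses a simple union-find:

```python
# Minimal sketch of Borůvka's algorithm: each round, every component
# selects its cheapest outgoing edge, then the joined components merge.

def boruvka(edges, n):
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst, total, components = [], 0, n
    while components > 1:
        # cheapest[c] = lightest edge leaving component c this round
        cheapest = {}
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            if ru not in cheapest or w < cheapest[ru][0]:
                cheapest[ru] = (w, u, v)
            if rv not in cheapest or w < cheapest[rv][0]:
                cheapest[rv] = (w, u, v)
        if not cheapest:          # remaining components are disconnected
            break
        for w, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:          # the same edge may be selected twice
                parent[ru] = rv
                mst.append((u, v, w))
                total += w
                components -= 1
    return mst, total
```

Note that the `find` check before each merge makes the sketch safe even when edge weights are tied, because an edge is added only if it still joins two distinct components.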

Algorithm Comparison

| Algorithm | Time Complexity | Space Complexity | Typical Use‑Case |
|---|---|---|---|
| Kruskal’s | O(E log E) ≈ O(E log V) | O(V) | Sparse graphs; easy to implement with Union‑Find. |
| Prim’s (binary heap) | O(E log V) | O(V) | General‑purpose choice; works well when an adjacency list is available. |
| Prim’s (Fibonacci heap) | O(E + V log V) | O(V) | Theoretical optimum for dense graphs. |
| Borůvka’s | O(E log V) | O(V) | Parallel processing, or when the graph has many components early. |
⚠ Warning: All three algorithms assume the graph is connected and undirected. On a disconnected graph, Kruskal’s and Borůvka’s algorithms produce a minimum spanning forest (one tree per component), while Prim’s reaches only the component containing the start vertex.
💡 Tip: MST algorithms handle negative edge weights correctly, because their correctness depends only on the relative order of edge weights (the cut property), not on their sign; just ensure the data type used for the total weight can represent negative sums.

Practical Implementation Tips

  1. Prefer adjacency lists over adjacency matrices for large, sparse graphs to save memory.
  2. When using Kruskal’s algorithm, path compression and union by rank in the disjoint‑set dramatically improve performance.
  3. For Prim’s algorithm on dense graphs, a simple array (O(V²) implementation) may outperform a heap due to lower constant factors.
  4. Validate input: check for duplicate edges, self‑loops, and ensure the graph is indeed undirected.
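Tip 3 above can be sketched as follows. `prim_dense` is an illustrative name; the sketch assumes an adjacency matrix with `math.inf` marking absent edges, and replaces the heap's extract‑min with a plain O(V) array scan:

```python
# Prim's algorithm in O(V^2): no heap, just a linear scan per step.
# Well suited to dense graphs, where E is close to V^2.
import math

def prim_dense(matrix, start=0):
    n = len(matrix)
    in_mst = [False] * n
    best = [math.inf] * n     # cheapest known edge weight into the tree
    parent = [-1] * n         # tree edge: (parent[v], v)
    best[start] = 0
    total = 0
    for _ in range(n):
        # O(V) linear scan replaces the heap's extract-min
        u = min((v for v in range(n) if not in_mst[v]),
                key=lambda v: best[v])
        if best[u] == math.inf:
            break             # remaining vertices are unreachable
        in_mst[u] = True
        total += best[u]
        for v in range(n):
            if not in_mst[v] and matrix[u][v] < best[v]:
                best[v] = matrix[u][v]
                parent[v] = u
    return parent, total
```

Each of the V iterations does O(V) work in the scan and O(V) work in the update loop, giving O(V²) overall with very small constant factors.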
📘 Summary: Minimum Spanning Tree algorithms provide efficient ways to connect all vertices of a weighted graph with minimal total cost. Kruskal’s algorithm excels with sparse graphs, Prim’s algorithm is versatile and favours dense graphs, while Borůvka’s algorithm shines in parallel environments. Understanding their complexities, data‑structure requirements, and practical trade‑offs enables developers to select the optimal solution for real‑world network design problems.

Q: Can MST algorithms be used on directed graphs?
A: Standard MST algorithms assume undirected graphs. For directed graphs, the analogous problem is the Minimum Arborescence (Edmonds’ algorithm).


Q: What happens if the graph is not connected?
A: The algorithms will produce a Minimum Spanning Forest, i.e., an MST for each connected component.


Q: Is there a linear‑time MST algorithm?
A: A randomized algorithm with expected linear time exists (Karger–Klein–Tarjan), and Chazelle’s deterministic algorithm runs in O(E α(E, V)) time, where α is the inverse Ackermann function. Deterministic linear‑time algorithms are known for special families such as planar graphs; whether one exists for general graphs in the comparison model remains open.


Q. Which data structure is essential for achieving O(E log V) time in Kruskal’s algorithm?
  • Binary Search Tree
  • Disjoint Set (Union‑Find)
  • Fibonacci Heap
  • Stack

Answer: Disjoint Set (Union‑Find)
Union‑Find with path compression and union by rank provides near‑constant amortized time for find/union operations; combined with the O(E log E) = O(E log V) edge sort, this yields the overall O(E log V) bound.

Q. In Prim’s algorithm, if a min‑heap is replaced by an unsorted array, what is the new time complexity?
  • O(E log V)
  • O(V²)
  • O(E + V log V)
  • O(E²)

Answer: O(V²)
Extract‑min becomes O(V) and is performed V times, leading to O(V²) overall for dense graphs.
