Greedy Algorithms

Greedy algorithms form a fundamental paradigm in algorithm design, where a locally optimal choice is made at each step with the hope of reaching a globally optimal solution. This tutorial walks you through the theory, design principles, classic problem instances, implementation details in multiple programming languages, and practical considerations for applying greedy methods in real‑world software engineering.

1. Introduction to Greedy Algorithms

A greedy algorithm builds a solution piece by piece, always choosing the next piece that offers the most immediate benefit. Unlike exhaustive search or dynamic programming, it does not reconsider earlier choices. When the problem exhibits both the optimal‑substructure and greedy‑choice properties, this simple strategy yields an optimal solution.

Key Properties

  • Optimal substructure: an optimal solution can be constructed from optimal solutions of its subproblems.
  • Greedy‑choice property: a global optimum can be reached by making the locally optimal (greedy) choice at each step.
📝 Note: Not every problem with optimal‑substructure can be solved greedily. Verify the greedy‑choice property before applying the approach.
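A classic illustration is coin change with denominations {1, 3, 4}: the problem has optimal substructure, yet the greedy rule of always taking the largest coin fails. A minimal sketch (the denominations and amount are illustrative):

```python
def greedy_coin_count(amount, denominations):
    """Greedily take the largest coin that fits; not guaranteed optimal."""
    coins = 0
    for d in sorted(denominations, reverse=True):
        coins += amount // d   # take as many of this coin as possible
        amount %= d
    return coins

# For amount 6, greedy takes 4 + 1 + 1 = 3 coins,
# but the optimal answer is 3 + 3 = 2 coins.
print(greedy_coin_count(6, [1, 3, 4]))  # 3 (optimal is 2)
```

The greedy-choice property holds for some denomination systems (e.g., standard US coins) but not others, which is exactly why a proof is required case by case.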

2. When to Use a Greedy Approach

Greedy algorithms excel when:

  • The problem can be expressed as a sequence of independent decisions.
  • Each decision can be made based solely on information available at that moment.
  • A proof exists that the greedy choice leads to an optimal solution.
⚠ Warning: Applying a greedy strategy without a formal proof often results in sub‑optimal solutions. Always validate the algorithm against counter‑examples.

3. Classic Greedy Problems

3.1 Activity Selection (Interval Scheduling)

Given a set of activities with start and finish times, select the maximum number of non‑overlapping activities.

  • Sort activities by increasing finish time.
  • Iteratively pick the next activity whose start time is at or after the finish time of the last selected activity.

3.2 Fractional Knapsack

Items can be broken into fractions. Maximize total value within a weight capacity.

  • Compute value‑to‑weight ratio for each item.
  • Select items in descending order of the ratio, taking whole items until the capacity is reached, then take a fraction of the next item.

3.3 Huffman Coding (Optimal Prefix Codes)

Construct a binary tree to encode symbols with minimal weighted path length.

  • Insert all symbols into a min‑heap keyed by frequency.
  • Repeatedly extract two lowest‑frequency nodes, merge them, and insert the merged node back.

3.4 Dijkstra’s Shortest Path (Non‑negative Edge Weights)

Find the shortest path from a source vertex to all other vertices in a weighted graph with non‑negative edges.

  • Initialize distances, use a priority queue to always expand the vertex with the smallest tentative distance.
  • Relax adjacent edges and update the queue.
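The two steps above can be sketched in Python with heapq as the binary-heap priority queue. The graph representation is an assumption for this sketch: an adjacency dict mapping each vertex to a list of (neighbor, weight) pairs.

```python
import heapq

def dijkstra(graph, source):
    """graph: dict mapping vertex -> list of (neighbor, weight) pairs.
    Returns shortest distances from source; assumes non-negative weights."""
    dist = {source: 0}
    pq = [(0, source)]  # (tentative distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue  # stale queue entry; u was settled with a shorter path
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd  # relax edge (u, v)
                heapq.heappush(pq, (nd, v))
    return dist

graph = {'a': [('b', 4), ('c', 1)], 'c': [('b', 2)], 'b': []}
print(dijkstra(graph, 'a'))  # {'a': 0, 'b': 3, 'c': 1}
```

Since Python's heapq lacks a decrease-key operation, this sketch pushes duplicate entries and skips stale ones on pop, a common and equally correct variant.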

4. Designing a Greedy Solution

  1. Understand the problem and define the decision space.
  2. Identify a potential greedy choice (e.g., highest ratio, earliest finish).
  3. Prove the greedy‑choice property (exchange argument, cut‑and‑paste).
  4. Show that the problem exhibits optimal‑substructure.
  5. Implement the algorithm efficiently (often O(n log n) due to sorting or priority queues).
💡 Tip: Start with a brute‑force solution to understand the structure, then look for patterns that suggest a greedy rule.

5. Implementation Examples

5.1 Activity Selection in Python

def activity_selection(activities):
    # activities: list of (start, finish) tuples
    # sort by finish time
    sorted_acts = sorted(activities, key=lambda x: x[1])
    selected = []
    last_finish = -float('inf')
    for start, finish in sorted_acts:
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected

5.2 Fractional Knapsack in Java

import java.util.*;

class Item {
    double weight, value;
    Item(double w, double v) { weight = w; value = v; }
    double ratio() { return value / weight; }
}

public class FractionalKnapsack {
    public static double maxValue(List<Item> items, double capacity) {
        items.sort((a, b) -> Double.compare(b.ratio(), a.ratio()));
        double total = 0.0;
        for (Item it : items) {
            if (capacity == 0) break;
            if (it.weight <= capacity) {
                total += it.value;
                capacity -= it.weight;
            } else {
                total += it.ratio() * capacity;
                capacity = 0;
            }
        }
        return total;
    }
}

5.3 Huffman Coding in C++

#include <iostream>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>

struct Node {
    char ch; // character, '\0' for internal nodes
    int freq;
    Node *left, *right;
    Node(char c, int f) : ch(c), freq(f), left(nullptr), right(nullptr) {}
    Node(Node* l, Node* r) : ch('\0'), freq(l->freq + r->freq), left(l), right(r) {}
};

struct cmp {
    bool operator()(Node* a, Node* b) { return a->freq > b->freq; }
};

void buildCodes(Node* root, const std::string& prefix, std::unordered_map<char,std::string>& codes) {
    if (!root) return;
    if (root->ch != '\0') codes[root->ch] = prefix;
    buildCodes(root->left, prefix + "0", codes);
    buildCodes(root->right, prefix + "1", codes);
}

int main() {
    std::unordered_map<char,int> freq = {{'a',5},{'b',9},{'c',12},{'d',13},{'e',16},{'f',45}};
    std::priority_queue<Node*, std::vector<Node*>, cmp> pq;
    for (auto& p : freq) pq.push(new Node(p.first, p.second));
    while (pq.size() > 1) {
        Node* left = pq.top(); pq.pop();
        Node* right = pq.top(); pq.pop();
        pq.push(new Node(left, right));
    }
    Node* root = pq.top();
    std::unordered_map<char,std::string> codes;
    buildCodes(root, "", codes);
    for (auto& p : codes) std::cout << p.first << ": " << p.second << '\n';
    return 0;
}

6. Complexity Analysis

Problem                  | Typical Greedy Solution     | Time Complexity  | Space Complexity
Activity Selection       | Sort + linear scan          | O(n log n)       | O(1)
Fractional Knapsack      | Sort by ratio + scan        | O(n log n)       | O(1)
Huffman Coding           | Min‑heap merging            | O(n log n)       | O(n)
Dijkstra (binary heap)   | Priority queue relaxation   | O((V+E) log V)   | O(V)
📝 Note: The dominant term in most greedy algorithms is the sorting or heap operation, giving an O(n log n) baseline.

7. Common Pitfalls and How to Avoid Them

  • Assuming a greedy rule works without proof – always attempt a counter‑example first.
  • Neglecting edge cases such as zero‑weight items in knapsack or duplicate finish times in activity selection.
  • Using an unstable sort when the tie‑breaking rule matters (e.g., equal ratios).
💡 Tip: When in doubt, implement a brute‑force verifier for small inputs and compare results with your greedy solution.
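As an illustration of the tip above, here is a sketch of such a verifier for activity selection, comparing the greedy count against an exhaustive subset search on small random inputs (the function names and input ranges are illustrative):

```python
import random
from itertools import combinations

def greedy_count(activities):
    """Greedy: sort by finish time, count compatible activities."""
    last_finish, count = float('-inf'), 0
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:
            count, last_finish = count + 1, finish
    return count

def brute_force_count(activities):
    """Try every subset; keep the largest pairwise-compatible one."""
    best = 0
    for r in range(len(activities) + 1):
        for subset in combinations(activities, r):
            ordered = sorted(subset, key=lambda a: a[1])
            if all(ordered[i][1] <= ordered[i + 1][0]
                   for i in range(len(ordered) - 1)):
                best = max(best, r)
    return best

random.seed(0)
for _ in range(100):
    acts = [(s := random.randint(0, 20), s + random.randint(1, 5))
            for _ in range(random.randint(1, 7))]
    assert greedy_count(acts) == brute_force_count(acts)
print("greedy matches brute force on all random trials")
```

The brute-force check is exponential, so it only runs on tiny instances, but a hundred random trials like these catch most flawed greedy rules quickly.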

8. Summary

📘 Summary: Greedy algorithms provide elegant, often O(n log n) solutions for a wide class of optimization problems. Their success hinges on two mathematical properties – optimal substructure and the greedy‑choice property – which must be proven for each problem. By mastering classic examples, design patterns, and implementation nuances across languages, software engineers can quickly devise efficient solutions that scale to production workloads.

9. Frequently Asked Questions (FAQ)

Q: Can a greedy algorithm be combined with dynamic programming?
A: Yes. Hybrid approaches exist, such as using greedy preprocessing to reduce problem size before applying DP, or employing DP to verify the greedy‑choice property.


Q: Why does Dijkstra’s algorithm fail with negative edge weights?
A: Greedy selection of the smallest tentative distance assumes that once a vertex is settled, its distance is final. Negative edges can later produce a shorter path, violating this assumption.
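This failure can be demonstrated concretely. The sketch below (graph and vertex names are illustrative) uses the textbook variant with an explicit settled set: it reports dist[b] = 2 and dist[c] = 3, while the true shortest paths through the negative edge a → b cost 1 and 2.

```python
import heapq

def dijkstra_settled(graph, source):
    """Textbook Dijkstra: once a vertex is settled, its distance is final."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    settled = set()
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in settled:
            continue
        settled.add(u)  # greedy commitment: u's distance is now "final"
        for v, w in graph[u]:
            if v not in settled and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

# The negative edge a -> b invalidates the "settled means final" assumption:
# b is settled at distance 2 before the cheaper path s -> a -> b (cost 1) is seen.
graph = {
    's': [('b', 2), ('a', 5)],
    'a': [('b', -4)],
    'b': [('c', 1)],
    'c': [],
}
dist = dijkstra_settled(graph, 's')
print(dist['b'], dist['c'])  # 2 3 (true shortest distances are 1 and 2)
```

Bellman–Ford avoids this trap by relaxing every edge repeatedly instead of committing greedily, at the cost of O(VE) time.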


Q: Is the fractional knapsack problem NP‑hard?
A: No. The fractional version is solvable in polynomial time via a greedy strategy. The 0/1 knapsack (no fractions) is NP‑hard.


10. Quick Quiz

Q. In the activity selection problem, why do we sort by finish time rather than start time?
  • Sorting by start time maximizes the number of activities
  • Finish time sorting guarantees the earliest possible start for remaining activities
  • Start time sorting leads to a lower time complexity
  • Both sorting orders give the same result

Answer: Finish time sorting guarantees the earliest possible start for remaining activities
Choosing the activity that finishes first leaves the maximum remaining time for subsequent selections, which is the essence of the greedy‑choice property.

Q. Which of the following problems cannot be solved optimally by a greedy algorithm?
  • Minimum Spanning Tree (Kruskal’s algorithm)
  • 0/1 Knapsack
  • Huffman Coding
  • Fractional Knapsack

Answer: 0/1 Knapsack
The 0/1 knapsack requires considering combinations of items; a simple greedy rule based on value/weight ratio does not guarantee optimality.

11. Further Reading & References

  • Open-source Greedy Algorithms Library (Python)