## Greedy Algorithms
Greedy methods construct a solution piece by piece, always choosing the currently best-looking option according to a simple rule. The subtlety is not the rule itself but the proof that local optimality extends to global optimality. Two proof tools do most of the work: exchange arguments (you can swap an optimal solution’s first “deviation” back to the greedy choice without harm) and loop invariants (you maintain a statement that pins down exactly what your partial solution guarantees at each step).
You’re standing on square 0 of a line of squares $0,1,\dots,n-1$.
Each square $i$ tells you how far you’re allowed to jump forward from there: a number $a[i]$. From $i$, you can jump to any square $i+1, i+2, \dots, i+a[i]$. The goal is to decide whether you can ever reach the last square, and, if not, what the furthest square is that you can reach.
**Example inputs and outputs**
Input array: `a = [3, 1, 0, 0, 4, 1]`
There are 6 squares (0 through 5).
Correct output: you cannot reach the last square; the furthest you can get is square `3`.
**Baseline (slow)**
Think “paint everything I can reach, one wave at a time.”
Running the waves on the example:

```
start:  can reach {0}
from 0: can reach {1, 2, 3}        (a[0]=3)
from 1: can reach {2} → no change  (a[1]=1)
from 2: can reach {} → no change   (a[2]=0)
from 3: can reach {} → no change   (a[3]=0)
done: no new squares → furthest is 3, last is unreachable
```
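To make the wave idea concrete, here is a minimal Python sketch of this baseline (the function name and the set-based bookkeeping are my own choices, not from the notes):

```python
def reachable_squares(a):
    """Paint everything reachable from square 0, one wave at a time."""
    n = len(a)
    reachable = {0}
    frontier = {0}
    while frontier:
        new = set()
        for i in frontier:
            # from i you may jump to i+1, ..., i+a[i] (clipped at the last square)
            for j in range(i + 1, min(i + a[i], n - 1) + 1):
                if j not in reachable:
                    reachable.add(j)
                    new.add(j)
        frontier = new
    return reachable

squares = reachable_squares([3, 1, 0, 0, 4, 1])
print(max(squares))  # 3, so the last square (5) is unreachable
```

In the worst case each wave re-examines many squares, so the work can grow quadratically; the one-pass scan below avoids that entirely.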
**How it works**
Carry one number as you sweep left to right: `F`, the furthest square you can reach **so far**.
Rule of thumb: when you arrive at square $i$, first check that $i \le F$ (otherwise you are stuck and $F$ is the furthest reachable square); if you can stand there, update $F \leftarrow \max(F,\, i + a[i])$. On the example, $F$ becomes $3$ at square $0$, stays at $3$ through squares $1$–$3$, and the scan stops at $i=4$ because $4 > F$.
Final state: `F = 3`, which means the furthest reachable square is 3. Since `F < n-1 = 5`, the last square is not reachable.
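A one-pass Python sketch of the same rule (the function name is illustrative, not from the notes):

```python
def furthest_reachable(a):
    """Greedy sweep: F is the furthest square reachable so far."""
    F = 0
    for i, jump in enumerate(a):
        if i > F:             # square i lies beyond everything reachable: stuck
            break
        F = max(F, i + jump)
    return min(F, len(a) - 1)

a = [3, 1, 0, 0, 4, 1]
F = furthest_reachable(a)
print(F, F >= len(a) - 1)     # 3 False → the last square is unreachable
```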
**Summary**
* Time: $O(n)$ (single left-to-right pass)
* Space: $O(1)$
### Minimum spanning trees
You’ve got a connected weighted graph and you want the cheapest way to connect **all** its vertices without any cycles—that’s a minimum spanning tree (MST). Think “one network of cables that touches every building, with the total cost as small as possible.”
**Example inputs and outputs**
Vertices: $V=\{A,B,C,D,E\}$
Total weight $=1+2+3+4=10$.
You can’t do better: any cheaper set of 4 edges would either miss a vertex or create a cycle.
**Baseline (slow)**
Enumerate every spanning tree and pick the one with the smallest total weight. That’s conceptually simple—“try all combinations of $n-1$ edges that connect everything and have no cycles”—but it explodes combinatorially. Even medium graphs have an astronomical number of spanning trees, so this approach is only good as a thought experiment.
**How it works**
Both fast methods rely on two facts:
* **Cut rule (safe to add):** for any cut $(S, V\setminus S)$, the cheapest edge that crosses the cut appears in some MST. Intuition: if your current partial connection is on one side, the cheapest bridge to the other side is never a bad idea.
* **Cycle rule (safe to skip):** in any cycle, the most expensive edge is never in an MST. Intuition: if you already have a loop, drop the priciest link and you’ll still be connected but strictly cheaper.
### Kruskal's method

Sort edges from lightest to heaviest; walk down that list and keep an edge if it connects two **different** components. Stop when you have $n-1$ edges.
Sorted edges by weight:
We’ll keep a running view of the components; initially each vertex is alone.
Total $=10$. Every later edge would create a cycle and is skipped by the cycle rule.
**Complexity**
* Time: $O(E \log E)$ to sort the edges, plus near-constant $\alpha(V)$ amortized time per DSU union; often written $O(E \log V)$ since $E\le V^2$.
* Space: $O(V)$ for the disjoint-set structure.
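A compact sketch of this edge-sorting rule with a small disjoint-set union; the `0..n-1` vertex labels and function names are my own, not the notes’:

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) tuples over vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        # walk to the root, compressing the path as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):          # lightest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                       # keep it only if it joins two different components
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
            if len(mst) == n - 1:
                break
    return mst, total
```

Union by rank would tighten the amortized bound further, but for a sketch the path-compressed `find` is enough.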
### Prim's method
**Example inputs and outputs**
Same graph and target: produce any MST of total weight $10$.
**How it works**
Start from any vertex; repeatedly add the lightest edge that leaves the current tree to bring in a new vertex. Stop when all vertices are in.
Let’s start from $A$. The “tree” grows one cheapest boundary edge at a time.
Edges chosen: exactly the same four as Kruskal, total $=10$.
Why did step 4 grab a weight-3 edge after we already took a 4? Because earlier that 3 wasn’t **available**—it didn’t cross from the tree to the outside until $C$ joined the tree. Prim never regrets earlier picks because of the cut rule: at each moment it adds the cheapest bridge from “inside” to “outside,” and that’s always safe.
**Complexity**
* Time: $O(E \log V)$ with a binary heap and adjacency lists; $O(E + V\log V)$ with a Fibonacci heap.
* Space: $O(V)$ for keys/parents and visited set.
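A heap-based sketch of this grow-the-tree rule; the adjacency-list format and names are assumptions for the sake of the example:

```python
import heapq

def prim_total_weight(adj, start):
    """adj: {u: [(weight, v), ...]} with an entry for every vertex.
    Repeatedly adds the cheapest edge crossing from the tree to the outside."""
    in_tree = {start}
    boundary = list(adj[start])       # candidate edges leaving the current tree
    heapq.heapify(boundary)
    total = 0
    while boundary and len(in_tree) < len(adj):
        w, v = heapq.heappop(boundary)
        if v in in_tree:              # stale entry: this edge no longer crosses the cut
            continue
        in_tree.add(v)
        total += w
        for edge in adj[v]:
            if edge[1] not in in_tree:
                heapq.heappush(boundary, edge)
    return total
```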
### Shortest paths with non-negative weights
You’ve got a weighted graph and a starting node $s$. Every edge has a cost $\ge 0$. The task is to find the cheapest cost to reach every node from $s$, and a cheapest route for each if you want it.
**Example inputs and outputs**
Nodes: $A,B,C,D,E$
Correct shortest-path costs from $A$:
* $d(D)=4$ via $A\!\to\!B\!\to\!D$
* $d(E)=4$ via $A\!\to\!B\!\to\!C\!\to\!E$
**Baseline (slow)**
One safe—but slower—approach is to relax all edges repeatedly until nothing improves. Think of it as “try to shorten paths by one edge at a time, do that $|V|-1$ rounds.” This eventually converges to the true shortest costs, but it touches every edge many times, so its work is about $|V|\cdot|E|$. It also handles negative edges, which is why it has to be cautious and keep looping.
**How it works**
This is Dijkstra’s method: carry two sets and a distance label for each node.
**Complexity**

* Time: $O((V+E)\log V)$ with a binary heap (often written $O(E \log V)$ when $E\ge V$).
* Space: $O(V)$ for distances, parent pointers, and heap entries.
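A minimal sketch of the method with a binary heap; the graph format and names are assumptions, and stale heap entries are handled by lazy deletion:

```python
import heapq

def dijkstra(adj, s):
    """adj: {u: [(weight, v), ...]} with non-negative weights. Returns cheapest costs from s."""
    dist = {s: 0}
    done = set()
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:                 # already finalized with a smaller label
            continue
        done.add(u)
        for w, v in adj.get(u, ()):
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist
```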
### Maximum contiguous sum
You’re given a list of numbers laid out in a line. You may pick one **contiguous** block, and you want that block’s sum to be as large as possible.
**Example inputs and outputs**
Take $x = [\,2,\,-3,\,4,\,-1,\,2,\,-5,\,3\,]$.
A best block is $[\,4,\,-1,\,2\,]$. Its sum is $5$.
So the correct output is “maximum sum $=5$” and one optimal segment is positions $3$ through $5$ (1-based).
**Baseline (slow)**
Try every possible block and keep the best total. To sum any block $i..j$ quickly, precompute **prefix sums** $S_0=0$ and $S_j=\sum_{k=1}^j x_k$. Then
$$
\sum_{k=i}^{j} x_k \;=\; S_j - S_{i-1}.
$$
Loop over all $j$ and all $i\le j$, compute $S_j-S_{i-1}$, and take the maximum. This is easy to reason about and always correct, but it does $O(n^2)$ block checks.
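A direct Python sketch of that $O(n^2)$ baseline, using the nonempty-block convention (names are mine):

```python
def max_block_bruteforce(x):
    """Prefix sums S, then the best of S[j] - S[i-1] over all blocks i..j."""
    n = len(x)
    S = [0] * (n + 1)
    for k in range(1, n + 1):
        S[k] = S[k - 1] + x[k - 1]          # S[k] = x_1 + ... + x_k in 1-based terms
    best = x[0]
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            best = max(best, S[j] - S[i - 1])
    return best

print(max_block_bruteforce([2, -3, 4, -1, 2, -5, 3]))  # 5
```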
**How it works**
Walk left to right once and carry two simple numbers.
When all numbers are negative, the best block is the **least negative single element**.
Empty-block conventions matter. If you define the answer to be strictly nonempty, initialize $\text{best}$ with $x_1$ and $E=x_1$ in the incremental form; if you allow empty blocks with sum $0$, initialize $\text{best}=0$ and $M=0$. Either way, the one-pass logic doesn’t change.
**Summary**
* Time: $O(n)$
* Space: $O(1)$
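The whole one-pass scan fits in a few lines. This sketch uses the nonempty-block convention from above, with `E` as the best sum of a block ending at the current position:

```python
def max_contiguous_sum(x):
    """One pass: E = best block ending here, best = best block seen so far."""
    best = E = x[0]
    for v in x[1:]:
        E = max(v, E + v)        # extend the running block or restart at v
        best = max(best, E)
    return best

print(max_contiguous_sum([2, -3, 4, -1, 2, -5, 3]))  # 5
```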
### Scheduling themes
Two everyday scheduling goals keep popping up. One tries to pack as many non-overlapping intervals as possible, like booking the most meetings in a single room. The other tries to keep lateness under control when jobs have deadlines, like finishing homework so the worst overrun is as small as possible. Both have crisp greedy rules, and both are easy to run by hand once you see them.
Imagine you have time intervals on a single line, and you can keep an interval only if it doesn’t overlap anything you already kept. The aim is to keep as many as possible.
A best answer keeps four intervals, for instance $(1,3),(4,7),(8,10),(10,11)$. I wrote $(10,11)$ for clarity even though the original end was $11$; think half-open $[s,e)$ if you want “touching” to be allowed.
**Baseline (slow)**
Try all subsets and keep the largest that has no overlaps. That’s conceptually simple and always correct, but it’s exponential in the number of intervals, which is a non-starter for anything but tiny inputs.
**How it works**
Sort by finishing time, then walk once from earliest finisher to latest. Keep an interval if its start is at least the end time of the last one you kept. Ending earlier leaves more room for the future, and that is the whole intuition.
Why this works in one sentence: at the first place an optimal schedule would choose a later-finishing interval, swapping in the earlier finisher cannot reduce what still fits afterward, so you can push the optimal schedule to match greedy without losing size.
**Complexity**
* Time: $O(n \log n)$ to sort by finishing time; $O(n)$ scan.
* Space: $O(1)$ (beyond input storage).
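A sketch of the earliest-finish rule. The interval list below is an illustrative input consistent with the example answer, not the exact list from the notes:

```python
def max_non_overlapping(intervals):
    """Sort by finishing time; keep an interval whose start is at least the last kept end."""
    kept = []
    last_end = float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:    # half-open [start, end): touching is allowed
            kept.append((start, end))
            last_end = end
    return kept

print(max_non_overlapping([(1, 3), (2, 5), (4, 7), (6, 9), (8, 10), (10, 11)]))
# [(1, 3), (4, 7), (8, 10), (10, 11)] → four intervals kept
```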
### Minimize the maximum lateness
Now think of $n$ jobs, all taking the same amount of time (say one unit). Each job $i$ has a deadline $d_i$. When you run them in some order, the completion time of the $k$-th job is $C_k=k$ (since each takes one unit), and its lateness is
$$
L_k = C_k - d_k .
$$
Negative values mean you finished early; the quantity to control is the worst lateness $L_{\max}=\max_i L_i$. The goal is to order the jobs so $L_{\max}$ is as small as possible.
**Example inputs and outputs**
Jobs and deadlines:
An optimal schedule is $J_2,J_4, J_1, J_3$. The maximum lateness there is $0$.
**Baseline (slow)**
Try all $n!$ orders, compute every job’s completion time and lateness, and take the order with the smallest $L_{\max}$. This explodes even for modest $n$.
**How it works**
Order jobs by nondecreasing deadlines (earliest due date first, often called EDD). Fixing any “inversion” where a later deadline comes before an earlier one can only help the maximum lateness, so sorting by deadlines is safe.
Why this works in one sentence: if two adjacent jobs are out of deadline order, swapping them strictly reduces the earlier-deadline job’s lateness and leaves the later-deadline job no worse than the pair’s previous maximum, so repeatedly fixing these inversions reaches the sorted-by-deadline order with no worse maximum lateness.
**Complexity**
* Time: $O(n \log n)$ to sort by deadlines; $O(n)$ evaluation.
* Space: $O(1)$.
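For unit-length jobs the EDD rule is one sort plus one pass. This sketch returns both the order and the worst lateness (names are illustrative, not from the notes):

```python
def min_max_lateness(deadlines):
    """Unit-length jobs: schedule by earliest deadline first; the k-th job finishes at time k."""
    order = sorted(range(len(deadlines)), key=lambda j: deadlines[j])
    worst = float("-inf")
    for k, j in enumerate(order, start=1):   # k is the completion time C_k
        worst = max(worst, k - deadlines[j])
    return order, worst
```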
### Huffman coding
You have symbols that occur with known frequencies $f_i>0$ and $\sum_i f_i=1$. The goal is to assign each symbol a binary codeword so that no codeword is a prefix of another (a prefix code), and the average length
$$
\mathbb{E}[L] \;=\; \sum_i f_i\,L_i
$$
is as small as possible. Prefix codes exactly correspond to full binary trees whose leaves are the symbols and whose leaf depths are the codeword lengths $L_i$. The Kraft inequality $\sum_i 2^{-L_i}\le 1$ is the feasibility condition; equality holds for full trees.
**Example inputs and outputs**
Frequencies:
A valid optimal answer will be a prefix code with expected length as small as possible. We will compute the exact minimum and one optimal set of lengths $L_A,\dots,L_E$, plus a concrete codebook.
**Baseline (slow)**
One conceptual baseline is to enumerate all full binary trees with five labeled leaves and pick the one minimizing $\sum f_i\,L_i$. That is correct but explodes combinatorially as the number of symbols grows. A simpler but usually suboptimal baseline is to give every symbol the same length $\lceil \log_2 5\rceil=3$. That fixed-length code has $\mathbb{E}[L]=3$.
**How it works**
Huffman’s rule repeats one tiny step: always merge the two least frequent items. When you merge two “symbols” with weights $p$ and $q$, you create a parent of weight $p+q$. The act of merging adds exactly $p+q$ to the objective $\mathbb{E}[L]$ because every leaf inside those two subtrees becomes one level deeper. Summing over all merges yields the final cost:
One concrete codebook arises by reading left edges as 0 and right edges as 1:
You can verify the prefix property immediately and recompute $\mathbb{E}[L]$ from these lengths to get $2.20$ again.
**Complexity**
* Time: $O(k \log k)$ using a min-heap over $k$ symbol frequencies.
* Space: $O(k)$ for the heap and $O(k)$ for the resulting tree.
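A sketch of the repeated-merge rule that tracks only the codeword lengths; the tie-breaking counter is an implementation detail of mine, not part of the notes:

```python
import heapq
import itertools

def huffman_lengths(freqs):
    """freqs: {symbol: frequency}. Merge the two lightest items until one tree remains."""
    tiebreak = itertools.count()           # keeps heapq from ever comparing symbol lists
    heap = [(f, next(tiebreak), [s]) for s, f in freqs.items()]
    heapq.heapify(heap)
    length = {s: 0 for s in freqs}
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)
        f2, _, group2 = heapq.heappop(heap)
        for s in group1 + group2:          # every leaf under the merge moves one level deeper
            length[s] += 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), group1 + group2))
    return length

# expected length: sum(freqs[s] * L for s, L in huffman_lengths(freqs).items())
```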
### When greedy fails (and how to quantify “not too bad”)
The $0\text{–}1$ knapsack with arbitrary weights defeats the obvious density-based rule. A small, dense item can block space needed for a medium-density item that pairs perfectly with a third, leading to a globally superior pack. Weighted interval scheduling similarly breaks the “earliest finish” rule; taking a long, heavy meeting can beat two short light ones that finish earlier.
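A tiny worked counterexample makes the knapsack failure concrete (the numbers here are mine, chosen only to illustrate the trap): with capacity 10, greedy-by-density grabs the densest item and blocks the better pair.

```python
def greedy_by_density(capacity, items):
    """items: list of (value, weight). Take items in decreasing value/weight order while they fit."""
    total = 0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= capacity:
            capacity -= weight
            total += value
    return total

items = [(60, 6), (45, 5), (45, 5)]      # densities 10, 9, 9
print(greedy_by_density(10, items))      # 60: the dense item blocks both others
# the optimal pack takes the two (45, 5) items for a total value of 90
```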