$$
\text{best ending at } j \;=\; S_j - \min_{0\le t \le j} S_t
$$
So during the scan: maintain the running sum $S$, remember the lowest prefix value $M$ seen so far, and score $S - M$ as the best block ending at the current index. This is the whole algorithm. In words: keep the lowest floor you’ve ever seen and measure how far the current prefix rises above it.
A widely used equivalent form keeps a “best sum ending here” value $E$: set $E \leftarrow \max(x_j,\; E+x_j)$ and track a global maximum. It’s the same idea written incrementally: if the running sum ever hurts you, you “reset” and start fresh at the current element.
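Here is a minimal runnable sketch of that incremental form (assuming 0-indexed Python lists and the non-empty convention):

```python
def max_subarray(x):
    """Maximum sum of a non-empty contiguous block."""
    best = e = x[0]
    for v in x[1:]:
        e = max(v, e + v)        # extend the running block, or reset at v
        best = max(best, e)      # track the global maximum
    return best

assert max_subarray([2, -3, 4, -1, 2, -5, 3]) == 5   # the block [4, -1, 2]
```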
*Walkthrough*
Sequence $x = [2,-3,4,-1,2,-5,3]$.
Initialize $S=0$, $M=0$, and $\text{best}=-\infty$. Keep the index $t$ where the current $M$ occurred so we can reconstruct the block as $(t+1)..j$.
```
j     :   1    2    3    4    5    6    7
x_j   :   2   -3    4   -1    2   -5    3
S     :   2   -1    3    2    4   -1    2
M     :   0    0   -1   -1   -1   -1   -1     (floor used at step j)
S - M :   2   -1    4    3    5    0    3
best  :   2    2    4    4    5    5    5     ← best block (3..5) = [4,-1,2]
```

You can picture $S_j$ as a hilly skyline and $M$ as the lowest ground you’ve touched so far; the answer is the largest gap between skyline and ground:
```
prefix S: 0 → 2 → -1 → 3 → 2 → 4 → -1 → 2
ground M: 0   0   -1  -1  -1  -1   -1  -1
gap S-M:  0   2    0   4   3   5    0   3
                               ^ peak gap = 5 here
```
Pseudocode (prefix-floor form):
```
best = -∞            # or x[1] if you require non-empty
S = 0
M = 0                # 0 makes the empty prefix available
t = 0                # index where M occurred (0 means before the first element)
best_i = best_j = None

for j in 1..n:
    S = S + x[j]
    if S - M > best:
        best = S - M
        best_i = t + 1
        best_j = j
    if S < M:
        M = S
        t = j

return best, (best_i, best_j)
```
*Edge cases*
When all numbers are negative, the best block is the **least negative single element**. The scan handles this automatically because $M$ keeps dropping with every step, so the maximum of $S_j-M$ happens when you take just the largest entry.
Empty-block conventions matter. If you define the answer to be strictly nonempty, initialize $\text{best}$ with $x_1$ and $E=x_1$ in the incremental form; if you allow empty blocks with sum $0$, initialize $\text{best}=0$ and $M=0$. Either way, the one-pass logic doesn’t change.
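As a concrete contrast, here is the prefix-floor scan with the empty-block convention (a sketch; 0-indexed Python):

```python
def max_subarray_allow_empty(x):
    best = s = m = 0             # best starts at 0: the empty block
    for v in x:
        s += v                   # prefix sum S_j
        best = max(best, s - m)  # best block ending here is S_j - floor
        m = min(m, s)            # lowest prefix floor seen so far
    return best

assert max_subarray_allow_empty([-3, -1, -2]) == 0   # empty block wins here
```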
*Complexity*
* Time: $O(n)$
* Space: $O(1)$
### Scheduling themes
Two classics:

- Pick as many non-overlapping intervals as possible (one room, max meetings).
- Keep maximum lateness small when jobs have deadlines.

They’re both greedy, and both easy to run by hand.
Imagine you have time intervals on a single line, and you can keep an interval only if it doesn’t overlap anything you already kept. The aim is to keep as many as possible.
A best answer keeps four intervals, for instance $(1,3),(4,7),(8,10),(10,11)$; intervals that merely touch count as compatible (think half-open $[s,e)$).
**Baseline (slow)**
Try all subsets and keep the largest that has no overlaps. That’s conceptually simple and always correct, but it’s exponential in the number of intervals, which is a non-starter for anything but tiny inputs.
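A sketch of that baseline for tiny inputs (assuming half-open $[s,e)$ intervals, so touching endpoints don’t overlap):

```python
from itertools import combinations

def best_by_enumeration(intervals):
    """Exponential baseline: examine every subset of intervals."""
    best = ()
    for r in range(len(intervals) + 1):
        for sub in combinations(intervals, r):
            disjoint = all(a[1] <= b[0] or b[1] <= a[0]
                           for i, a in enumerate(sub) for b in sub[i + 1:])
            if disjoint and len(sub) > len(best):
                best = sub
    return list(best)
```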
**Greedy rule:**
Sort by finish time and take what fits.

- Scan from earliest finisher to latest.
- Keep $(s,e)$ iff $s \ge \text{last\_end}$; then set $\text{last\_end} \leftarrow e$.
Sorted by finish:
$(1,3)\ (2,5)\ (4,7)\ (6,9)\ (8,10)\ (9,11)\ (10,11)$

Walking once: keep $(1,3)$; skip $(2,5)$; keep $(4,7)$; skip $(6,9)$; keep $(8,10)$; skip $(9,11)$; keep $(10,11)$, four kept in total.

A tiny picture helps the “finish early” idea feel natural:
```
time →
kept: [1───3)  [4──────7)  [8──10)
skip:    [2──────5)  [6──────9)[9─11)
      ending earlier leaves more open space to the right
```
Why this works: at the first place an optimal schedule would choose a later-finishing interval, swapping in the earlier finisher cannot reduce what still fits afterward, so you can push the optimal schedule to match greedy without losing size.
Handy pseudocode
```python
# Interval scheduling (max cardinality): earliest-finish-time greedy
def schedule(intervals):
    keep = []
    last_end = float("-inf")
    for s, e in sorted(intervals, key=lambda iv: iv[1]):  # by finish time
        if s >= last_end:          # compatible with everything kept so far
            keep.append((s, e))
            last_end = e
    return keep
```
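On the instance above this returns the four-interval answer (an illustrative call, reusing the `schedule` sketch just defined):

```python
intervals = [(1, 3), (2, 5), (4, 7), (6, 9), (8, 10), (9, 11), (10, 11)]
print(schedule(intervals))   # [(1, 3), (4, 7), (8, 10), (10, 11)]
```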
*Complexity*
* Time: $O(n \log n)$ to sort by finishing time; $O(n)$ scan.
* Space: $O(1)$ (beyond input storage).

Jobs and deadlines:

An optimal schedule is $J_2, J_4, J_1, J_3$. The maximum lateness there is $0$.
**Baseline (slow)**
Try all $n!$ orders, compute every job’s completion time and lateness, and take the order with the smallest $L_{\max}$. This explodes even for modest $n$.
**Greedy rule**
Order jobs by nondecreasing deadlines (earliest due date first, often called EDD). Fixing any “inversion” where a later deadline comes before an earlier one can only help the maximum lateness, so sorting by deadlines is safe.
Why this works: if two adjacent jobs are out of deadline order, swapping them never increases any completion time relative to its own deadline, and strictly improves at least one, so repeatedly fixing these inversions leads to the sorted-by-deadline order with no worse maximum lateness.
Pseudocode
```
# Minimize L_max (EDD)
sort jobs by increasing deadline d_j
t = 0; Lmax = -∞
for job j in order:
    t += p_j             # completion time C_j
    L = t - d_j
    Lmax = max(Lmax, L)
return order, Lmax
```
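A runnable version of the same rule, as a sketch (the jobs below are hypothetical $(p_j, d_j)$ pairs, and `edd` is an illustrative name):

```python
def edd(jobs):
    """Earliest-due-date order and its maximum lateness."""
    order = sorted(jobs, key=lambda job: job[1])   # nondecreasing deadlines
    t, lmax = 0, float("-inf")
    for p, d in order:
        t += p                      # completion time C_j
        lmax = max(lmax, t - d)     # lateness L_j = C_j - d_j
    return order, lmax

# Hypothetical jobs (p, d); EDD finishes them at t = 1, 3, 6, so L_max = 0.
assert edd([(3, 6), (1, 2), (2, 4)]) == ([(1, 2), (2, 4), (3, 6)], 0)
```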
*Complexity*
* Time: $O(n \log n)$ to sort by deadlines; $O(n)$ evaluation.
* Space: $O(1)$.

### Huffman coding

You have symbols that occur with known frequencies $f_i>0$ and $\sum_i f_i=1$ (if you start with counts, first normalize by their total). The goal is to assign each symbol a binary codeword so that no codeword is a prefix of another (a **prefix code**, i.e., uniquely decodable without separators), and the average length
$$
\mathbb{E}[L]=\sum_i f_i\,L_i
$$
is as small as possible. Prefix codes correspond exactly to **full binary trees** (every internal node has two children) whose leaves are the symbols and whose leaf depths equal the codeword lengths $L_i$. The **Kraft inequality** $\sum_i 2^{-L_i}\le 1$ characterizes feasibility; equality holds for full trees (so an optimal prefix code “fills” the inequality).

A valid optimal answer will be a prefix code with expected length as small as possible. We will compute the exact minimum and one optimal set of lengths $L_A,\dots,L_E$, plus a concrete codebook. (There can be multiple optimal codebooks when frequencies tie; the expected length always agrees, though individual codeword lengths and bitstrings may differ.)
**Baseline**

One conceptual baseline is to enumerate all full binary trees with five labeled leaves and pick the one minimizing $\sum f_i\,L_i$. That is correct but explodes combinatorially as the number of symbols grows. A simpler but usually suboptimal baseline is to give every symbol the same length $\lceil \log_2 5\rceil=3$. That fixed-length code has $\mathbb{E}[L]=3$.
**Greedy approach**

Huffman’s rule repeats one tiny step: always merge the two least frequent items. When you merge two “symbols” with weights $p$ and $q$, you create a parent of weight $p+q$. **Why does this change the objective by exactly $p+q$?** Every leaf in those two subtrees increases its depth (and thus its code length) by $1$, so the total increase in $\sum_i f_i L_i$ is $\sum_{\ell\in\text{subtrees}} f_\ell\cdot 1=p+q$ by definition of $p$ and $q$. Summing over all merges yields the final cost:

$$
\mathbb{E}[L]\;=\;\sum_{\text{merges}}(p+q).
$$

**Why is the greedy choice optimal?** In an optimal tree the two deepest leaves must be siblings; if not, pairing them to be siblings never increases any other depth and strictly reduces cost whenever a heavier symbol is deeper than a lighter one (an **exchange argument**: swapping depths changes the cost by $f_{\text{heavy}}-f_{\text{light}}>0$ in our favor). Collapsing those siblings into a single pseudo-symbol reduces the problem size without changing optimality, so induction finishes the proof. (Ties can be broken arbitrarily; all tie-breaks achieve the same minimum $\mathbb{E}[L]$.)
Start with the multiset $\{0.40, 0.20, 0.20, 0.10, 0.10\}$. At each line, merge the two smallest weights and add their sum to the running cost.

```
{0.40, 0.20, 0.20, 0.10, 0.10}        cost so far: 0.00
merge 0.10 + 0.10 → 0.20              cost: 0.20
{0.40, 0.20, 0.20, 0.20}
merge 0.20 + 0.20 → 0.40              cost: 0.60
{0.40, 0.40, 0.20}
merge 0.20 + 0.40 → 0.60              cost: 1.20
{0.60, 0.40}
merge 0.60 + 0.40 → 1.00              cost: 2.20
multiset becomes {1.00} (done)
```
So the optimal expected length is $\boxed{\mathbb{E}[L]=2.20}$ bits per symbol. This already beats the naive fixed-length baseline $3$. It also matches the information-theoretic bound $H(f)\le \mathbb{E}[L]<H(f)+1$, since the entropy here is $H\approx 2.1219$.
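The running cost can be computed mechanically with a min-heap; here is a minimal sketch (`huffman_cost` is an illustrative name, not from the text):

```python
import heapq

def huffman_cost(freqs):
    """Expected code length = sum of all merge weights."""
    heap = list(freqs)
    heapq.heapify(heap)
    cost = 0.0
    while len(heap) > 1:
        p = heapq.heappop(heap)      # two least frequent weights
        q = heapq.heappop(heap)
        cost += p + q                # this merge deepens every leaf under p and q
        heapq.heappush(heap, p + q)
    return cost

assert abs(huffman_cost([0.40, 0.20, 0.20, 0.10, 0.10]) - 2.20) < 1e-9
```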
Now assign actual lengths. Record who merged with whom:
* Step 1 merges $D(0.10)$ and $E(0.10)$ → those two become siblings.
* Step 2 merges $B(0.20)$ and $C(0.20)$ → those two become siblings.
* Step 3 merges the pair $D\!E(0.20)$ with $A(0.40)$.
* Step 4 merges the pair from step 3 with the pair $B\!C(0.40)$.
Depths follow directly (each merge adds one level to its members):
$$
L_A=2,\quad L_B=L_C=2,\quad L_D=L_E=3.
$$
Check the Kraft sum $3\cdot 2^{-2}+2\cdot 2^{-3}=3/4+1/4=1$ and the cost $0.4\cdot2+0.2\cdot2+0.2\cdot2+0.1\cdot3+0.1\cdot3=2.2$.
A tidy tree (weights shown for clarity):
```
[1.00]
 +--0--> [0.60]
 |        +--0--> A(0.40)
 |        `--1--> [0.20]
 |                 +--0--> D(0.10)
 |                 `--1--> E(0.10)
 `--1--> [0.40]
          +--0--> B(0.20)
          `--1--> C(0.20)
```
One concrete codebook arises by reading left edges as 0 and right edges as 1 (the left/right choice is arbitrary; flipping all bits in a subtree yields an equivalent optimal code):
* $A \mapsto 00$
* $B \mapsto 10$
* $C \mapsto 11$
* $D \mapsto 010$
* $E \mapsto 011$
You can verify the prefix property immediately and recompute $\mathbb{E}[L]$ from these lengths to get $2.20$ again. (From these lengths you can also construct the **canonical Huffman code**, which orders codewords lexicographically; handy for compactly storing the codebook.)
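A quick sanity check of both claims (a sketch; the tables just restate the codebook above):

```python
code  = {"A": "00", "B": "10", "C": "11", "D": "010", "E": "011"}
freqs = {"A": 0.40, "B": 0.20, "C": 0.20, "D": 0.10, "E": 0.10}

# Prefix-free: in sorted order, a word that prefixes another would
# immediately precede some word it prefixes.
words = sorted(code.values())
assert all(not b.startswith(a) for a, b in zip(words, words[1:]))

# Expected length recomputed from the codeword lengths.
assert abs(sum(freqs[s] * len(w) for s, w in code.items()) - 2.20) < 1e-9
```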
*Complexity*
* Time: $O(k \log k)$ using a min-heap over $k$ symbol frequencies (each of the $k-1$ merges performs two extractions and one insertion).
* Space: $O(k)$ for the heap and $O(k)$ for the resulting tree (plus $O(k)$ for an optional map from symbols to codewords).