Commit fd698c8

Fixed complexity analysis and added good pivot condition for 3way
1 parent aa48b9c commit fd698c8

5 files changed

+71
-44
lines changed

out/production/src/src/dataStructures/disjointSet/quickFind/quick_find.iml

Lines changed: 0 additions & 7 deletions
This file was deleted.

src/algorithms/sorting/quickSort/lomuto/QuickSort.java

Lines changed: 5 additions & 9 deletions
@@ -10,8 +10,8 @@
  * QuickSort is a divide-and-conquer sorting algorithm. The basic idea behind Quicksort is to choose a pivot element,
  * places it in its correct position in the sorted array, and then recursively sorts the sub-arrays on either side of
  * the pivot. When we introduce randomization in pivot selection, every element has equal probability of being
- * selected as the pivot. This means the chance of the smallest or largest element getting chosen as the pivot is
- * decreased, so we reduce the probability of encountering the worst-case scenario.
+ * selected as the pivot. This means the chance of an extreme element getting chosen as the pivot is decreased, so we
+ * reduce the probability of encountering the worst-case scenario of imbalanced partitioning.
  *
  * Implementation Invariant:
  * The pivot is in the correct position, with elements to its left being <= it, and elements to its right being > it.
@@ -22,9 +22,9 @@
  *
  * Complexity Analysis:
  * Time:
- * - Worst case (poor choice of pivot): O(n^2)
- * - Average case: O(nlogn)
- * - Best case (balanced pivot): O(nlogn)
+ * - Expected worst case (poor choice of pivot): O(n^2)
+ * - Expected average case: O(nlogn)
+ * - Expected Best case (balanced pivot): O(nlogn)
  *
  * In the best case of a balanced pivot, the partitioning process divides the array in half, which leads to log n
  * levels of recursion. Given a sub-array of length m, the time complexity of the partition subroutine is O(m) as we
@@ -34,10 +34,6 @@
  * Even in the average case where the chosen pivot partitions the array by a fraction, there will still be log n levels
  * of recursion. (e.g. T(n) = T(n/10) + T(9n/10) + O(n) => O(nlogn))
  *
- * In the worst case where the pivot selected is consistently the smallest or biggest element in the array, the
- * partitioning of the array around the pivot will be extremely unbalanced, leading to a recurrence relation of:
- * T(n) = T(n-1) + O(n) => O(n^2). We have reduced the likelihood of this happening by randomising pivot selection.
- *
  * However, if there are many duplicates in the array, e.g. {1, 1, 1, 1}, the 1st pivot will be placed in the 3rd idx,
  * and 2nd pivot in 2nd idx, 3rd pivot in the 1st idx and 4th pivot in the 0th idx. As we observe, the presence of many
  * duplicates in the array leads to extremely unbalanced partitioning, leading to a O(n^2) time complexity.
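
The duplicate-heavy case described in the Lomuto comment can be traced with a small sketch. This is not the repository's QuickSort.java: the class `LomutoDuplicatesDemo` and the method `pivotPositions` are hypothetical names, and the last element of each sub-array is used as the pivot (rather than a random one) so that the trace is deterministic. When every element is equal, the choice of pivot should not change the outcome. The trace records one index per partition call, so the final one-element sub-array (index 0), which needs no partitioning, does not appear in it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not the QuickSort.java touched by this commit.
class LomutoDuplicatesDemo {

    // Lomuto partition of arr[start..end] using arr[end] as the pivot (an assumption
    // made for determinism). Afterwards, elements left of the returned index are
    // <= pivot and elements right of it are > pivot.
    static int partition(int[] arr, int start, int end) {
        int pivot = arr[end];
        int boundary = start; // next free slot for an element <= pivot
        for (int j = start; j < end; j++) {
            if (arr[j] <= pivot) {
                swap(arr, boundary, j);
                boundary++;
            }
        }
        swap(arr, boundary, end); // move the pivot into its final position
        return boundary;
    }

    static void swap(int[] arr, int i, int j) {
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }

    // Sorts arr[start..end] and records the final index of each pivot.
    static void quickSort(int[] arr, int start, int end, List<Integer> positions) {
        if (start < end) {
            int pIdx = partition(arr, start, end);
            positions.add(pIdx);
            quickSort(arr, start, pIdx - 1, positions);
            quickSort(arr, pIdx + 1, end, positions);
        }
    }

    static List<Integer> pivotPositions(int[] arr) {
        List<Integer> positions = new ArrayList<>();
        quickSort(arr, 0, arr.length - 1, positions);
        return positions;
    }

    public static void main(String[] args) {
        // Each partition peels off a single element: T(n) = T(n-1) + O(n).
        System.out.println(pivotPositions(new int[] {1, 1, 1, 1})); // prints [3, 2, 1]
    }
}
```

Because all elements satisfy `<= pivot`, every pivot lands at the right end of its sub-array, which is the imbalanced pattern the comment describes.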

src/algorithms/sorting/quickSort/paranoid/QuickSort.java

Lines changed: 15 additions & 16 deletions
@@ -9,23 +9,23 @@
  * This is basically Lomuto's QuickSort, with an additional check to guarantee a good pivot.
  *
  * Complexity Analysis:
- * Time:
- * - Worst case: does not terminate
- * - Average case: O(nlogn)
- * - Best case: O(nlogn)
+ * Time: (this analysis assumes the absence of many duplicates in our array)
+ * - Expected worst case: O(nlogn)
+ * - Expected average case: O(nlogn)
+ * - Expected best case: O(nlogn)
  *
- * The additional check to guarantee a good pivot guards against the worst case scenario where the chosen pivot is
- * the smallest or biggest element in the array, leading to an extremely unbalanced partitioning. Since the chosen
- * pivot has to at least partition the array into a 1/10, 9/10 split, the recurrence relation will be:
- * T(n) = T(n/10) + T(9n/10) + n(# iterations of pivot selection).
+ * The additional check to guarantee a good pivot guards against the worst case scenario where the chosen pivot results
+ * in an extremely imbalanced partitioning. Since the chosen pivot has to at least partition the array into a
+ * 1/10, 9/10 split, the recurrence relation will be: T(n) = T(n/10) + T(9n/10) + n(# iterations of pivot selection).
  *
  * The number of iterations of pivot selection is expected to be <2 (more precisely, 1.25). This is because
  * P(good pivot) = 8/10. Expected number of tries to get a good pivot = 1 / P(good pivot) = 10/8 = 1.25.
  *
- * Therefore, the average time-complexity is: T(n) = T(n/10) + T(9n/10) + 1.25n => O(nlogn).
+ * Therefore, the expected time-complexity is: T(n) = T(n/10) + T(9n/10) + 1.25n => O(nlogn).
  *
- * However, the presence of this additional check and repeating pivot selection means that if we have an array of
- * length n >= 10 containing all duplicates of the same number, any pivot we pick will be a bad pivot and we will
+ * Edge case: does not terminate
+ * The presence of this additional check and repeating pivot selection means that if we have an array of
+ * length n >= 10 containing all/many duplicates of the same number, any pivot we pick will be a bad pivot and we will
  * enter an infinite loop of repeating pivot selection.
  *
  * Space:
@@ -119,13 +119,12 @@ private static int random(int start, int end) {

     /**
      * Checks if the given pivot index is a good pivot for the QuickSort algorithm.
-     * A good pivot is defined as an index that helps avoid worst-case behavior in QuickSort.
+     * A good pivot helps avoid worst-case behavior in QuickSort.
+     *
+     * For arrays of length greater than or equal to 10, a good pivot leaves at least 1/10th of the array on each side.
      *
-     * For arrays of length greater than or equal to 10, a good pivot is an index that leaves at least
-     * 1/10th of the array on each side.
-     * *
      * If n < 10, such a pivot condition would be meaningless, therefore always return true. This would cause
-     * the worst case recurrence relation to be T(n) = T(n-1) + O(n) => O(n^2) for small subarrays, but the overall
+     * the worst case recurrence relation to be T(n) = T(n-1) + O(n) => O(n^2) for small sub-arrays, but the overall
      * asymptotic time complexity of Paranoid QuickSort is still O(nlogn).
      *
      * @param pIdx The index to be checked for being a good pivot.
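
The body of the two-way pivot check is not shown in this diff, so the following is only a sketch of the condition its Javadoc states, together with a count that illustrates where P(good pivot) = 8/10 comes from. The class `ParanoidPivotDemo`, the method `countGoodPivots`, and the three-argument signature of `isGoodPivot` are assumptions, as is the reading of "1/10th of the array" as the integer quotient `n / 10`.

```java
// Hypothetical demo class; the actual isGoodPivot body is not part of this diff.
class ParanoidPivotDemo {

    // pIdx is the pivot's final index within arr[start..end] after partitioning.
    static boolean isGoodPivot(int pIdx, int start, int end) {
        int n = end - start + 1;
        if (n < 10) {
            return true; // the 1/10 condition is meaningless for tiny sub-arrays
        }
        int leftSize = pIdx - start; // elements strictly left of the pivot
        int rightSize = end - pIdx;  // elements strictly right of the pivot
        int minSide = n / 10;        // assumption: "1/10th" means floor(n / 10)
        return leftSize >= minSide && rightSize >= minSide;
    }

    // Counts how many of the n possible final pivot positions pass the check.
    static int countGoodPivots(int n) {
        int good = 0;
        for (int pIdx = 0; pIdx < n; pIdx++) {
            if (isGoodPivot(pIdx, 0, n - 1)) {
                good++;
            }
        }
        return good;
    }

    public static void main(String[] args) {
        int n = 1000;
        int good = countGoodPivots(n);
        System.out.println(good + " of " + n + " positions are good"); // prints 800 of 1000 positions are good
        // With distinct elements and a uniformly random pivot, the number of tries
        // is geometric, so the expected count is 1 / P(good) = n / good.
        System.out.println((double) n / good); // prints 1.25
    }
}
```

The count assumes distinct elements, matching the comment's note that the analysis excludes arrays with many duplicates.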

src/algorithms/sorting/quickSort/threeWayPartitioning/QuickSort.java

Lines changed: 49 additions & 11 deletions
@@ -1,8 +1,8 @@
 package src.algorithms.sorting.quickSort.threeWayPartitioning;

 /**
- * Here, we are implementing QuickSort with three-way partitioning where we sort the array in increasing (or more
- * precisely, non-decreasing) order.
+ * Here, we are implementing Paranoid QuickSort with three-way partitioning where we sort the array in increasing (or
+ * more precisely, non-decreasing) order.
  *
  * Three-way partitioning is used in QuickSort to tackle the scenario where there are many duplicate elements in the
  * array being sorted.
@@ -17,19 +17,14 @@
  *
  * Complexity Analysis:
  * Time:
- * - Worst case (poor choice of pivot): O(n^2)
+ * - Worst case: O(nlogn)
  * - Average case: O(nlogn)
  * - Best case: O(nlogn)
  *
  * By isolating the elements equal to the pivot into their correct positions during the partitioning step, three-way
  * partitioning efficiently handles duplicates, preventing the presence of many duplicates in the array from causing
  * the time complexity of QuickSort to degrade to O(n^2).
  *
- * In the worst case where the pivot selected is consistently the smallest or biggest element in the array, the
- * partitioning of the array around the pivot will be extremely unbalanced, leading to a recurrence relation of:
- * T(n) = T(n-1) + O(n) => O(n^2). However, the likelihood of this happening is extremely low since pivot selection is
- * randomised.
- *
  * Space:
  * - O(1) since sorting is done in-place
  */
@@ -56,9 +51,13 @@ public static void sort(int[] arr) {
     public static void quickSort(int[] arr, int start, int end) {
         if (start < end) {
             int[] newIdx = partition(arr, start, end);
-            if (newIdx != null) {
-                quickSort(arr, start, newIdx[0]);
-                quickSort(arr, newIdx[1], end);
+            if (isGoodPivot(newIdx[0], newIdx[1], start, end)) {
+                if (newIdx != null) {
+                    quickSort(arr, start, newIdx[0]);
+                    quickSort(arr, newIdx[1], end);
+                }
+            } else {
+                quickSort(arr, start, end);
             }
         }
     }
@@ -144,4 +143,43 @@ private static int random(int start, int end) {
         return (int) (Math.random() * (end - start + 1)) + start;
     }

+    /**
+     * Checks if the pivot is a good pivot for the QuickSort algorithm.
+     * A good pivot helps avoid worst-case behavior in QuickSort.
+     *
+     * Since we have three-way partitioning, we cannot use 1/10, 9/10 split of the array as our good pivot condition.
+     * Note that our goal here is to ensure the sizes of the sub-arrays QuickSort is to recurse on are roughly the same
+     * to ensure that our partitioning is not too imbalanced. The pivot condition we chose is: the larger sub-array can
+     * be at most 9 times the size of the smaller sub-array.
+     *
+     * If n < 10, such a pivot condition would be meaningless, therefore always return true. This would cause
+     * the worst case recurrence relation to be T(n) = T(n-1) + O(n) => O(n^2) for small sub-arrays, but the overall
+     * asymptotic time complexity of Paranoid QuickSort is still O(nlogn).
+     *
+     * For an all-duplicates array, all pivots will be considered good pivots, therefore return true.
+     *
+     * @param firstPIdx The ending index of the < portion of the sub-array.
+     * @param secondPIdx The starting index of the > portion of the sub-array.
+     * @param start The starting index of the current sub-array.
+     * @param end The ending index of the current sub-array.
+     * @return True if the given index is a good pivot, false otherwise.
+     */
+    public static boolean isGoodPivot(int firstPIdx, int secondPIdx, int start, int end) {
+        int n = end - start + 1;
+        if (firstPIdx >= start || secondPIdx <= end) {
+            if (end - secondPIdx + 1 > 0) { // avoid division by zero
+                double ratio = (double) (firstPIdx - start + 1) / (end - secondPIdx + 1);
+                if (n >= 10) {
+                    return ratio >= 1.0 / 9.0 && ratio <= 9;
+                } else {
+                    return true;
+                }
+            } else { // ratio is infinite, imbalanced partition => bad pivot
+                return false;
+            }
+        } else { // all duplicates array
+            return true;
+        }
+    }
+
 }
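
The condition stated in the new Javadoc ("the larger sub-array can be at most 9 times the size of the smaller sub-array") can also be written with integer arithmetic, which sidesteps the division-by-zero guard. The sketch below is an alternative formulation under stated assumptions, not the commit's code: the class `ThreeWayPivotDemo` and the helper `isBalanced` are hypothetical, and unlike the committed method it returns true for every sub-array with n < 10, as the Javadoc describes.

```java
// Hypothetical demo class; an alternative formulation, not the committed isGoodPivot.
class ThreeWayPivotDemo {

    // leftSize:  number of elements strictly less than the pivot.
    // rightSize: number of elements strictly greater than the pivot.
    // n:         length of the sub-array that was partitioned.
    static boolean isBalanced(int leftSize, int rightSize, int n) {
        if (n < 10) {
            return true; // condition is meaningless for small sub-arrays
        }
        if (leftSize == 0 && rightSize == 0) {
            return true; // every element equals the pivot; nothing left to recurse on
        }
        int smaller = Math.min(leftSize, rightSize);
        int larger = Math.max(leftSize, rightSize);
        // Multiplying instead of dividing avoids any division by zero.
        return larger <= 9L * smaller;
    }

    // Same parameter meaning as in the commit's Javadoc: firstPIdx is the last index
    // of the "<" portion and secondPIdx is the first index of the ">" portion.
    static boolean isGoodPivot(int firstPIdx, int secondPIdx, int start, int end) {
        int n = end - start + 1;
        return isBalanced(firstPIdx - start + 1, end - secondPIdx + 1, n);
    }

    public static void main(String[] args) {
        System.out.println(isGoodPivot(-1, 20, 0, 19)); // all duplicates: prints true
        System.out.println(isGoodPivot(1, 2, 0, 19));   // sizes 2 and 18: prints true
        System.out.println(isGoodPivot(0, 1, 0, 19));   // sizes 1 and 19: prints false
    }
}
```

Under this condition, a partition with exactly one empty side is always rejected when n >= 10, so a caller that retries on rejection needs to account for inputs where every pivot produces such a partition (for example, an array containing only two distinct values).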

test/algorithms/quickSort/threeWayPartitioning/QuickSortTest.java

Lines changed: 2 additions & 1 deletion
@@ -6,7 +6,7 @@

 import java.util.Arrays;

-import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.*;

 public class QuickSortTest {

@@ -52,4 +52,5 @@ public void test_selectionSort_shouldReturnSortedArray() {
         assertArrayEquals(fifthResult, fifthArray);
         assertArrayEquals(sixthResult, sixthArray);
     }
+
 }
