|
| 1 | +<h1 align='center'>Find Median in a Stream</h1> |
| 2 | + |
| 3 | +## Problem Statement |
| 4 | + |
| 5 | +**Problem URL :** [Find Median In a Stream](https://www.geeksforgeeks.org/problems/find-median-in-a-stream-1587115620/1) |
| 6 | + |
| 7 | + |
| 8 | + |
| 9 | + |
| 10 | + |
| 11 | +## Problem Explanation |
| 12 | +The task is to find the **median** of a stream of integers. The median is the middle value in a sorted list of numbers. If the list has an odd number of elements, the median is the middle one. If the list has an even number of elements, the median is the average of the two middle numbers. |
| 13 | + |
| 14 | +For example: |
| 15 | +1. **Input**: [2, 3, 4] |
| 16 | + **Output**: 3 |
| 17 | + The sorted stream is [2, 3, 4], and 3 is the median. |
| 18 | + |
| 19 | +2. **Input**: [2, 3, 4, 5] |
| 20 | + **Output**: (3 + 4) / 2 = 3.5 |
| 21 | + The sorted stream is [2, 3, 4, 5], and the median is the average of the two middle numbers. |
| 22 | + |
| 23 | +The problem asks for an **efficient solution** that reports the median each time a new number is added to the stream. Re-sorting all the numbers after every insertion costs \(O(n \log n)\) per insertion, which is too slow for large inputs. |
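
For comparison, the naive baseline mentioned above can be sketched as follows. This is only an illustration of the slow approach, not part of the solution; `naiveRunningMedians` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Naive baseline: re-sort everything seen so far after each insertion and
// read the median from the middle. Each step costs O(n log n).
std::vector<double> naiveRunningMedians(const std::vector<int>& stream) {
    std::vector<int> seen;        // all values received so far
    std::vector<double> medians;  // median after each insertion
    for (int x : stream) {
        seen.push_back(x);
        std::sort(seen.begin(), seen.end());
        const std::size_t n = seen.size();
        if (n % 2 == 1) {
            medians.push_back(seen[n / 2]);  // odd count: single middle value
        } else {
            // even count: average of the two middle values
            medians.push_back((seen[n / 2 - 1] + static_cast<double>(seen[n / 2])) / 2.0);
        }
    }
    return medians;
}
```

For the stream `[2, 3, 4, 5]` this produces the medians 2, 2.5, 3 and 3.5, which is a convenient reference when checking a faster implementation.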
| 24 | + |
| 25 | + |
| 26 | + |
| 27 | +#### **Approach** |
| 28 | +We use the **two-heap approach** to maintain the stream of numbers efficiently: |
| 29 | +1. **Divide the Stream**: |
| 30 | + - Use a **max-heap** to store the smaller half of the numbers. |
| 31 | + - Use a **min-heap** to store the larger half of the numbers. |
| 32 | + |
| 33 | +2. **Maintain Balance**: |
| 34 | + - Ensure the heaps are balanced such that their sizes differ by at most 1. |
| 35 | + - If the max-heap size exceeds the min-heap by more than 1, move the largest element of the max-heap to the min-heap. |
| 36 | + - If the min-heap size exceeds the max-heap, move the smallest element of the min-heap to the max-heap. |
| 37 | + |
| 38 | +3. **Calculate Median**: |
| 39 | + - If both heaps are of the same size, the median is the average of their tops. |
| 40 | + - Otherwise, the median is the top of the max-heap (since the max-heap will always contain one extra element if the sizes differ). |
| 41 | + |
| 42 | +## Problem Solution |
| 43 | +```cpp |
| 44 | +class Solution |
| 45 | +{ |
| 46 | + public: |
| 47 | + priority_queue<int> maxHeap; |
| 48 | + priority_queue<int, vector<int>, greater<int>> minHeap; |
| 49 | + |
| 50 | + void insertHeap(int &x) |
| 51 | + { |
| 52 | + if(maxHeap.size() == 0 || x < maxHeap.top()) maxHeap.push(x); |
| 53 | + else minHeap.push(x); |
| 54 | + |
| 55 | + balanceHeaps(); |
| 56 | + } |
| 57 | + |
| 58 | + void balanceHeaps(){ |
| 59 | + if(maxHeap.size() > minHeap.size() + 1){ |
| 60 | + minHeap.push(maxHeap.top()); |
| 61 | + maxHeap.pop(); |
| 62 | + }else if(minHeap.size() > maxHeap.size()){ |
| 63 | + maxHeap.push(minHeap.top()); |
| 64 | + minHeap.pop(); |
| 65 | + } |
| 66 | + } |
| 67 | + |
| 68 | + double getMedian(){ |
| 69 | + if(maxHeap.size() == minHeap.size()) return (minHeap.top() + maxHeap.top()) / 2.0; |
| 70 | + else return maxHeap.top(); |
| 71 | + } |
| 72 | +}; |
| 73 | + |
| 74 | +``` |
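
Outside the judge, the class needs standard headers and a caller. The sketch below repeats the class with those headers added and pairs it with a small driver; `runningMedians` is a hypothetical helper, and `using namespace std;` is assumed because the solution above relies on unqualified names.

```cpp
#include <functional>
#include <queue>
#include <vector>
using namespace std;

class Solution {
public:
    priority_queue<int> maxHeap;                             // smaller half
    priority_queue<int, vector<int>, greater<int>> minHeap;  // larger half

    void insertHeap(int &x) {
        if (maxHeap.empty() || x < maxHeap.top()) maxHeap.push(x);
        else minHeap.push(x);
        balanceHeaps();
    }

    void balanceHeaps() {
        if (maxHeap.size() > minHeap.size() + 1) {
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        } else if (minHeap.size() > maxHeap.size()) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
    }

    double getMedian() {
        if (maxHeap.size() == minHeap.size()) return (minHeap.top() + maxHeap.top()) / 2.0;
        return maxHeap.top();
    }
};

// Hypothetical driver: feeds the stream one value at a time and records
// the median reported after each insertion.
vector<double> runningMedians(vector<int> stream) {
    Solution s;
    vector<double> medians;
    for (int &x : stream) {  // insertHeap takes a non-const reference
        s.insertHeap(x);
        medians.push_back(s.getMedian());
    }
    return medians;
}
```

Calling `runningMedians({5, 15, 1, 3})` returns the medians 5, 10, 5 and 4.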
| 75 | + |
| 76 | +## Problem Solution Explanation |
| 77 | +Let’s explain the code line by line with examples. |
| 78 | + |
| 79 | +```cpp |
| 80 | +class Solution |
| 81 | +{ |
| 82 | +public: |
| 83 | + // Priority queues to represent maxHeap and minHeap |
| 84 | + priority_queue<int> maxHeap; |
| 85 | + priority_queue<int, vector<int>, greater<int>> minHeap; |
| 86 | +``` |
| 87 | +
|
| 88 | +- **Explanation**: |
| 89 | + - We declare two heaps: |
| 90 | + - `maxHeap`: A max-heap to store the smaller half of the numbers. The largest element in this half will be at the top. |
| 91 | + - `minHeap`: A min-heap to store the larger half of the numbers. The smallest element in this half will be at the top. |
| 92 | +
|
| 93 | +- **Example**: |
| 94 | + After adding `[2, 3, 4]`: |
| 95 | + - `maxHeap`: [3, 2] (top = 3). |
| 96 | + - `minHeap`: [4] (top = 4). |
| 97 | +
|
| 98 | +
|
| 99 | +
|
| 100 | +```cpp |
| 101 | +void insertHeap(int &x) |
| 102 | +{ |
| 103 | + if(maxHeap.size() == 0 || x < maxHeap.top()) |
| 104 | + maxHeap.push(x); // Add the number to maxHeap if the heap is empty or x is strictly smaller than its top |
| 105 | + else |
| 106 | + minHeap.push(x); // Otherwise, add it to minHeap |
| 107 | +``` |
| 108 | + |
| 109 | +- **Explanation**: |
| 110 | + - If `maxHeap` is empty, the first number goes into it. |
| 111 | + - Otherwise, compare `x` with the top of `maxHeap`: |
| 112 | + - If `x` is strictly smaller than the top of `maxHeap`, it belongs to the smaller half, so add it to `maxHeap`. |
| 113 | + - Otherwise (including when `x` equals the top), add it to `minHeap`. |
| 114 | + |
| 115 | +- **Example**: |
| 116 | + Adding `3` to empty heaps: |
| 117 | + - `maxHeap = [3]`, `minHeap = []`. |
| 118 | + |
| 119 | + Adding `5` when `maxHeap = [3]`: |
| 120 | + - `maxHeap.top() = 3`. Since `5 > 3`, add `5` to `minHeap`. |
| 121 | + - Result: `maxHeap = [3]`, `minHeap = [5]`. |
| 122 | + |
| 123 | + |
| 124 | + |
| 125 | +```cpp |
| 126 | +balanceHeaps(); |
| 127 | +``` |
| 128 | + |
| 129 | +- **Explanation**: |
| 130 | + - After inserting a number, the heaps may become unbalanced. |
| 131 | + - Call the `balanceHeaps` function to ensure the difference in sizes between the two heaps is at most 1. |
| 132 | + |
| 133 | + |
| 134 | + |
| 135 | +```cpp |
| 136 | +void balanceHeaps() |
| 137 | +{ |
| 138 | + if(maxHeap.size() > minHeap.size() + 1) |
| 139 | + { |
| 140 | + minHeap.push(maxHeap.top()); |
| 141 | + maxHeap.pop(); |
| 142 | + } |
| 143 | + else if(minHeap.size() > maxHeap.size()) |
| 144 | + { |
| 145 | + maxHeap.push(minHeap.top()); |
| 146 | + minHeap.pop(); |
| 147 | + } |
| 148 | +} |
| 149 | +``` |
| 150 | + |
| 151 | +- **Explanation**: |
| 152 | + - If `maxHeap` has more than one extra element, move the top of `maxHeap` to `minHeap`. |
| 153 | + - If `minHeap` has more elements, move the top of `minHeap` to `maxHeap`. |
| 154 | + |
| 155 | +- **Example**: |
| 156 | + Adding `7` when `maxHeap = [3]`, `minHeap = [5]`: |
| 157 | + - Before balancing: `maxHeap = [3]`, `minHeap = [5, 7]`. |
| 158 | + - `minHeap` is now larger, so balance by moving `5` to `maxHeap`. |
| 159 | + - After balancing: `maxHeap = [5, 3]`, `minHeap = [7]`. |
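
While debugging, it can help to check the two conditions that balancing is meant to preserve. The sketch below is a possible checker, not part of the original solution; `heapsAreBalanced` and the two type aliases are hypothetical names.

```cpp
#include <functional>
#include <queue>
#include <vector>

using MaxHeap = std::priority_queue<int>;
using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Hypothetical debugging helper: returns true when the two heaps satisfy
// the conditions that getMedian relies on.
bool heapsAreBalanced(const MaxHeap& maxHeap, const MinHeap& minHeap) {
    // Size rule: maxHeap holds either the same number of elements as
    // minHeap or exactly one more.
    const bool sizeOk = maxHeap.size() == minHeap.size() ||
                        maxHeap.size() == minHeap.size() + 1;
    // Order rule: nothing in the smaller half exceeds anything in the
    // larger half; comparing the two tops is enough.
    const bool orderOk = maxHeap.empty() || minHeap.empty() ||
                         maxHeap.top() <= minHeap.top();
    return sizeOk && orderOk;
}
```

With `maxHeap = [3]` and `minHeap = [5, 7]` the check fails on the size rule; after moving `5` across it passes.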
| 160 | + |
| 161 | + |
| 162 | + |
| 163 | +```cpp |
| 164 | +double getMedian() |
| 165 | +{ |
| 166 | + if(maxHeap.size() == minHeap.size()) |
| 167 | + return (minHeap.top() + maxHeap.top()) / 2.0; |
| 168 | + else |
| 169 | + return maxHeap.top(); |
| 170 | +} |
| 171 | +``` |
| 172 | + |
| 173 | +- **Explanation**: |
| 174 | + - If both heaps are of equal size, the median is the average of their tops. |
| 175 | + - If they are unequal, the median is the top of `maxHeap` because it contains the middle element when the total number of elements is odd. |
| 176 | + |
| 177 | +- **Example**: |
| 178 | + Heaps after adding `[2, 3, 4, 5]`: |
| 179 | + - `maxHeap = [3, 2]`, `minHeap = [4, 5]`. |
| 180 | + - Median = `(3 + 4) / 2 = 3.5`. |
| 181 | + |
| 182 | +### **Step-by-Step Example** |
| 183 | + |
| 184 | +Let’s walk through an example input: `[5, 15, 1, 3]`. |
| 185 | + |
| 186 | +1. **Add `5`**: |
| 187 | + - `maxHeap = [5]`, `minHeap = []`. |
| 188 | + - Median = `5`. |
| 189 | + |
| 190 | +2. **Add `15`**: |
| 191 | + - Add `15` to `minHeap`. |
| 192 | + - Heaps: `maxHeap = [5]`, `minHeap = [15]`. |
| 193 | + - Median = `(5 + 15) / 2 = 10`. |
| 194 | + |
| 195 | +3. **Add `1`**: |
| 196 | + - Add `1` to `maxHeap`. |
| 197 | + - No rebalancing is needed: `maxHeap` has exactly one more element than `minHeap`. |
| 198 | + - Heaps: `maxHeap = [5, 1]`, `minHeap = [15]`. |
| 199 | + - Median = `5`. |
| 200 | + |
| 201 | +4. **Add `3`**: |
| 202 | + - Add `3` to `maxHeap`, giving `maxHeap = [5, 3, 1]`. |
| 203 | + - Balance the heaps: Move `5` to `minHeap`. |
| 204 | + - Heaps: `maxHeap = [3, 1]`, `minHeap = [5, 15]`. |
| 205 | + - Median = `(3 + 5) / 2 = 4`. |
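
To reproduce heap listings like the ones in this walkthrough, the contents of a `priority_queue` can be read out by popping a copy. This is a minimal sketch; `snapshot` is a hypothetical helper.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: copies a heap and pops the copy until it is empty,
// so the contents come out in priority order and the original is untouched.
template <typename Heap>
std::vector<int> snapshot(Heap heap) {  // taken by value on purpose
    std::vector<int> out;
    while (!heap.empty()) {
        out.push_back(heap.top());
        heap.pop();
    }
    return out;
}
```

A max-heap is listed from largest to smallest and a min-heap from smallest to largest, matching the order used in the steps above.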
| 206 | + |
| 207 | + |
| 208 | + |
| 209 | +### **Time and Space Complexity** |
| 210 | + |
| 211 | +#### **Time Complexity** |
| 212 | +1. **Insertion**: |
| 213 | + - Pushing into a heap takes \(O(\log n)\). |
| 214 | + - Balancing the heaps involves \(O(\log n)\). |
| 215 | + - Total per insertion: \(O(\log n)\). |
| 216 | + |
| 217 | +2. **Median Calculation**: |
| 218 | + - Fetching the top of the heaps takes \(O(1)\). |
| 219 | + - Total: \(O(1)\). |
| 220 | + |
| 221 | +For \(n\) numbers: |
| 222 | +- **Total Time**: \(O(n \log n)\). |
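
The total can be derived by summing the per-insertion cost, since the \(k\)-th insertion works on heaps holding at most \(k\) elements:

\[
\sum_{k=1}^{n} O(\log k) = O(\log n!) = O(n \log n)
\]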
| 223 | + |
| 224 | + |
| 225 | + |
| 226 | +#### **Space Complexity** |
| 227 | +- Two heaps store \(O(n)\) elements combined. |
| 228 | +- **Total Space**: \(O(n)\). |
| 229 | + |
| 230 | + |
| 231 | + |
| 232 | +### **Additional Recommendations** |
| 233 | +1. **Understand the Two-Heap Approach**: Visualize the heaps to see how the numbers are distributed and balanced. The same max-heap and min-heap pairing applies to other median problems. |
| 234 | +2. **Edge Cases**: Test inputs like: |
| 235 | + - A single number. |
| 236 | + - A stream of equal numbers, or one with only increasing or decreasing numbers. |
| 237 | +3. **Practice Similar Problems**: Try variations like "Sliding Window Median" to reinforce your understanding of heap-based solutions. |
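
One way to exercise these edge cases is to compare each reported median against a slow reference computed by sorting. The sketch below assumes only that the class under test exposes `insertHeap(int&)` and `getMedian()`, as the solution above does; `referenceMedian` and `matchesReference` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Reference answer by full sort; slow, but easy to trust.
double referenceMedian(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    if (n % 2 == 1) return values[n / 2];
    return (values[n / 2 - 1] + static_cast<double>(values[n / 2])) / 2.0;
}

// Hypothetical checker: Tracker is any default-constructible class with
// insertHeap(int&) and getMedian(). Returns false at the first mismatch.
template <typename Tracker>
bool matchesReference(std::vector<int> stream) {
    Tracker tracker;
    std::vector<int> seen;
    for (int& x : stream) {
        tracker.insertHeap(x);
        seen.push_back(x);
        // Exact comparison is acceptable here: both sides are an integer
        // or an integer plus one half.
        if (tracker.getMedian() != referenceMedian(seen)) return false;
    }
    return true;
}
```

With the solution class in scope, calls such as `matchesReference<Solution>({7})`, `matchesReference<Solution>({4, 4, 4, 4})`, `matchesReference<Solution>({1, 2, 3, 4, 5})` and `matchesReference<Solution>({5, 4, 3, 2, 1})` cover the cases listed above.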
| 238 | + |
| 239 | +This line-by-line explanation should make the logic and flow of the code clear. Happy learning! 😊 |
| 240 | + |
| 241 | + |
| 242 | + |