Commit 672030a

Create README.md
1 parent 1902ee7 commit 672030a

File tree

1 file changed

+250
-0
lines changed
  • 19 - Heap Data Structure Problems/16 - Find Median in a Stream

@@ -0,0 +1,250 @@
<h1 align='center'>Find - Median - In a - Stream</h1>

## Problem Statement

**Problem URL:** [Find Median In a Stream](https://www.geeksforgeeks.org/problems/find-median-in-a-stream-1587115620/1)

![image](https://github.com/user-attachments/assets/9072d191-dae3-4990-b8e5-ae7b81c01c1b)
![image](https://github.com/user-attachments/assets/fd90f719-56a0-460c-a76b-ece0c3aafad0)
![image](https://github.com/user-attachments/assets/62686092-9d2f-49d9-82b3-3e107412a4b3)

## Problem Explanation
The task is to find the **median** of a stream of integers as each new number arrives. The median is the middle value of the numbers in sorted order. If the count is odd, the median is the middle element. If the count is even, the median is the average of the two middle elements.

For example:
1. **Input**: [2, 3, 4]
   **Output**: 3
   The sorted stream is [2, 3, 4], and 3 is the median.

2. **Input**: [2, 3, 4, 5]
   **Output**: (3 + 4) / 2 = 3.5
   The sorted stream is [2, 3, 4, 5], and the median is the average of the two middle numbers.

The problem asks for an **efficient solution** that reports the median as new numbers are added to the stream. A naive approach that re-sorts all numbers after every insertion costs \(O(k \log k)\) for the \(k\)-th number, which is too slow for large inputs.

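
For contrast, the naive approach mentioned above could be sketched as follows. This is only an illustrative baseline and not part of the submitted solution; the function name `naiveMedians` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical baseline: re-sort everything seen so far after each insertion
// and read the median from the middle of the sorted prefix.
std::vector<double> naiveMedians(const std::vector<int>& stream) {
    std::vector<int> seen;
    std::vector<double> medians;
    for (int x : stream) {
        seen.push_back(x);
        std::sort(seen.begin(), seen.end());  // O(k log k) on every step
        std::size_t k = seen.size();
        if (k % 2 == 1) {
            medians.push_back(seen[k / 2]);   // odd count: middle element
        } else {
            // even count: average of the two middle elements
            medians.push_back((seen[k / 2 - 1] + seen[k / 2]) / 2.0);
        }
    }
    return medians;
}
```

Because the sort is repeated for every element, the total work grows much faster than with the two-heap approach, which is why it only serves as a reference here.
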
#### **Approach**
We use the **two-heap approach** to maintain the stream of numbers efficiently:
1. **Divide the Stream**:
   - Use a **max-heap** to store the smaller half of the numbers.
   - Use a **min-heap** to store the larger half of the numbers.

2. **Maintain Balance**:
   - Keep the heaps balanced so that the max-heap holds either the same number of elements as the min-heap or exactly one more.
   - If the max-heap size exceeds the min-heap size by more than 1, move the largest element of the max-heap to the min-heap.
   - If the min-heap size exceeds the max-heap size, move the smallest element of the min-heap to the max-heap.

3. **Calculate Median**:
   - If both heaps are of the same size, the median is the average of their tops.
   - Otherwise, the median is the top of the max-heap (since the max-heap will always contain one extra element if the sizes differ).

## Problem Solution
```cpp
#include <functional> // greater
#include <queue>      // priority_queue
#include <vector>
using namespace std;

class Solution
{
    public:
    priority_queue<int> maxHeap;                            // smaller half, largest on top
    priority_queue<int, vector<int>, greater<int>> minHeap; // larger half, smallest on top

    // Insert x into the matching half, then restore the size invariant.
    void insertHeap(int &x)
    {
        if(maxHeap.size() == 0 || x < maxHeap.top()) maxHeap.push(x);
        else minHeap.push(x);

        balanceHeaps();
    }

    // Keep maxHeap.size() equal to minHeap.size() or exactly one larger.
    void balanceHeaps(){
        if(maxHeap.size() > minHeap.size() + 1){
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        }else if(minHeap.size() > maxHeap.size()){
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
    }

    // Assumes at least one number has been inserted.
    double getMedian(){
        if(maxHeap.size() == minHeap.size()) return (minHeap.top() + maxHeap.top()) / 2.0;
        else return maxHeap.top();
    }
};

```

## Problem Solution Explanation
Let’s explain the code piece by piece with examples.

```cpp
class Solution
{
    public:
    // Priority queues to represent maxHeap and minHeap
    priority_queue<int> maxHeap;
    priority_queue<int, vector<int>, greater<int>> minHeap;
```

- **Explanation**:
  - We declare two heaps:
    - `maxHeap`: A max-heap to store the smaller half of the numbers. The largest element in this half will be at the top.
    - `minHeap`: A min-heap to store the larger half of the numbers. The smallest element in this half will be at the top.

- **Example**:
  After adding `[2, 3, 4]` (the third insertion puts `4` in `minHeap`, and rebalancing then moves `3` to `maxHeap`):
  - `maxHeap`: [3, 2] (top = 3).
  - `minHeap`: [4] (top = 4).

```cpp
void insertHeap(int &x)
{
    if(maxHeap.size() == 0 || x < maxHeap.top())
        maxHeap.push(x); // Add the number to maxHeap if it's strictly smaller than the max of maxHeap
    else
        minHeap.push(x); // Otherwise (including ties), add it to minHeap
```

- **Explanation**:
  - If `maxHeap` is empty, the first number goes into it.
  - Otherwise, compare `x` with the top of `maxHeap`:
    - If `x` is strictly smaller than the top of `maxHeap`, it belongs to the smaller half, so add it to `maxHeap`.
    - Otherwise, add it to `minHeap`. A value equal to the top also goes to `minHeap`, which is still correct because ties may sit in either half.

- **Example**:
  Adding `3` to empty heaps:
  - `maxHeap = [3]`, `minHeap = []`.

  Adding `5` when `maxHeap = [3]`:
  - `maxHeap.top() = 3`. Since `5 > 3`, add `5` to `minHeap`.
  - Result: `maxHeap = [3]`, `minHeap = [5]`.

```cpp
    balanceHeaps();
```

- **Explanation**:
  - After inserting a number, the heaps may become unbalanced.
  - Call the `balanceHeaps` function to restore the size rule: `maxHeap` holds either the same number of elements as `minHeap` or exactly one more.

```cpp
void balanceHeaps()
{
    if(maxHeap.size() > minHeap.size() + 1)
    {
        minHeap.push(maxHeap.top());
        maxHeap.pop();
    }
    else if(minHeap.size() > maxHeap.size())
    {
        maxHeap.push(minHeap.top());
        minHeap.pop();
    }
}
```

- **Explanation**:
  - If `maxHeap` has more than one extra element, move the top of `maxHeap` to `minHeap`.
  - If `minHeap` has more elements than `maxHeap`, move the top of `minHeap` to `maxHeap`.

- **Example**:
  Adding `7` when `maxHeap = [3]` and `minHeap = [5]`:
  - Before balancing: `maxHeap = [3]`, `minHeap = [5, 7]`.
  - `minHeap` is now larger, so balance by moving `5` to `maxHeap`.
  - After balancing: `maxHeap = [5, 3]`, `minHeap = [7]`.

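
The conditions that the heaps should satisfy after every insertion can be written down as a small check, which may help when debugging. This is a sketch only: the helper `heapsAreBalanced` and the aliases `MaxHeap` and `MinHeap` are hypothetical names and do not appear in the solution.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical aliases matching the two heap types used in the solution.
using MaxHeap = std::priority_queue<int>;
using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Hypothetical helper: returns true when the heaps are in the state the
// solution expects after insertHeap() has finished.
bool heapsAreBalanced(const MaxHeap& maxHeap, const MinHeap& minHeap) {
    // Size rule: maxHeap holds the same number of elements as minHeap
    // or exactly one more.
    bool sizesOk = maxHeap.size() == minHeap.size() ||
                   maxHeap.size() == minHeap.size() + 1;
    // Order rule: the largest value of the smaller half must not exceed the
    // smallest value of the larger half, so comparing the tops is enough.
    bool orderOk = maxHeap.empty() || minHeap.empty() ||
                   maxHeap.top() <= minHeap.top();
    return sizesOk && orderOk;
}
```

Calling such a check after each insertion in a test build is one way to catch a mistake in the balancing logic early.
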
```cpp
double getMedian()
{
    if(maxHeap.size() == minHeap.size())
        return (minHeap.top() + maxHeap.top()) / 2.0;
    else
        return maxHeap.top();
}
```

- **Explanation**:
  - If both heaps are of equal size, the median is the average of their tops.
  - If they are unequal, the median is the top of `maxHeap` because it contains the middle element when the total number of elements is odd.
  - `getMedian` must only be called after at least one insertion: with both heaps empty, the sizes are equal and calling `top()` on an empty `priority_queue` is undefined behavior.

- **Example**:
  Heaps after adding `[2, 3, 4, 5]`:
  - `maxHeap = [3, 2]`, `minHeap = [4, 5]`.
  - Median = `(3 + 4) / 2 = 3.5`.

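
One detail worth noting about the averaging step: `minHeap.top() + maxHeap.top()` is added as `int` before the division, so the sum could overflow if the stream contains values near the limits of `int`. Whether that can happen depends on the problem's constraints, which are not restated here. A minimal sketch of an overflow-safe average, using a hypothetical helper `averageOfTops`:

```cpp
// Hypothetical helper: average two ints without forming their int sum.
double averageOfTops(int a, int b) {
    // Each operand is converted to double first, so the addition happens in
    // floating point, where the sum of two int values is always representable.
    return (static_cast<double>(a) + static_cast<double>(b)) / 2.0;
}
```
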
### **Step-by-Step Example**

Let’s walk through an example input: `[5, 15, 1, 3]`.

1. **Add `5`**:
   - `maxHeap = [5]`, `minHeap = []`.
   - Median = `5`.

2. **Add `15`**:
   - Add `15` to `minHeap`.
   - Heaps: `maxHeap = [5]`, `minHeap = [15]`.
   - Median = `(5 + 15) / 2 = 10`.

3. **Add `1`**:
   - Add `1` to `maxHeap`, since `1 < 5`.
   - No balancing is needed: `maxHeap` has exactly one more element than `minHeap`.
   - Heaps: `maxHeap = [5, 1]`, `minHeap = [15]`.
   - Median = `5`.

4. **Add `3`**:
   - Add `3` to `maxHeap`, since `3 < 5`.
   - Balance the heaps: `maxHeap` now has two more elements than `minHeap`, so move `5` to `minHeap`.
   - Heaps: `maxHeap = [3, 1]`, `minHeap = [5, 15]`.
   - Median = `(3 + 5) / 2 = 4`.

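
The walkthrough can be reproduced in code by recording the heap sizes and the median after each insertion. The sketch below inlines the same insert and balance logic as the solution; the `Snapshot` struct and the `traceStream` function are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical record of the state after one insertion.
struct Snapshot {
    std::size_t maxHeapSize;
    std::size_t minHeapSize;
    double median;
};

// Hypothetical tracer: runs the two-heap logic over `stream` and returns one
// Snapshot per inserted number.
std::vector<Snapshot> traceStream(const std::vector<int>& stream) {
    std::priority_queue<int> maxHeap;  // smaller half
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;  // larger half
    std::vector<Snapshot> trace;
    for (int x : stream) {
        // Insert into the matching half.
        if (maxHeap.empty() || x < maxHeap.top()) maxHeap.push(x);
        else minHeap.push(x);

        // Rebalance so maxHeap has the same size as minHeap or one more.
        if (maxHeap.size() > minHeap.size() + 1) {
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        } else if (minHeap.size() > maxHeap.size()) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }

        // Read the median from the tops.
        double median = (maxHeap.size() == minHeap.size())
                            ? (maxHeap.top() + minHeap.top()) / 2.0
                            : maxHeap.top();
        trace.push_back({maxHeap.size(), minHeap.size(), median});
    }
    return trace;
}
```

For the input `[5, 15, 1, 3]`, the recorded medians are `5`, `10`, `5`, and `4`, and the heap sizes after the last two steps are 2 and 1, then 2 and 2.
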
### **Time and Space Complexity**

#### **Time Complexity**
1. **Insertion**:
   - Pushing into a heap takes \(O(\log n)\).
   - Balancing the heaps moves at most one element (one pop and one push), which takes \(O(\log n)\).
   - Total per insertion: \(O(\log n)\).

2. **Median Calculation**:
   - Fetching the top of the heaps takes \(O(1)\).
   - Total: \(O(1)\).

For \(n\) numbers:
- **Total Time**: \(O(n \log n)\).

#### **Space Complexity**
- Two heaps store \(O(n)\) elements combined.
- **Total Space**: \(O(n)\).

### **Additional Recommendations**
1. **Understand the Two-Heap Approach**: Visualize the heaps to understand how the numbers are distributed and balanced.
2. **Edge Cases**: Test inputs like:
   - A single number.
   - A stream of equal numbers.
   - A stream with only increasing or decreasing numbers.
3. **Practice Similar Problems**: Try variations like "Sliding Window Median" to reinforce your knowledge.
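
The edge cases above can be checked mechanically by comparing the two-heap medians with medians read from a sorted copy of the stream. The following is a minimal sketch under that idea; `matchesSortedReference` is a hypothetical helper, and it inlines the same insert and balance logic as the solution so that it compiles on its own.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical check: returns true if, after every insertion, the two-heap
// median equals the median of a sorted copy of the numbers seen so far.
bool matchesSortedReference(const std::vector<int>& stream) {
    std::priority_queue<int> maxHeap;  // smaller half
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;  // larger half
    std::vector<int> sorted;           // reference: kept sorted at all times
    for (int x : stream) {
        // Two-heap insertion and balancing, as in the solution.
        if (maxHeap.empty() || x < maxHeap.top()) maxHeap.push(x);
        else minHeap.push(x);
        if (maxHeap.size() > minHeap.size() + 1) {
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        } else if (minHeap.size() > maxHeap.size()) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
        double fast = (maxHeap.size() == minHeap.size())
                          ? (maxHeap.top() + minHeap.top()) / 2.0
                          : maxHeap.top();

        // Reference median from the sorted prefix.
        sorted.insert(std::upper_bound(sorted.begin(), sorted.end(), x), x);
        std::size_t k = sorted.size();
        double slow = (k % 2 == 1)
                          ? sorted[k / 2]
                          : (sorted[k / 2 - 1] + sorted[k / 2]) / 2.0;

        if (fast != slow) return false;
    }
    return true;
}
```

Feeding it a single number, a run of equal numbers, and strictly increasing or decreasing streams covers the cases listed above.
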

This line-by-line explanation should make the logic and flow of the code clear. Happy learning! 😊
