Commit 19f783a

authored
Create README.md
1 parent 6ddeeb8 commit 19f783a

File tree

1 file changed

+227
-0
lines changed
  • 19 - Heap Data Structure Problems/17 - Find Median From Data Stream

@@ -0,0 +1,227 @@
<h1 align='center'>Find - Median - From - Data - Stream</h1>

## Problem Statement

**Problem URL :** [Find Median From Data Stream](https://leetcode.com/problems/find-median-from-data-stream/)

![image](https://github.com/user-attachments/assets/82528e9c-86b6-4595-86a2-bab708022e8d)
![image](https://github.com/user-attachments/assets/1efa30c8-37a1-4e2d-a641-2f57ad275661)

## Problem Explanation
The task is to report the median of a stream of integers after each insertion.

- **Median Definition**: The median of a sorted dataset is:
  - The middle value if the dataset has an odd number of elements.
  - The average of the two middle values if the dataset has an even number of elements.
- **Constraints**:
  - Re-sorting all the numbers after every insertion is too slow for large data streams: each sort costs \(O(n \log n)\), so the median has to be maintained incrementally.

#### **Example**
1. Numbers: `[5]`
   Median: `5` (only one number).
2. Numbers: `[5, 10]`
   Median: `(5 + 10) / 2 = 7.5`.
3. Numbers: `[5, 10, 3]`
   Sorted: `[3, 5, 10]`
   Median: `5`.

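For contrast with the two-heap approach, the median definition can be implemented directly by keeping every number in a sorted vector. This is only a baseline sketch: the class name `NaiveMedianFinder` is hypothetical and is not part of the original solution. Each insertion may shift up to \(n\) elements, which is the cost the heaps avoid.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical baseline (not the README's solution): keep all numbers sorted.
class NaiveMedianFinder {
public:
    // Insert num at its sorted position: O(log n) search plus O(n) element shift.
    void addNum(int num) {
        auto pos = std::upper_bound(sorted_.begin(), sorted_.end(), num);
        sorted_.insert(pos, num);
    }

    // Median taken straight from the definition; requires at least one number.
    double findMedian() const {
        assert(!sorted_.empty());
        const std::size_t n = sorted_.size();
        if (n % 2 == 1) return sorted_[n / 2];
        // Convert before adding so the sum is computed in double.
        return (static_cast<double>(sorted_[n / 2 - 1]) + sorted_[n / 2]) / 2.0;
    }

private:
    std::vector<int> sorted_;
};
```

Feeding it `5`, `10`, `3` in that order reproduces the three medians from the example.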
#### **Approach**
We use two heaps:
1. **Max-Heap** (`maxHeap`): Stores the smaller half of the numbers. The maximum number of this half is the top of the heap.
2. **Min-Heap** (`minHeap`): Stores the larger half of the numbers. The minimum number of this half is the top of the heap.

**Steps**:
1. Insert the number into one of the heaps:
   - If `maxHeap` is empty or the number is smaller than or equal to the top of `maxHeap`, add it to `maxHeap`.
   - Otherwise, add it to `minHeap`.
2. Balance the heaps:
   - Ensure that `maxHeap` holds either the same number of elements as `minHeap` or exactly one more.
   - If `maxHeap` has more than one extra element, move the top of `maxHeap` to `minHeap`.
   - If `minHeap` has more elements than `maxHeap`, move the top of `minHeap` to `maxHeap`.
3. Find the median:
   - If the heaps are of equal size, the median is the average of the tops of both heaps.
   - Otherwise `maxHeap` holds the extra element, and the median is its top.

## Problem Solution
```cpp
#include <functional>
#include <queue>
#include <vector>
using namespace std;

class MedianFinder {
public:
    priority_queue<int> maxHeap;                            // smaller half, largest on top
    priority_queue<int, vector<int>, greater<int>> minHeap; // larger half, smallest on top

    MedianFinder() {

    }

    void addNum(int num) {
        // Place num in the half it belongs to.
        if(maxHeap.size() == 0 || num <= maxHeap.top()) maxHeap.push(num);
        else minHeap.push(num);

        // Rebalance so that maxHeap holds the same count as minHeap or one more.
        if(maxHeap.size() > minHeap.size() + 1){
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        }else if (minHeap.size() > maxHeap.size()){
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
    }

    // Assumes at least one number has been added.
    double findMedian() {
        if(minHeap.size() == maxHeap.size()) return (maxHeap.top() + minHeap.top()) / 2.0;
        else return maxHeap.top();
    }
};

/**
 * Your MedianFinder object will be instantiated and called as such:
 * MedianFinder* obj = new MedianFinder();
 * obj->addNum(num);
 * double param_2 = obj->findMedian();
 */
```

## Problem Solution Explanation

```cpp
class MedianFinder {
public:
    priority_queue<int> maxHeap; // Max-heap for the smaller half
    priority_queue<int, vector<int>, greater<int>> minHeap; // Min-heap for the larger half
```

- **Explanation**:
  - `maxHeap` keeps the smaller numbers, and the largest of these numbers is the top.
  - `minHeap` keeps the larger numbers, and the smallest of these numbers is the top.

```cpp
    MedianFinder() { }
```

- **Explanation**:
  - Constructor to initialize the `MedianFinder` object.
  - No specific setup is required here: both heaps are default-constructed and start empty.

```cpp
    void addNum(int num) {
        if(maxHeap.size() == 0 || num <= maxHeap.top())
            maxHeap.push(num);
        else
            minHeap.push(num);
```

- **Explanation**:
  - If `maxHeap` is empty or `num` is less than or equal to the largest number in `maxHeap`, add it to `maxHeap`.
  - Otherwise, add it to `minHeap`.

- **Example**:
  Adding `10` when both heaps are empty:
  - `maxHeap = [10]`, `minHeap = []`.

  Adding `15`:
  - `maxHeap.top() = 10`. Since `15 > 10`, add it to `minHeap`.
  - Result: `maxHeap = [10]`, `minHeap = [15]`.

```cpp
        if(maxHeap.size() > minHeap.size() + 1){
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        }else if (minHeap.size() > maxHeap.size()){
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
    }
```

- **Explanation**:
  - If `maxHeap` has more than one extra element, move its top to `minHeap`.
  - If `minHeap` has more elements than `maxHeap`, move its top to `maxHeap`.
  - After this step, `maxHeap` holds either the same number of elements as `minHeap` or exactly one more.

- **Example**:
  Adding `5` to `maxHeap = [10]`, `minHeap = [15]`:
  - Since `5 <= 10`, it goes to `maxHeap`: `maxHeap = [10, 5]`, `minHeap = [15]`.
  - `maxHeap` has exactly one extra element, which is allowed, so nothing moves.

  Adding `3` next:
  - `maxHeap = [10, 5, 3]`, `minHeap = [15]` (unbalanced: two extra elements).
  - Move `10` to `minHeap`.
  - Result: `maxHeap = [5, 3]`, `minHeap = [10, 15]`.

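The balancing rule can be pulled out into a small helper to make the size invariant explicit. This is a sketch rather than part of the original class: the free functions `rebalance` and `isBalanced` and the aliases `MaxHeap` and `MinHeap` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

using MaxHeap = std::priority_queue<int>;
using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Hypothetical helper: true when maxHeap holds as many elements as minHeap, or one more.
bool isBalanced(const MaxHeap& maxHeap, const MinHeap& minHeap) {
    return maxHeap.size() == minHeap.size() ||
           maxHeap.size() == minHeap.size() + 1;
}

// Hypothetical helper: the same two branches as the balancing step in addNum.
// Assumes at most one element was inserted since the heaps were last balanced.
void rebalance(MaxHeap& maxHeap, MinHeap& minHeap) {
    if (maxHeap.size() > minHeap.size() + 1) {
        minHeap.push(maxHeap.top());
        maxHeap.pop();
    } else if (minHeap.size() > maxHeap.size()) {
        maxHeap.push(minHeap.top());
        minHeap.pop();
    }
    assert(isBalanced(maxHeap, minHeap));
}
```

With `maxHeap = [10, 5, 3]` and `minHeap = [15]`, one call moves `10` across and leaves both heaps with two elements. With `maxHeap = [10, 5]` and `minHeap = [15]`, the call changes nothing.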
```cpp
    double findMedian() {
        if(minHeap.size() == maxHeap.size())
            return (maxHeap.top() + minHeap.top()) / 2.0;
        else
            return maxHeap.top();
    }
};
```

- **Explanation**:
  - If the sizes of `maxHeap` and `minHeap` are equal, return the average of their tops.
  - Otherwise, return the top of `maxHeap`: the balancing step guarantees that it is the larger heap whenever the sizes differ.

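One detail worth noting: `maxHeap.top() + minHeap.top()` is added as `int` before the division by `2.0`, so the sum could overflow if the stream held values near the limits of `int`. Whether that can happen depends on the input range, which this sketch does not assume. The helper name `averageOfTops` is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: average two ints without overflowing the int sum.
// Converting one operand to double first makes the addition happen in double.
double averageOfTops(int lowerTop, int upperTop) {
    assert(lowerTop <= upperTop); // top of maxHeap never exceeds top of minHeap
    return (static_cast<double>(lowerTop) + upperTop) / 2.0;
}
```

In `findMedian`, the equal-size branch could return `averageOfTops(maxHeap.top(), minHeap.top())` instead of the inline expression.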
### **Step 3: Examples and Expected Outputs**

#### Example Input: `[5, 15, 1, 3]`
1. Add `5`:
   - `maxHeap = [5]`, `minHeap = []`.
   - Median: `5`.

2. Add `15`:
   - `maxHeap = [5]`, `minHeap = [15]`.
   - Median: `(5 + 15) / 2 = 10`.

3. Add `1`:
   - `maxHeap = [5, 1]`, `minHeap = [15]` (`maxHeap` has one extra element, so no move is needed).
   - Median: `5` (top of `maxHeap`).

4. Add `3`:
   - `maxHeap = [5, 3, 1]`, `minHeap = [15]` (unbalanced).
   - Move `5` to `minHeap`.
   - `maxHeap = [3, 1]`, `minHeap = [5, 15]`.
   - Median: `(3 + 5) / 2 = 4`.

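The medians in this trace can be reproduced with a short driver. The sketch below restates the two-heap logic inside a hypothetical free function, `runningMedians`, so that the block is self-contained; it returns the median after each insertion.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical driver: returns the median after each number in `stream` is added,
// using the same placement and balancing rules as MedianFinder.
std::vector<double> runningMedians(const std::vector<int>& stream) {
    std::priority_queue<int> maxHeap;                                       // smaller half
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;  // larger half
    std::vector<double> medians;
    medians.reserve(stream.size());

    for (int num : stream) {
        // Placement step.
        if (maxHeap.empty() || num <= maxHeap.top()) maxHeap.push(num);
        else minHeap.push(num);

        // Balancing step.
        if (maxHeap.size() > minHeap.size() + 1) {
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        } else if (minHeap.size() > maxHeap.size()) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }

        // Median query.
        if (maxHeap.size() == minHeap.size())
            medians.push_back((maxHeap.top() + minHeap.top()) / 2.0);
        else
            medians.push_back(maxHeap.top());
    }
    assert(medians.size() == stream.size()); // one median per inserted number
    return medians;
}
```

Calling `runningMedians({5, 15, 1, 3})` yields `5`, `10`, `5`, `4`, matching the four medians listed above.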
### **Step 4: Time and Space Complexity**

#### **Time Complexity**
1. **Insertion**:
   - Inserting into a heap takes \(O(\log n)\).
   - Balancing moves at most one element, which takes \(O(\log n)\).
   - Total per insertion: \(O(\log n)\).
2. **Finding Median**:
   - Fetching the top of the heaps takes \(O(1)\).
   - Total: \(O(1)\).

For \(n\) numbers:
- **Total Time Complexity**: \(O(n \log n)\) for the insertions, plus \(O(1)\) for each median query.

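The total for the insertions follows from summing the per-insertion cost, where the \(k\)-th insertion acts on heaps that together hold at most \(k\) elements:

\[
\sum_{k=1}^{n} O(\log k) \;=\; O\!\left(\log n!\right) \;=\; O(n \log n)
\]
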
#### **Space Complexity**
- Both heaps together store all elements: \(O(n)\).
- **Total Space Complexity**: \(O(n)\).

### **Step 5: Additional Recommendations**
1. **Practice Similar Problems**:
   - "Sliding Window Median."
   - "Kth Largest Element in a Stream."
2. **Edge Cases**:
   - Single number: `[5]`.
   - All numbers are the same: `[10, 10, 10]`.
   - Stream of negative numbers.

3. **Debugging Tip**:
   - Visualize the heaps after each insertion to understand their balance.

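The edge cases listed above can be checked mechanically. The following is a sketch that carries a compact copy of the two-heap logic so that it compiles on its own; the names `TwoHeapMedian`, `medianOf`, and `edgeCasesPass` are hypothetical, and the expected values follow from the median definition.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Compact copy of the two-heap logic so this block is self-contained.
struct TwoHeapMedian {
    std::priority_queue<int> maxHeap;                                       // smaller half
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;  // larger half

    void addNum(int num) {
        if (maxHeap.empty() || num <= maxHeap.top()) maxHeap.push(num);
        else minHeap.push(num);
        if (maxHeap.size() > minHeap.size() + 1) {
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        } else if (minHeap.size() > maxHeap.size()) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
    }

    double findMedian() const {
        assert(!maxHeap.empty()); // at least one number must have been added
        if (maxHeap.size() == minHeap.size())
            return (maxHeap.top() + minHeap.top()) / 2.0;
        return maxHeap.top();
    }
};

// Hypothetical helper: feed a whole stream and return the final median.
double medianOf(const std::vector<int>& stream) {
    TwoHeapMedian finder;
    for (int num : stream) finder.addNum(num);
    return finder.findMedian();
}

// Hypothetical check covering the three edge cases.
bool edgeCasesPass() {
    return medianOf({5}) == 5.0                 // single number
        && medianOf({10, 10, 10}) == 10.0       // all numbers are the same
        && medianOf({-3, -1, -2, -4}) == -2.5;  // stream of negative numbers
}
```

Duplicates are handled by the `<=` in the placement step, and negative values need no special treatment because the heaps only compare elements.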
This walkthrough aims to give beginners a clear understanding of the problem and its solution. 😊
