|
| 1 | +# Data Stream as Disjoint Intervals |
| 2 | + |
| 3 | +You are given a stream of non-negative integers a1, a2, ..., an. At any point, you need to summarize all numbers seen so
| 4 | +far as a list of disjoint intervals. |
| 5 | + |
| 6 | +Your task is to implement the SummaryRanges class, where:
| 7 | +
| 8 | +1. Constructor: Initializes the SummaryRanges object with an empty stream.
| 9 | +2. addNum(int value): Adds the integer value to the stream.
| 10 | +3. getIntervals(): Returns the current summary of numbers as a list of disjoint intervals [start_i, end_i], sorted by start_i.
| 11 | + |
| 12 | +> Note: Each number belongs to exactly one interval. Intervals must merge whenever new numbers connect or extend existing |
| 13 | +> ones, and duplicate insertions should not affect the summary. |
| 14 | +
| 15 | +## Constraints |
| 16 | + |
| 17 | +- 0 <= value <= 10^4 |
| 18 | +- At most 3*10^4 calls will be made to addNum and getIntervals. |
| 19 | +- At most 10^2 calls will be made to getIntervals. |
| 20 | + |
| 21 | +## Examples |
| 22 | + |
| 23 | + |
| 24 | + |
| 25 | + |
| 26 | + |
| 27 | + |
| 28 | +## Solution |
| 29 | + |
| 30 | +When numbers arrive one after another in a stream, it’s easy to imagine them as scattered pebbles landing on a number |
| 31 | +line. If we keep them as-is, the picture quickly becomes messy. Instead, we want to summarize what we’ve seen into clean |
| 32 | +stretches of consecutive values, i.e., intervals. The challenge is that every new number can behave differently: |
| 33 | +- It may fall inside an existing interval. |
| 34 | +- It may extend an interval by one. |
| 35 | +- It may even act like a missing puzzle piece that connects two intervals into one larger block. |
| 36 | + |
| 37 | +This constant merging and organizing is why the “intervals” pattern is the right fit. Rather than storing every number, |
| 38 | +we maintain only the boundaries of disjoint intervals and carefully update them when new values arrive. This way, our |
| 39 | +summary stays compact, sorted, and easy to return. |
| 40 | + |
| 41 | +We start by keeping a sorted collection of intervals instead of recording every number one by one. Each new value starts |
| 42 | +as a small single-point interval, and then we look at the ranges around it to decide how it fits. |
| 43 | + |
| 44 | +- If an existing range already covers the value, we simply ignore it. |
| 45 | +- If the value lies just beyond the end of a range, we extend that range to include it. |
| 46 | +- If the value lies just before the start of a range, we merge it with that range. |
| 47 | +- If the value sits exactly between two ranges, we merge them into one larger range. |
| 48 | + |
| 49 | + |
| 50 | + |
| 51 | + |
| 52 | + |
| 53 | + |
| 54 | +If none of the above cases apply, the number stays as its own single-point interval. The stored intervals are sorted
| 55 | +and disjoint at all times, so generating the summary is as simple as listing them in order.
| 56 | + |
| 57 | +The following steps can be performed to implement the algorithm above: |
| 58 | + |
| 59 | +1. We keep the data as a sorted map called intervals, where the key is the start of an interval and the
| 60 | +   value is the end of that interval. The map keeps intervals ordered by their start, and the update rules in addNum keep them disjoint.
| 61 | +2. Constructor: We initialize intervals as an empty map in the constructor.
| 62 | +3. addNum(int value): Adding a number to the stream.
| 63 | +   - We treat the new number value as a small interval by setting newStart = value and newEnd = value. This will be our
| 64 | +     candidate interval that may expand or merge.
| 65 | +   - Then, we find the two neighbors around this number:
| 66 | +     - nextInterval, which is the first interval whose start is greater than value.
| 67 | +     - prevInterval, which is the interval immediately before nextInterval, if one exists.
| 68 | +   - Check the previous interval:
| 69 | +     - If prevInterval->second (the end of the previous interval) is greater than or equal to value, then the number
| 70 | +       is already covered by that range. In this case, we simply return without any changes.
| 71 | +     - If prevInterval->second equals value - 1, then the new number touches the end of the previous interval.
| 72 | +       - In this case, we extend the candidate’s start (newStart) to prevInterval->first so that the candidate also
| 73 | +         includes the previous interval.
| 74 | +   - Check the next interval:
| 75 | +     - If nextInterval->first (the start of the next interval) equals value + 1, then the new number touches the start
| 76 | +       of the next interval.
| 77 | +       - We extend the candidate’s end (newEnd) to nextInterval->second and remove nextInterval from the map, since it
| 78 | +         will be merged.
| 79 | +   - If both the previous and next conditions apply, the candidate bridges the two neighbors into one larger interval.
| 80 | +   - Finally, we insert the merged interval into the map as intervals[newStart] = newEnd.
| 81 | +     - This overwrites the previous interval if it was extended.
| 82 | +     - It replaces the next interval if it was merged.
| 83 | +     - Or it creates a new single-point interval if no merges happened.
| 84 | +4. getIntervals(): Getting all intervals.
| 85 | +   - We create an empty result list.
| 86 | +   - Then we iterate through all the entries in intervals, where interval.first is the start and
| 87 | +     interval.second is the end of each interval.
| 88 | +   - For each entry, we push [interval.first, interval.second] into the result list.
| 89 | +   - Finally, we return the result list. Because intervals is always kept sorted and disjoint, the result
| 90 | +     requires no further processing.
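
The steps above can be sketched in C++ roughly as follows. This is a minimal illustration rather than the repository's actual implementation: it assumes `std::map<int, int>` as the sorted map, and the `const` qualifier and the `reserve` call in `getIntervals` are choices made here, not something the description prescribes.

```cpp
#include <iterator>
#include <map>
#include <vector>

// Sketch of the class described above (assumed layout).
// `intervals` maps each interval's start to its end.
class SummaryRanges {
public:
    SummaryRanges() = default;  // starts with an empty map

    void addNum(int value) {
        // Candidate single-point interval that may grow by merging.
        int newStart = value;
        int newEnd = value;

        // First interval whose start is strictly greater than value.
        auto nextInterval = intervals.upper_bound(value);

        // The interval just before it, if any, is the only one that can
        // already contain value or end exactly at value - 1.
        if (nextInterval != intervals.begin()) {
            auto prevInterval = std::prev(nextInterval);
            if (prevInterval->second >= value) {
                return;  // value is already covered; nothing changes
            }
            if (prevInterval->second == value - 1) {
                newStart = prevInterval->first;  // absorb the previous interval
            }
        }

        // Merge with the next interval if it starts at value + 1.
        if (nextInterval != intervals.end() && nextInterval->first == value + 1) {
            newEnd = nextInterval->second;
            intervals.erase(nextInterval);
        }

        // Overwrites the previous interval's end, or inserts a new entry.
        intervals[newStart] = newEnd;
    }

    std::vector<std::vector<int>> getIntervals() const {
        std::vector<std::vector<int>> result;
        result.reserve(intervals.size());
        for (const auto& interval : intervals) {
            result.push_back({interval.first, interval.second});
        }
        return result;
    }

private:
    std::map<int, int> intervals;  // start -> end, ordered by start
};
```

With this sketch, adding 1, 3, 7, 2, 6 in that order leaves the summary as [1, 3], [6, 7]: the value 2 bridges [1, 1] and [3, 3], and 6 extends [7, 7] to the left. Adding 2 again returns early and changes nothing.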
| 91 | + |
| 92 | +Let’s look at the following illustration to get a better understanding of the solution: |
| 93 | + |
| 94 | + |
| 95 | + |
| 96 | + |
| 97 | + |
| 98 | + |
| 99 | + |
| 100 | + |
| 101 | + |
| 102 | + |
| 103 | +### Time Complexity |
| 104 | + |
| 105 | +Let k be the current number of disjoint intervals stored in the intervals map. |
| 106 | + |
| 107 | +- addNum(int value): O(log k)
| 108 | +  - One upper-bound lookup O(log k), at most one predecessor check O(1), and up to one erase (of the next interval) plus one
| 109 | +    insert/update (each O(log k)).
| 110 | +
| 111 | +- getIntervals(): O(k)
| 112 | +  - We iterate over every stored interval once to build the output.
| 113 | +
| 114 | +- Worst case relation to n: If there are n addNum(int value) calls and nothing ever merges, then k=O(n), giving
| 115 | +  addNum(int value) O(log n) and getIntervals() O(n).
| 116 | + |
| 117 | +### Space Complexity |
| 118 | + |
| 119 | +As we store only interval boundaries (start → end) rather than every number seen, the space complexity is O(k). In the |
| 120 | +worst case with no merges, k=O(n), so space becomes O(n). |
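
To make the dependence on k concrete, the sketch below counts how many intervals remain after a sequence of insertions. The names `addValue` and `intervalCountAfter` are hypothetical helpers introduced only for this illustration; `addValue` is assumed to follow the same insertion rules described in the solution, written as a free function over a `std::map<int, int>`.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

// Hypothetical helper: inserts value into a start -> end map using the
// merge rules described in the solution section.
void addValue(std::map<int, int>& intervals, int value) {
    int newStart = value;
    int newEnd = value;
    auto nextIt = intervals.upper_bound(value);
    if (nextIt != intervals.begin()) {
        auto prevIt = std::prev(nextIt);
        if (prevIt->second >= value) return;                       // already covered
        if (prevIt->second == value - 1) newStart = prevIt->first;  // extend left neighbor
    }
    if (nextIt != intervals.end() && nextIt->first == value + 1) {
        newEnd = nextIt->second;                                    // absorb right neighbor
        intervals.erase(nextIt);
    }
    intervals[newStart] = newEnd;
}

// Hypothetical helper: returns k, the number of stored intervals,
// after inserting all values in order.
std::size_t intervalCountAfter(const std::vector<int>& values) {
    std::map<int, int> intervals;
    for (int v : values) {
        addValue(intervals, v);
    }
    return intervals.size();
}
```

Under these assumptions, inserting n values that never touch (for example 0, 2, 4, ...) keeps k equal to n, while inserting n consecutive values collapses everything into a single interval, so k stays at 1.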