
Commit 40b083a

refactor(algorithms, fast-and-slow): find duplicate number
1 parent f95eb2d commit 40b083a

29 files changed

+242
-154
lines changed

algorithms/fast_and_slow/find_duplicate/README.md

Lines changed: 175 additions & 0 deletions
@@ -37,3 +37,178 @@ Explanation 3:
3737
```
3838

3939
> Note: You cannot modify the given array nums. You have to solve the problem using only constant extra space.
40+
41+
## Solution
42+
43+
This solution involves two key steps: identifying the cycle and locating the entry point of this identified cycle, which
44+
represents the duplicate number in the array.
45+
46+
The fast and slow pointers technique detects such cycles efficiently: one pointer takes one step at a time while the
other advances two steps. Both pointers start at the beginning of the array, and each pointer’s next position is
determined by the value it currently points to. If a pointer points to the value 5 at index 0, its next position is
index 5. Because the array contains a cycle, the fast pointer eventually catches up with the slow pointer. In the
noncyclic part, the distance between the pointers increases by one index in each iteration. However, once both pointers
have entered the cyclic part, the fast pointer starts closing the gap on the slow pointer, reducing the distance by one
index per iteration until they meet.
53+
54+
Once the pointers have met, confirming the cycle, we reset one of them (usually the slow pointer) to index 0 while the
other stays at the position where the pointers met. Then, both pointers move one step at a time until they meet again.
With this positioning and pace, the pointers are guaranteed to meet at the cycle’s starting point, which corresponds to
the duplicate number.
58+
59+
Now, let’s look at the detailed workflow of the solution:
60+
61+
For this problem, the duplicate number creates a cycle in the nums array: two different indices hold the same value, so
both lead to the same next index. Finding where this cycle begins identifies the duplicate number.
63+
64+
To find the cycle, we’ll move through the nums array using the function `f(x) = nums[x]`, where x is an index of the
array. Repeatedly applying this function produces the following sequence:
66+
67+
x, nums[x], nums[nums[x]], nums[nums[nums[x]]], ...
68+
69+
In the sequence above, every new element is an element in nums present at the index of the previous element.
70+
71+
Let’s say we have the array [2, 3, 1, 3]. We start with `x = nums[0]`, which is 2, and then move to `nums[2]`, which is
1. Since we found 1 at index 2, we move to index 1, which holds 3, then to index 3, which also holds 3, and so on.
Because the array has length n+1 and every value lies in the range [1, n], each value is a valid index, so this
traversal never leaves the array and must eventually revisit an index.
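
As a minimal sketch of this traversal, the snippet below repeatedly applies `f(x) = nums[x]` starting from index 0. The helper name `traversal_prefix` and its `steps` parameter are illustrative assumptions, not part of the repository.

```python
from typing import List


def traversal_prefix(nums: List[int], steps: int) -> List[int]:
    """Return the indices visited by applying f(x) = nums[x] `steps` times,
    starting from index 0 (hypothetical helper for illustration only)."""
    x = 0
    visited = [x]
    for _ in range(steps):
        x = nums[x]  # follow the value at the current index to the next index
        visited.append(x)
    return visited


# For [2, 3, 1, 3] the walk is 0 -> 2 -> 1 -> 3 and then stays at 3,
# because nums[3] == 3 forms a cycle of length one.
print(traversal_prefix([2, 3, 1, 3], 5))  # [0, 2, 1, 3, 3, 3]
```

The walk gets stuck at index 3, which is exactly the duplicated value in this array.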
75+
76+
The following example illustrates this traversal:
77+
78+
![Solution 1](./images/solutions/find_duplicate_solution_1.png)
79+
![Solution 2](./images/solutions/find_duplicate_solution_2.png)
80+
![Solution 3](./images/solutions/find_duplicate_solution_3.png)
81+
![Solution 4](./images/solutions/find_duplicate_solution_4.png)
82+
![Solution 5](./images/solutions/find_duplicate_solution_5.png)
83+
![Solution 6](./images/solutions/find_duplicate_solution_6.png)
84+
![Solution 7](./images/solutions/find_duplicate_solution_7.png)
85+
![Solution 8](./images/solutions/find_duplicate_solution_8.png)
86+
![Solution 9](./images/solutions/find_duplicate_solution_9.png)
87+
![Solution 10](./images/solutions/find_duplicate_solution_10.png)
88+
![Solution 11](./images/solutions/find_duplicate_solution_11.png)
89+
90+
Now, let's dive deep into how our two parts of the solution work.
91+
92+
In the first part, the slow pointer moves one step at a time, while the fast pointer moves two steps at a time, until
the two pointers meet. Since the fast pointer moves twice as fast as the slow pointer, it is the first to enter and
move around the cycle. At some point after the slow pointer has also entered the cycle, the fast pointer meets the slow
pointer. This is the intersection point.
96+
97+
> Note: The intersection point of the two pointers is, in the general case, not the entry point of the cycle.
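
The first part can be sketched in isolation as follows. This is an illustrative snippet under the assumption that both pointers start at `nums[0]`, as in the accompanying Python change; the function name `find_intersection` is hypothetical.

```python
from typing import List


def find_intersection(nums: List[int]) -> int:
    """Part one only: return the index where the slow and fast pointers meet.

    Assumes nums has length n + 1 with values in [1, n], so a cycle exists.
    """
    slow = fast = nums[0]
    while True:
        slow = nums[slow]        # one step
        fast = nums[nums[fast]]  # two steps
        if slow == fast:
            return slow


# In [1, 2, 3, 2] the walk is 0 -> 1 -> 2 -> 3 -> 2, so the cycle is 2 <-> 3
# and its entry point is index 2. The pointers meet at index 3 instead,
# which shows that the intersection point need not be the entry point.
print(find_intersection([1, 2, 3, 2]))  # 3
```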
98+
99+
![Solution 12](./images/solutions/find_duplicate_solution_12.png)
100+
![Solution 13](./images/solutions/find_duplicate_solution_13.png)
101+
![Solution 14](./images/solutions/find_duplicate_solution_14.png)
102+
![Solution 15](./images/solutions/find_duplicate_solution_15.png)
103+
![Solution 16](./images/solutions/find_duplicate_solution_16.png)
104+
![Solution 17](./images/solutions/find_duplicate_solution_17.png)
105+
![Solution 18](./images/solutions/find_duplicate_solution_18.png)
106+
![Solution 19](./images/solutions/find_duplicate_solution_19.png)
107+
![Solution 20](./images/solutions/find_duplicate_solution_20.png)
108+
109+
In part two, we’ll start moving again in the cycle, but this time, we’ll slow down the fast pointer so that it moves
110+
with the same speed as the slow pointer.
111+
112+
Let’s look at the journeys of the two pointers in part two:
113+
114+
- The slow pointer will start from the 0th position.
115+
- The fast pointer will start from the intersection point.
116+
- After a certain number of steps, let’s call it F, the slow pointer meets the fast pointer. This is the ending point
117+
for both pointers.
118+
- This common ending position will be the entry point of the cycle.
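
A minimal sketch of part two, assuming the intersection point from part one is already known and that the slow pointer restarts from `nums[0]` to match a part one that also started there. The name `find_cycle_entry` and its `intersection` parameter are hypothetical.

```python
from typing import List


def find_cycle_entry(nums: List[int], intersection: int) -> int:
    """Part two only: walk both pointers one step at a time until they meet.

    `intersection` is assumed to be the meeting point produced by a part one
    in which both pointers started at nums[0].
    """
    slow = nums[0]
    fast = intersection
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]
    return slow  # entry point of the cycle, i.e. the duplicate value


# For [1, 2, 3, 2], running part one by hand gives the intersection index 3.
print(find_cycle_entry([1, 2, 3, 2], intersection=3))  # 2
```

Passing the intersection in as an argument keeps the two parts separate, which makes each one easier to test on its own.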
119+
120+
Let’s look at the visual presentation of the second part of our solution:
121+
122+
![Solution 21](./images/solutions/find_duplicate_solution_21.png)
123+
![Solution 22](./images/solutions/find_duplicate_solution_22.png)
124+
![Solution 23](./images/solutions/find_duplicate_solution_23.png)
125+
126+
Now, let’s try to understand why our solution always locates the entry point of the cycle.
127+
128+
Let’s return to the example we just discussed, using this graphical representation:
129+
130+
![Solution 24](./images/solutions/find_duplicate_solution_24.png)
131+
132+
- 7 is the intersection point where the slow and fast pointers will meet.
133+
- 8 is the entry point of the cycle, which is our duplicate number.
134+
135+
The fast pointer is traversing two times faster than the slow pointer. This can be represented by the following equation:
136+
137+
> $d_{fast} = 2d_{slow}$ (1)
138+
139+
Here, d represents the number of elements traversed.
140+
141+
Let’s look at the following diagram to see the steps taken by the slow and fast pointers from the starting point to the
142+
intersection point:
143+
144+
![Solution 25](./images/solutions/find_duplicate_solution_25.png)
145+
146+
> A list with a cycle
147+
148+
In the diagram above:
149+
150+
- Green represents the entry point of the cycle.
151+
- Blue represents the intersection point.
152+
- Yellow represents the starting point.
153+
- F represents the steps taken from the starting point to the entry point.
154+
- a represents the steps taken to reach the intersection point from the entry point.
155+
- C represents the cycle length, in terms of the number of steps taken to go once around the cycle.
156+
157+
With this setup in mind, let’s see the distance traveled by the slow and fast pointers. The slow pointer travels F steps
158+
from the starting point to the entry point of the cycle and then takes a steps from the entry point to the intersection
159+
point of the cycle, that is, the point where both pointers intersect. So, we can express the distance traveled by the
160+
slow pointer in the form of this equation:
161+
162+
$d_{slow} = F + a$ (2)
163+
164+
In the time it takes the slow pointer to travel F+a steps, the fast pointer, moving twice as fast, will have gone
around the cycle at least once. So the fast pointer first travels F steps from the starting point to the entry point of
the cycle, then goes around the cycle at least once, and finally travels a steps from the entry point to the
intersection point. Assuming for now that it goes around exactly once, we can express the distance traveled by the fast
pointer as the following equation:
169+
170+
$d_{fast} = F + C + a$ (3)
171+
172+
Recall eq. (1):
173+
174+
$d_{fast} = 2d_{slow}$ (1)
175+
176+
If we substitute the equivalent expression of dslow given in eq. (2) and the equivalent expression of dfast given in eq.
177+
(3) into eq. (1), we get:
178+
179+
F + C + a = 2(F + a)
180+
181+
Let’s simplify this equation:
182+
183+
F + C + a = 2F + 2a
184+
C = F + a
185+
186+
Therefore, the distance from the starting point to the intersection point, F+a, equals C.
187+
188+
We can also re-arrange this equality as follows:
189+
190+
F=C−a
191+
192+
Let’s consult our diagram again:
193+
194+
![Solution 26](./images/solutions/find_duplicate_solution_26.png)
195+
196+
As we can see, C−a is, in fact, the distance from the intersection point back to the entry point. This illustrates why,
197+
when we move one pointer forward, starting at the intersection point, and another pointer from the starting point, the
198+
point where they meet is the entry point of the cycle.
199+
200+
> Note: The proof above does not consider the case where F is longer than the length of the cycle. In this situation,
> the fast pointer may go around the cycle more than once before the pointers meet. To express this more general case,
> we can say that the distance covered by the fast pointer from the starting point to the intersection point is
> F + nC + a, where n is a positive integer. Our substitution then takes the form F + nC + a = 2(F + a), which
> simplifies to nC = F + a, that is, F = nC − a. This means that a pointer starting at the intersection point, which is
> a steps past the entry point, lands exactly on the entry point after F steps: n − 1 full trips around the cycle plus
> the remaining C − a steps.
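
The relation F + a = nC can be checked numerically. The sketch below measures F, C, a, and n for a given array. It follows the derivation's setup, with both pointers starting at index 0, whereas the accompanying Python change starts them at `nums[0]`. The function name `cycle_parameters` is a hypothetical helper, not part of the repository.

```python
from typing import List, Tuple


def cycle_parameters(nums: List[int]) -> Tuple[int, int, int, int]:
    """Return (F, C, a, n) for the walk x -> nums[x] starting at index 0.

    Assumes nums has length n + 1 with values in [1, n].
    """
    # Record the step at which each index is first visited.
    first_seen = {}
    x, step = 0, 0
    while x not in first_seen:
        first_seen[x] = step
        x = nums[x]
        step += 1
    tail = first_seen[x]     # F: steps from the start to the cycle entry
    cycle_len = step - tail  # C: steps to go once around the cycle

    # Part one, with both pointers starting at index 0 as in the derivation.
    slow = fast = 0
    d_slow = 0
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        d_slow += 1
        if slow == fast:
            break

    a = d_slow - tail            # steps from the entry to the intersection
    laps = d_slow // cycle_len   # n, since d_fast - d_slow = d_slow = n * C
    return tail, cycle_len, a, laps


F, C, a, n = cycle_parameters([1, 2, 3, 4, 3])
print(F, C, a, n)  # 3 2 1 2
assert F + a == n * C
```

In this example the tail (F = 3) is longer than the cycle (C = 2), so the fast pointer is two full laps ahead of the slow pointer when they meet.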
206+
207+
### Time Complexity
208+
209+
The time complexity of the algorithm is O(n), where n is the length of nums. This is because, in each part of the
solution, the number of steps the slow pointer takes is at most linear in the length of nums.
211+
212+
### Space Complexity
213+
214+
The algorithm uses O(1) extra space, since it only stores the fast and slow pointers.

algorithms/fast_and_slow/find_duplicate/__init__.py

Lines changed: 26 additions & 8 deletions
@@ -26,20 +26,38 @@ def find_duplicate_floyd_algo(numbers: List[int]) -> int:
2626

2727
if len(numbers) <= 1:
2828
return -1
29-
30-
slow = numbers[0]
31-
fast = numbers[numbers[0]]
32-
33-
while slow != fast:
29+
# Initialize the fast and slow pointers and make them point to the first
30+
# element of the array
31+
slow = fast = numbers[0]
32+
33+
# PART #1
34+
# Traverse the array until the intersection point is found
35+
while True:
36+
# Move the slow pointer using the nums[slow] flow
3437
slow = numbers[slow]
38+
# Move the fast pointer twice as fast as the slow pointer using the
39+
# nums[nums[fast]] flow
3540
fast = numbers[numbers[fast]]
36-
37-
fast = 0
41+
# Break the loop when slow pointer becomes equal to the fast pointer, i.e.,
42+
# if the intersection is found
43+
if slow == fast:
44+
break
45+
46+
# PART #2
47+
# Make the slow pointer point to the starting position of the array again, i.e.,
48+
# start the slow pointer from starting position
49+
slow = numbers[0]
50+
# Traverse in the array until the slow pointer becomes equal to the
51+
# fast pointer
3852
while fast != slow:
53+
# Move the slow pointer using the nums[slow] flow
3954
slow = numbers[slow]
55+
# Move the fast pointer slower than before, i.e., move the fast pointer
56+
# using the nums[fast] flow
4057
fast = numbers[fast]
4158

42-
return slow
59+
# Return the fast pointer, as it points to the duplicate number of the array
60+
return fast
4361

4462

4563
def find_duplicate(numbers: List[int]) -> int: