Commit 9a3ea30

feat(algorithms, heaps): top k closest point to origin
1 parent 0f36bfc commit 9a3ea30

16 files changed: +179 -0 lines changed
Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
# K Closest Points to Origin

Given a list of points in the form [[x1, y1], [x2, y2], ... [xn, yn]] and an integer k, find the k closest points to the
origin (0, 0) on the 2D plane.

The distance between two points (x, y) and (a, b) is calculated using the formula:

$$\sqrt{(x - a)^2 + (y - b)^2}$$

With (a, b) = (0, 0), this reduces to $\sqrt{x^2 + y^2}$.

Return the k closest points in any order.

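
Because the square root is monotonically increasing on non-negative numbers, comparing squared distances orders points exactly as comparing true distances would, so the square root can be skipped. A minimal sketch of that idea follows; `squared_distance` is a hypothetical helper name used only for illustration.

```python
from math import sqrt
from typing import List


def squared_distance(point: List[int]) -> int:
    """Squared Euclidean distance from the origin (hypothetical helper)."""
    x, y = point
    return x * x + y * y


# (-2, 2) is closer to the origin than (1, 3).
near, far = [-2, 2], [1, 3]
assert squared_distance(near) == 8 and squared_distance(far) == 10

# The ordering is the same with or without the square root.
assert (squared_distance(near) < squared_distance(far)) == (
    sqrt(squared_distance(near)) < sqrt(squared_distance(far))
)
```

For integer coordinates, staying with squared distances also keeps the comparison in exact integer arithmetic.
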

## Examples

```text
Input:

points = [[3,4],[2,2],[1,1],[0,0],[5,5]]
k = 3

Output:
[[2,2],[1,1],[0,0]]

Also Valid:

[[2,2],[0,0],[1,1]]
[[1,1],[0,0],[2,2]]
[[1,1],[2,2],[0,0]]
...
[[0,0],[1,1],[2,2]]
```

![Example 1](./images/examples/top_k_closest_to_origin_example_1.png)

```text
Input: points = [[3,3],[5,-1],[-2,4]], k = 2
Output: [[3,3],[-2,4]]
Explanation: The answer [[-2,4],[3,3]] would also be accepted.
```

```text
Input: points = [[1,3],[-2,2]], k = 1
Output: [[-2,2]]

Explanation:
The distance between (1, 3) and the origin is sqrt(10).
The distance between (-2, 2) and the origin is sqrt(8).
Since sqrt(8) < sqrt(10), (-2, 2) is closer to the origin.
We only want the closest k = 1 points from the origin, so the answer is just [[-2,2]].
```


## Solution

- [Approach 1](#approach-1-sorting)
- [Approach 2](#approach-2-max-heap)

### Approach 1: Sorting

The simplest approach is to calculate the distance of each point from the origin, sort the points by that distance, and
return the first k. This approach has a time complexity of O(n log n), where n is the number of points in the array, and
a space complexity of O(n) (for the storage the sort uses).

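
The sorting approach can be sketched as follows. This is an illustrative version, assuming squared distance as the sort key and using `sorted` so that the caller's list is left unmodified; the function name `k_closest_by_sorting` is hypothetical.

```python
from typing import List


def k_closest_by_sorting(points: List[List[int]], k: int) -> List[List[int]]:
    """Return the k points nearest the origin by sorting a copy (sketch)."""
    # sorted() builds a new list, so the input order is preserved.
    by_distance = sorted(points, key=lambda p: p[0] * p[0] + p[1] * p[1])
    # The first k entries of the sorted copy are the k closest points.
    return by_distance[:k]


points = [[3, 4], [2, 2], [1, 1], [0, 0], [5, 5]]
print(k_closest_by_sorting(points, 3))  # [[0, 0], [1, 1], [2, 2]]
```

Sorting returns the points in ascending order of distance, which is one of the valid orderings since the answer may be returned in any order.
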

### Approach 2: Max Heap

This problem can be solved using a similar approach to the one used to solve [Kth Largest Element in an Array](../topklargest/README.md). The key
difference is that we need to store the k closest points to the origin, rather than the k largest elements. Since we are
looking for the k smallest distances, we need a max-heap rather than a min-heap: its root is the farthest of the points
kept so far, which is the one to evict when a closer point arrives.

By default, Python's heapq module implements a min-heap, but we can make it behave like a max-heap by negating the values
of everything we push onto it.

We add the first k points to the heap by pushing a tuple containing the negated distance from the origin and the
index of the point. The squared distance can stand in for the true distance, since both order the points identically.
After that is finished, our heap contains the k closest points to the origin that we've seen so far, with the point
furthest from the origin at the root of the heap.

For each point after the first k, we calculate the distance from the origin and compare it with the root of the heap. If
the current point is closer to the origin than the root of the heap, we pop the root and push the current point into the
heap. This way, the heap will always contain the k closest points to the origin we've seen so far.

At the end of the iteration, the heap will contain the k closest points to the origin. We can iterate over each entry in
the heap and return the point associated with each tuple.

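
To make the steps above concrete, here is a small tracing sketch run on the first example. It is an illustration rather than a reference implementation: `trace_max_heap` is a hypothetical name, it assumes k >= 1, it uses squared distance in place of true distance, and it uses `heapq.heapreplace` to pop the root and push the new entry in one call. After each point it records which stored point sits at the root, that is, the farthest of the points kept so far.

```python
import heapq
from typing import List, Tuple


def trace_max_heap(points: List[List[int]], k: int) -> List[List[int]]:
    """Return the point at the heap root after each step (sketch, assumes k >= 1)."""
    heap: List[Tuple[int, int]] = []  # entries are (-squared_distance, index)
    roots: List[List[int]] = []
    for idx, (x, y) in enumerate(points):
        dist = x * x + y * y
        if len(heap) < k:
            heapq.heappush(heap, (-dist, idx))
        elif dist < -heap[0][0]:
            # Closer than the farthest kept point: evict the root, keep this one.
            heapq.heapreplace(heap, (-dist, idx))
        roots.append(points[heap[0][1]])
    return roots


points = [[3, 4], [2, 2], [1, 1], [0, 0], [5, 5]]
print(trace_max_heap(points, 3))
# [[3, 4], [3, 4], [3, 4], [2, 2], [2, 2]]
```

In this trace, [3, 4] stays at the root until [0, 0] arrives and evicts it; [5, 5] is farther than the new root [2, 2], so it is skipped.
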

![Solution 0](./images/solutions/top_k_closest_to_origin_solution_0.png)
![Solution 1](./images/solutions/top_k_closest_to_origin_solution_1.png)
![Solution 2](./images/solutions/top_k_closest_to_origin_solution_2.png)
![Solution 3](./images/solutions/top_k_closest_to_origin_solution_3.png)
![Solution 4](./images/solutions/top_k_closest_to_origin_solution_4.png)
![Solution 5](./images/solutions/top_k_closest_to_origin_solution_5.png)
![Solution 6](./images/solutions/top_k_closest_to_origin_solution_6.png)
![Solution 7](./images/solutions/top_k_closest_to_origin_solution_7.png)
![Solution 8](./images/solutions/top_k_closest_to_origin_solution_8.png)
![Solution 9](./images/solutions/top_k_closest_to_origin_solution_9.png)
![Solution 10](./images/solutions/top_k_closest_to_origin_solution_10.png)
![Solution 11](./images/solutions/top_k_closest_to_origin_solution_11.png)


#### Complexity Analysis

##### Time Complexity: O(n log k)

Here, n is the number of points in the array and k is the input parameter. We iterate over all n points and, in the
worst case, push and pop each one. Each heap operation takes O(log k) time because the heap never holds more than k
entries.

##### Space Complexity: O(k)

Here, k is the input parameter. The space is used by the heap to store the k closest points to the origin.
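
As a cross-check on a hand-written heap, Python's standard library also provides `heapq.nsmallest`, which selects the k smallest items under a key function. The sketch below is an alternative shown for comparison, not the approach described above; `k_closest_nsmallest` is a hypothetical name.

```python
import heapq
from typing import List


def k_closest_nsmallest(points: List[List[int]], k: int) -> List[List[int]]:
    """Select the k points nearest the origin via heapq.nsmallest (sketch)."""
    return heapq.nsmallest(k, points, key=lambda p: p[0] * p[0] + p[1] * p[1])


print(k_closest_nsmallest([[1, 3], [-2, 2]], 1))  # [[-2, 2]]
```

Comparing its result, as a set of points, against the max-heap solution is a quick way to test the latter on random inputs.
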
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
from typing import List, Tuple
import heapq


def k_closest_to_origin(points: List[List[int]], k: int) -> List[List[int]]:
    # Guard against a non-positive k: there is nothing to return, and peeking at the root of an empty heap would fail
    if k <= 0:
        return []

    # Max heap will store the top k points closest to the origin in the form (-distance, idx). The distance is negated
    # because in Python, the heapq module uses min-heap by default. Negating values gives us a maximum heap.
    # We store the idx of the point for later retrieval from the points array passed in
    max_heap: List[Tuple[int, int]] = []

    for idx, point in enumerate(points):
        x, y = point
        # calculate the squared distance of this point from the origin; it orders points the same way as the true
        # distance, so the square root is not needed
        distance = x * x + y * y

        # If the heap holds fewer than the desired top k entries, then we add the current point's distance and idx
        if len(max_heap) < k:
            heapq.heappush(max_heap, (-distance, idx))
        # We check if the calculated distance of this point is less than the top element in the heap. If this point
        # is closer to the origin than what is at the root of the heap, we pop the root of the heap and add this
        # new distance and index.
        # Note the negation here again to get the actual distance
        elif distance < -max_heap[0][0]:
            heapq.heappushpop(max_heap, (-distance, idx))

    # Return the top k points closest to origin. Element 1 of each heap entry is the index of the point in the
    # original points list, as that is what is stored in the max heap
    return [points[point[1]] for point in max_heap]


def k_closest_to_origin_sorting(points: List[List[int]], k: int) -> List[List[int]]:
    # Sort the points by the distance from the origin. This incurs a cost of O(n log(n)) and space cost of O(n) due to
    # timsort. Note that list.sort() reorders the caller's list in place
    points.sort(key=lambda p: p[0] ** 2 + p[1] ** 2)
    # Retrieve the top k points closest to the origin
    return points[:k]