4 changes: 4 additions & 0 deletions DIRECTORY.md
@@ -140,6 +140,10 @@
* Heap
* Schedule Tasks
* [Test Schedule Tasks On Minimum Machines](https://github.com/BrianLusina/PythonSnips/blob/master/algorithms/heap/schedule_tasks/test_schedule_tasks_on_minimum_machines.py)
* Topkclosesttoorigin
* [Test Top K Closest To Origin](https://github.com/BrianLusina/PythonSnips/blob/master/algorithms/heap/topkclosesttoorigin/test_top_k_closest_to_origin.py)
* Topklargest
* [Test Top K Largest Elements](https://github.com/BrianLusina/PythonSnips/blob/master/algorithms/heap/topklargest/test_top_k_largest_elements.py)
* Huffman
* [Decoding](https://github.com/BrianLusina/PythonSnips/blob/master/algorithms/huffman/decoding.py)
* [Encoding](https://github.com/BrianLusina/PythonSnips/blob/master/algorithms/huffman/encoding.py)
104 changes: 104 additions & 0 deletions algorithms/heap/topkclosesttoorigin/README.md
@@ -0,0 +1,104 @@
# K Closest Points to Origin

Given a list of points in the form [[x1, y1], [x2, y2], ... [xn, yn]] and an integer k, find the k closest points to the
origin (0, 0) on the 2D plane.

The distance between two points (x, y) and (a, b) is calculated using the formula:

$$\sqrt{(x - a)^2 + (y - b)^2}$$

Return the k closest points in any order.

## Examples

```text
Input:

points = [[3,4],[2,2],[1,1],[0,0],[5,5]]
k = 3

Output:
[[2,2],[1,1],[0,0]]

Also Valid:

[[2,2],[0,0],[1,1]]
[[1,1],[0,0],[2,2]]
[[1,1],[2,2],[0,0]]
...
[[0,0],[1,1],[2,2]]
```

![Example 1](./images/examples/top_k_closest_to_origin_example_1.png)

```text
Input: points = [[3,3],[5,-1],[-2,4]], k = 2
Output: [[3,3],[-2,4]]
Explanation: The answer [[-2,4],[3,3]] would also be accepted.
```

```text
Input: points = [[1,3],[-2,2]], k = 1
Output: [[-2,2]]

Explanation:
The distance between (1, 3) and the origin is sqrt(10).
The distance between (-2, 2) and the origin is sqrt(8).
Since sqrt(8) < sqrt(10), (-2, 2) is closer to the origin.
We only want the closest k = 1 points from the origin, so the answer is just [[-2,2]].
```
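
The distances in the explanation above can be checked with a short sketch. It relies on the standard-library `math.hypot`; the helper name `distance_from_origin` is an assumption made for illustration and is not part of this module.

```python
import math


def distance_from_origin(point):
    # Hypothetical helper: Euclidean distance of (x, y) from (0, 0)
    x, y = point
    return math.hypot(x, y)


points = [[1, 3], [-2, 2]]
# Pick the single closest point (k = 1)
closest = min(points, key=distance_from_origin)
print(closest)  # [-2, 2]
```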

## Solution

- [Approach 1](#approach-1-sorting)
- [Approach 2](#approach-2-max-heap)

### Approach 1: Sorting

The simplest approach is to calculate the distance of each point from the origin and sort the points by that
distance. This approach has a time complexity of O(n log n), where n is the number of points in the array, and a space
complexity of O(n) (to store the sorted array).
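
A minimal sketch of the sorting approach, assuming every point is a two-element list. The function name `k_closest_by_sorting` is hypothetical. It sorts by squared distance, which gives the same ordering as the true distance and avoids the square root.

```python
from typing import List


def k_closest_by_sorting(points: List[List[int]], k: int) -> List[List[int]]:
    # Squared distance preserves the ordering of true distances,
    # so the square root can be skipped.
    ordered = sorted(points, key=lambda p: p[0] * p[0] + p[1] * p[1])
    return ordered[:k]


print(k_closest_by_sorting([[3, 4], [2, 2], [1, 1], [0, 0], [5, 5]], 3))
# [[0, 0], [1, 1], [2, 2]]
```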

### Approach 2: Max Heap

This problem can be solved using a similar approach to the one used to solve [Kth Largest Element in an Array](../topklargest/README.md). The key
difference is that we need to store the k closest points to the origin, rather than the k largest elements. Since we are
looking for the k smallest elements, we need a max-heap, rather than a min-heap.

Python's heapq module implements a min-heap, but we can make it behave like a max-heap by negating the values
of everything we push onto it.
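
The negation trick can be seen in isolation with a small sketch. The sample values are made up for illustration.

```python
import heapq

values = [5, 1, 8, 3]
max_heap = []
for value in values:
    # Push the negated value so the largest original value sits at the root
    heapq.heappush(max_heap, -value)

# Negate again when reading to recover the original value
largest = -max_heap[0]
print(largest)  # 8

popped = -heapq.heappop(max_heap)
print(popped)  # 8
print(-max_heap[0])  # 5
```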

We add the first k points to the heap by pushing a tuple containing the negative of the distance from the origin, and the
index of the point. After that is finished, our heap contains the k closest points to the origin that we've seen so far,
with the point furthest from the origin at the root of the heap.

For each point after the first k, we calculate the distance from the origin and compare it with the root of the heap. If
the current point is closer to the origin than the root of the heap, we pop the root and push the current point into the
heap. This way, the heap will always contain the k closest points to the origin we've seen so far.

At the end of the iteration, the heap will contain the k closest points to the origin. We can iterate over each point in
the heap and return the point associated with each tuple.
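
The steps above can be traced with a sketch that records the heap contents after each point is processed. The function name `trace_k_closest` is hypothetical, and it uses `heapq.heapreplace` (pop the root, then push) as one way to express the "pop the root and push the current point" step. Each snapshot lists `(squared_distance, idx)` pairs in ascending order for readability.

```python
import heapq
from typing import List, Tuple


def trace_k_closest(points: List[List[int]], k: int) -> List[List[Tuple[int, int]]]:
    max_heap: List[Tuple[int, int]] = []  # holds (-squared_distance, idx)
    snapshots: List[List[Tuple[int, int]]] = []
    for idx, (x, y) in enumerate(points):
        squared = x * x + y * y
        if len(max_heap) < k:
            heapq.heappush(max_heap, (-squared, idx))
        elif max_heap and squared < -max_heap[0][0]:
            # Current point is closer than the farthest point kept so far
            heapq.heapreplace(max_heap, (-squared, idx))
        # Record the heap as sorted (squared_distance, idx) pairs
        snapshots.append(sorted((-neg, i) for neg, i in max_heap))
    return snapshots


trace = trace_k_closest([[3, 4], [2, 2], [1, 1], [0, 0], [5, 5]], 3)
for step in trace:
    print(step)
```

In this trace the point `[3, 4]` (squared distance 25) is evicted when `[0, 0]` arrives, and `[5, 5]` never enters the heap.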

![Solution 0](./images/solutions/top_k_closest_to_origin_solution_0.png)
![Solution 1](./images/solutions/top_k_closest_to_origin_solution_1.png)
![Solution 2](./images/solutions/top_k_closest_to_origin_solution_2.png)
![Solution 3](./images/solutions/top_k_closest_to_origin_solution_3.png)
![Solution 4](./images/solutions/top_k_closest_to_origin_solution_4.png)
![Solution 5](./images/solutions/top_k_closest_to_origin_solution_5.png)
![Solution 6](./images/solutions/top_k_closest_to_origin_solution_6.png)
![Solution 7](./images/solutions/top_k_closest_to_origin_solution_7.png)
![Solution 8](./images/solutions/top_k_closest_to_origin_solution_8.png)
![Solution 9](./images/solutions/top_k_closest_to_origin_solution_9.png)
![Solution 10](./images/solutions/top_k_closest_to_origin_solution_10.png)
![Solution 11](./images/solutions/top_k_closest_to_origin_solution_11.png)

#### Complexity Analysis

##### Time Complexity: O(n log k)

Where n is the number of points in the array and k is the input parameter. We iterate over
all points, and in the worst case, we both push and pop each point from the heap, which takes O(log k) time per point.

##### Space Complexity: O(k)

Where k is the input parameter. The space is used by the heap to store the k closest points to the origin.
36 changes: 36 additions & 0 deletions algorithms/heap/topkclosesttoorigin/__init__.py
@@ -0,0 +1,36 @@
from typing import List, Tuple
import heapq


def k_closest_to_origin(points: List[List[int]], k: int) -> List[List[int]]:
    # Guard against a non-positive k. Without it, the empty heap would be indexed in the loop below
    if k <= 0:
        return []

    # Max heap will store the top k points closest to the origin in the form (-distance, idx). The distance is negated
    # because Python's heapq module implements a min-heap. Negating values gives us a max-heap.
    # We store the idx of the point for later retrieval from the points array passed in
    max_heap: List[Tuple[int, int]] = []

    for idx, point in enumerate(points):
        x, y = point
        # Calculate the squared distance of this point from the origin. The square root is skipped because it does
        # not change the ordering of the distances
        distance = x * x + y * y

        # If the heap holds fewer than the desired k entries, then we add the current point's distance and idx
        if len(max_heap) < k:
            heapq.heappush(max_heap, (-distance, idx))
        # We check if the calculated distance of this point is less than the top element in the heap. If this point
        # is closer to the origin than what is at the root of the heap, we pop the root of the heap and add this
        # new distance and index.
        # Note the negation here again to get the actual distance
        elif distance < -max_heap[0][0]:
            heapq.heappushpop(max_heap, (-distance, idx))

    # Return the top k points closest to origin. We use 1 to get the index of the point from the original points list
    # as that is what is stored in the max heap
    return [points[point[1]] for point in max_heap]


def k_closest_to_origin_sorting(points: List[List[int]], k: int) -> List[List[int]]:
    # Sort the points by the squared distance from the origin. This incurs a cost of O(n log(n)) and space cost of
    # O(n) due to timsort
    sorted_points = sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)
    # Retrieve the top k points closest to the origin
    return sorted_points[:k]
@@ -0,0 +1,39 @@
import unittest
from typing import List
from parameterized import parameterized
from algorithms.heap.topkclosesttoorigin import (
    k_closest_to_origin,
    k_closest_to_origin_sorting,
)

TOP_K_CLOSEST_TO_ORIGIN = [
    ([[3, 4], [2, 2], [1, 1], [0, 0], [5, 5]], 3, [[2, 2], [1, 1], [0, 0]]),
    ([[1, 3], [-2, 2]], 1, [[-2, 2]]),
    ([[3, 3], [5, -1], [-2, 4]], 2, [[3, 3], [-2, 4]]),
]


class TopKClosestToOriginTestCase(unittest.TestCase):
    @parameterized.expand(TOP_K_CLOSEST_TO_ORIGIN)
    def test_top_k_closest_to_origin(
        self, points: List[List[int]], k: int, expected: List[List[int]]
    ):
        actual = k_closest_to_origin(points, k)

        # The points may be returned in any order, so compare them order-insensitively
        sorted_expected = sorted(expected, key=lambda x: (x[0], x[1]))
        sorted_actual = sorted(actual, key=lambda x: (x[0], x[1]))
        self.assertEqual(sorted_expected, sorted_actual)

    @parameterized.expand(TOP_K_CLOSEST_TO_ORIGIN)
    def test_top_k_closest_to_origin_sorting(
        self, points: List[List[int]], k: int, expected: List[List[int]]
    ):
        actual = k_closest_to_origin_sorting(points, k)

        # Sort on both coordinates so points sharing an x value compare deterministically
        sorted_expected = sorted(expected, key=lambda x: (x[0], x[1]))
        sorted_actual = sorted(actual, key=lambda x: (x[0], x[1]))
        self.assertEqual(sorted_expected, sorted_actual)


if __name__ == "__main__":
    unittest.main()
106 changes: 106 additions & 0 deletions algorithms/heap/topklargest/README.md
@@ -0,0 +1,106 @@
# Top-K Largest Elements in an Array

Given an integer array nums, return the 3 largest elements in the array in any order.

## Example

```text
Input: nums = [9, 3, 7, 1, -2, 6, 8]
Output: [8, 7, 9]
# or [7, 9, 8] or [9, 7, 8] ...
```

## Solution

Here's how we can solve this problem using a min-heap:

- Create a min-heap that stores the first 3 elements of the array. These represent the 3 largest elements we have seen so
far, with the smallest of the 3 at the root of the heap.
![Solution 1](./images/solutions/top_k_largest_element_in_array_solution_1.png)

- Iterate through the remaining elements in the array.
- If the current element is larger than the root of the heap, pop the root and push the current element into the heap.
- Otherwise, continue to the next element.

![Solution 2](./images/solutions/top_k_largest_element_in_array_solution_2.png)
![Solution 3](./images/solutions/top_k_largest_element_in_array_solution_3.png)
![Solution 4](./images/solutions/top_k_largest_element_in_array_solution_4.png)
![Solution 5](./images/solutions/top_k_largest_element_in_array_solution_5.png)

After iterating through all the elements, the heap contains the 3 largest elements in the array.
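
The walkthrough above can be sketched as follows, assuming the array has at least 3 elements. The function name `three_largest` is hypothetical, and `heapq.heapreplace` is used as one way to express "pop the root and push the current element".

```python
import heapq
from typing import List


def three_largest(nums: List[int]) -> List[int]:
    # Assumes len(nums) >= 3. The slice copies, so nums is not mutated.
    min_heap = nums[:3]
    heapq.heapify(min_heap)
    for num in nums[3:]:
        # Only elements larger than the smallest kept element can enter
        if num > min_heap[0]:
            heapq.heapreplace(min_heap, num)
    return min_heap


result = three_largest([9, 3, 7, 1, -2, 6, 8])
print(sorted(result))  # [7, 8, 9]
```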

### Complexity Analysis

#### Time Complexity Breakdown

- The heapify function takes O(3) = O(1) time
- We iterate through all elements in the array once: O(n) time
- The heappop and heappush operations take O(log 3) = O(1) time each
- Overall, the algorithm runs in O(n) time

#### Space Complexity

- We use a heap of size 3 to store the 3 largest elements: O(3) = O(1) space
- The algorithm uses constant space regardless of input size

Note: The time and space complexity become more interesting when 3 is a variable number k.

---

# Kth Largest Element in an Array

Write a function that takes an array of unsorted integers nums and an integer k, and returns the kth largest element in
the array. This function should run in O(n log k) time, where n is the length of the array.

## Examples

```text
Input:
nums = [5, 3, 2, 1, 4]
k = 2

Output: 4
```

## Solutions

- [Approach 1](#approach-1-sorting)
- [Approach 2](#approach-2-min-heap)

### Approach 1: Sorting

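The simplest approach is to sort the array in descending order and return the kth element. This approach has a time
complexity of O(n log n), where n is the number of elements in the array. The space complexity is O(1) if the array is
sorted in place by an algorithm such as heapsort. Python's `sorted` returns a new list, so an implementation that uses
it needs O(n) space.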
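
A minimal sketch of the sorting approach, assuming 1 <= k <= len(nums). The function name `kth_largest_by_sorting` is hypothetical.

```python
from typing import List


def kth_largest_by_sorting(nums: List[int], k: int) -> int:
    # Assumes 1 <= k <= len(nums); k is 1-based, so the kth largest
    # sits at index k - 1 of the descending order
    return sorted(nums, reverse=True)[k - 1]


print(kth_largest_by_sorting([5, 3, 2, 1, 4], 2))  # 4
```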

### Approach 2: Min Heap

By using a min-heap, we can reduce the time complexity to O(n log k), where n is the number of elements in the array and
k is the value of k.
The idea behind this solution is to iterate over the elements in the array while storing the k largest elements we've
seen so far in a min-heap. At each element, we check if it is greater than the smallest element (the root) of the heap.
If it is, we pop the smallest element from the heap and push the current element into the heap. This way, the heap will
always contain the k largest elements we've seen so far.
After iterating over all the elements, the root of the heap will be the kth largest element in the array.
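
The idea can be sketched as follows, assuming 1 <= k <= len(nums). The function name `kth_largest_with_heap` is hypothetical. The last line cross-checks the result against the standard-library `heapq.nlargest`.

```python
import heapq
from typing import List


def kth_largest_with_heap(nums: List[int], k: int) -> int:
    # Assumes 1 <= k <= len(nums)
    min_heap: List[int] = []
    for num in nums:
        if len(min_heap) < k:
            heapq.heappush(min_heap, num)
        elif num > min_heap[0]:
            # Evict the smallest of the k kept elements
            heapq.heapreplace(min_heap, num)
    # The root is the smallest of the k largest, i.e. the kth largest
    return min_heap[0]


nums = [5, 3, 2, 1, 4]
print(kth_largest_with_heap(nums, 2))  # 4
print(heapq.nlargest(2, nums)[-1])  # 4
```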

![Solution 2.1](./images/solutions/kth_largest_element_in_array_solution_1.png)
![Solution 2.2](./images/solutions/kth_largest_element_in_array_solution_2.png)
![Solution 2.3](./images/solutions/kth_largest_element_in_array_solution_3.png)
![Solution 2.4](./images/solutions/kth_largest_element_in_array_solution_4.png)
![Solution 2.5](./images/solutions/kth_largest_element_in_array_solution_5.png)
![Solution 2.6](./images/solutions/kth_largest_element_in_array_solution_6.png)
![Solution 2.7](./images/solutions/kth_largest_element_in_array_solution_7.png)
![Solution 2.8](./images/solutions/kth_largest_element_in_array_solution_8.png)
![Solution 2.9](./images/solutions/kth_largest_element_in_array_solution_9.png)
![Solution 2.10](./images/solutions/kth_largest_element_in_array_solution_10.png)
![Solution 2.11](./images/solutions/kth_largest_element_in_array_solution_11.png)

#### Complexity Analysis

##### Time Complexity: O(n log k)

Where n is the number of elements in the array and k is the input parameter. We iterate over
all elements, and in the worst case, we both push and pop each element from the heap, which takes O(log k) time per
element.

##### Space Complexity: O(k)

Where k is the input parameter. The space is used by the heap to store the k largest elements.
83 changes: 83 additions & 0 deletions algorithms/heap/topklargest/__init__.py
@@ -0,0 +1,83 @@
from typing import List
import heapq


def k_largest(nums: List[int], k: int = 3) -> List[int]:
    """
    Finds the k largest elements in a given list. K is defaulted to three, but can be used to tweak the k largest
    elements in the array
    Args:
        nums(list): list of elements to check for
        k(int): number of elements to check, defaulted to 3
    Returns:
        list: top k largest elements
    """
    # input validation to ensure we don't get unexpected results
    if not nums or k <= 0:
        return []

    # Adjust k if it exceeds the length of nums
    k = min(k, len(nums))

    # create a minimum heap with the first k elements
    min_heap = nums[:k]
    heapq.heapify(min_heap)

    # iterate through the remaining elements
    for num in nums[k:]:
        # if the current number is greater than the element at the top of the heap
        if num > min_heap[0]:
            # Remove it and add this element
            heapq.heappushpop(min_heap, num)

    # return the top k elements
    return min_heap


def kth_largest(nums: List[int], k: int) -> int:
    """
    Finds the kth largest element in a given list
    Args:
        nums(list): list of elements to check for
        k(int): the kth largest element to return
    Returns:
        int: the kth largest element
    """
    # input validation to ensure we don't get unexpected results
    if not nums or k <= 0 or k > len(nums):
        return -1

    # start with an empty minimum heap that will hold at most k elements
    min_heap = []

    # iterate through all the elements
    for num in nums:
        if len(min_heap) < k:
            heapq.heappush(min_heap, num)
        # if the current number is greater than the element at the top of the heap
        elif num > min_heap[0]:
            # Remove it and add this element
            heapq.heappushpop(min_heap, num)

    # return the kth largest element, which is the root of the heap
    return min_heap[0]


def kth_largest_sorting(nums: List[int], k: int) -> int:
    """
    Finds the kth largest element in a given list using sorting
    Args:
        nums(list): list of elements to check for
        k(int): the kth largest element to return
    Returns:
        int: the kth largest element
    """
    # input validation to ensure we don't get unexpected results
    if not nums or k <= 0 or k > len(nums):
        return -1

    # Sort the list which incurs a time complexity cost of O(n log(n)). Space complexity is O(n) due to creating
    # a new sorted list
    sorted_nums = sorted(nums, reverse=True)
    # Return the kth largest element
    return sorted_nums[k - 1]