Skip to content

Conversation

@BrianLusina
Copy link
Owner

@BrianLusina BrianLusina commented Jan 14, 2026

Describe your change:

  • Top k and kth largest
  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms have a URL in its comments that points to Wikipedia or other similar explanation.
  • If this pull request resolves one or more open issues then the commit message contains Fixes: #{$ISSUE_NO}.

Summary by CodeRabbit

  • New Features

    • Added utilities to compute top-K largest elements and the K points closest to the origin, with configurable k.
  • Documentation

    • Added detailed guides with examples, complexity notes, and illustrative diagrams for both problems.
  • Tests

    • Added comprehensive unit tests covering various scenarios, edge cases, and multiple implementations.

✏️ Tip: You can customize this high-level summary in your review settings.

@BrianLusina BrianLusina self-assigned this Jan 14, 2026
@BrianLusina BrianLusina added enhancement Algorithm Algorithm Problem Datastructures Datastructures Documentation Documentation Updates Sorting Contains sorting in the algorithm Array Array data structure Binary Tree Heap labels Jan 14, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 14, 2026

📝 Walkthrough

Walkthrough

Added two new heap algorithm features: Top K Largest (k_largest, kth_largest, kth_largest_sorting) and K Closest Points to Origin (heap and sorting implementations), plus documentation updates and unit tests for both features.

Changes

Cohort / File(s) Summary
Documentation Index
DIRECTORY.md
Inserted two entries under Heap/Schedule Tasks: Topkclosesttoorigin and Topklargest with links to their tests.
Top-K Largest (implementation & tests)
algorithms/heap/topklargest/__init__.py, algorithms/heap/topklargest/README.md, algorithms/heap/topklargest/test_top_k_largest_elements.py
New module implementing k_largest(nums, k=3), kth_largest(nums, k) (heap), and kth_largest_sorting(nums, k) (sorting). Added README with approaches and complexity notes. Added unit tests covering duplicates, various k, and validation behavior (invalid k returns -1 for kth functions).
K Closest to Origin (implementation & tests)
algorithms/heap/topkclosesttoorigin/__init__.py, algorithms/heap/topkclosesttoorigin/README.md, algorithms/heap/topkclosesttoorigin/test_top_k_closest_to_origin.py
New module implementing k_closest_to_origin(points, k) (max-heap via negated distances) and k_closest_to_origin_sorting(points, k) (sort by distance). README documents both heap and sorting strategies and complexities. Tests validate both implementations across multiple scenarios.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Poem

🐰 I hopped through heaps both large and small,
Pushed and popped until I found them all.
Closest points and top-K cheer,
Docs and tests make everything clear—
A rabbit's code celebration, springing tall! 🥕✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is incomplete and contains a critical inconsistency: it marks the checklist item 'This PR only changes one algorithm file' as completed, but the PR actually adds multiple algorithm implementations (top k largest and top k closest to origin), violating the stated checklist requirement. Clarify whether this PR should be split into separate PRs for each algorithm, or update the checklist to accurately reflect that multiple algorithm implementations are included.
Docstring Coverage ⚠️ Warning Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main changes: adding implementations for top-k and kth largest algorithms using heaps.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@algorithms/heap/topklargest/__init__.py`:
- Around line 72-76: The in-place call nums.sort(reverse=True) mutates the
caller's list; change it to operate on a non-mutating sorted sequence (e.g., use
sorted(nums, reverse=True) or sort a shallow copy) and then return the k-th
largest from that result (replace the nums.sort and return nums[k - 1] usage
with a non-mutating sorted result variable and return that_result[k - 1]);
ensure references to nums.sort(reverse=True) and the subsequent return are
updated accordingly.
- Around line 5-27: k_largest lacks input validation and can IndexError on empty
inputs or invalid k; add guards at the start of k_largest to validate params
(nums and k) consistent with the other variants: if not nums or k <= 0 or k >
len(nums) raise a ValueError with a clear message (e.g., "k must be between 1
and len(nums) and nums must be non-empty"); perform this validation before
building min_heap and iterating.
🧹 Nitpick comments (2)
algorithms/heap/topklargest/__init__.py (1)

30-56: Implementation is correct; consider exception vs sentinel for invalid input.

The heap-based algorithm is correctly implemented with O(n log k) time complexity. One consideration: returning -1 for invalid input could be ambiguous if -1 is a valid element in the array. An alternative would be to raise a ValueError, which is more Pythonic for invalid arguments. However, this is consistent with kth_largest_sorting, so the current approach is acceptable if documented.

algorithms/heap/topklargest/test_top_k_largest_elements.py (1)

6-27: Consider adding edge case tests for invalid inputs.

The current test cases cover happy paths well, but consider adding tests for:

  • Empty list: ([], 1, -1) for kth_largest/kth_largest_sorting
  • k=0: ([1,2,3], 0, -1)
  • k > len(nums): ([1,2], 5, -1)

This would validate the input validation logic and document expected behavior.

📝 Suggested additional test cases
KTH_LARGEST_EDGE_CASES = [
    ([], 1, -1),  # empty list
    ([1, 2, 3], 0, -1),  # k = 0
    ([1, 2], 5, -1),  # k > len(nums)
]
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6cb240c and 0f36bfc.

⛔ Files ignored due to path filters (16)
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_1.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_10.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_11.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_2.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_3.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_4.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_5.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_6.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_7.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_8.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/kth_largest_element_in_array_solution_9.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/top_k_largest_element_in_array_solution_1.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/top_k_largest_element_in_array_solution_2.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/top_k_largest_element_in_array_solution_3.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/top_k_largest_element_in_array_solution_4.png is excluded by !**/*.png
  • algorithms/heap/topklargest/images/solutions/top_k_largest_element_in_array_solution_5.png is excluded by !**/*.png
📒 Files selected for processing (6)
  • DIRECTORY.md
  • algorithms/heap/topkfrequentelements/README.md
  • algorithms/heap/topkfrequentelements/__init__.py
  • algorithms/heap/topklargest/README.md
  • algorithms/heap/topklargest/__init__.py
  • algorithms/heap/topklargest/test_top_k_largest_elements.py
🧰 Additional context used
🧬 Code graph analysis (1)
algorithms/heap/topklargest/test_top_k_largest_elements.py (1)
algorithms/heap/topklargest/__init__.py (3)
  • k_largest (5-27)
  • kth_largest (30-56)
  • kth_largest_sorting (59-76)
🪛 LanguageTool
algorithms/heap/topklargest/README.md

[style] ~84-~84: Using many exclamation marks might seem excessive (in this case: 11 exclamation marks for a text that’s 3674 characters long)
Context: ...argest_element_in_array_solution_1.png) Solution 2.2 Solution 2.3 Solution 2.4 Solution 2.5 Solution 2.6 Solution 2.7 Solution 2.8 Solution 2.9 ![Solution 2.10](./images/solutions/kth_l...

(EN_EXCESSIVE_EXCLAMATION)

🪛 markdownlint-cli2 (0.18.1)
DIRECTORY.md

143-143: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


144-144: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)

🔇 Additional comments (4)
algorithms/heap/topklargest/README.md (2)

1-46: Documentation is well-structured and accurate.

The explanation of the min-heap approach for Top-K elements is clear, with appropriate complexity analysis. The note about variable k complexity is helpful for understanding the general case.


49-106: LGTM!

The documentation for the Kth Largest Element problem is comprehensive, covering both sorting and min-heap approaches with accurate complexity analysis.

algorithms/heap/topklargest/test_top_k_largest_elements.py (1)

12-16: LGTM!

Good use of sorted() comparison for k_largest since the heap doesn't guarantee order within the result set.

DIRECTORY.md (1)

143-144: LGTM!

The new entry for Topklargest follows the existing indentation pattern in the file. The static analysis warning about indentation can be safely ignored as the style is consistent with other entries in the Heap section (e.g., Schedule Tasks on lines 141-142).

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +5 to +27
def k_largest(nums: List[int], k: int = 3) -> List[int]:
"""
Finds the k largest elements in a given list. K is defaulted to three, but can be used to tweak the k largest
elements in the array
Args:
nums(list): list of elements to check for
k(int): number of elements to check, defaulted to 3
Returns:
list: top k largest elements
"""
# create a minimum heap with the first k elements
min_heap = nums[:k]
heapq.heapify(min_heap)

# iterate through the remaining elements
for num in nums[k:]:
# if the current number is greater than the element at the top of the heap
if num > min_heap[0]:
# Remove it and add this element
heapq.heappushpop(min_heap, num)

# return the top k elements
return min_heap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing input validation can cause runtime errors.

Unlike kth_largest and kth_largest_sorting, this function lacks input validation. This will cause:

  • IndexError when k=0 (empty heap, then accessing min_heap[0])
  • IndexError when nums is empty
  • Unexpected behavior when k > len(nums)

Consider adding consistent validation:

🛠️ Suggested fix
 def k_largest(nums: List[int], k: int = 3) -> List[int]:
     """
     Finds the k largest elements in a given list. K is defaulted to three, but can be used to tweak the k largest
      elements in the array
     Args:
         nums(list): list of elements to check for
         k(int): number of elements to check, defaulted to 3
     Returns:
         list: top k largest elements
     """
+    # input validation to ensure we don't get unexpected results
+    if not nums or k <= 0:
+        return []
+    
+    # Adjust k if it exceeds the length of nums
+    k = min(k, len(nums))
+    
     # create a minimum heap with the first k elements
     min_heap = nums[:k]
     heapq.heapify(min_heap)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def k_largest(nums: List[int], k: int = 3) -> List[int]:
"""
Finds the k largest elements in a given list. K is defaulted to three, but can be used to tweak the k largest
elements in the array
Args:
nums(list): list of elements to check for
k(int): number of elements to check, defaulted to 3
Returns:
list: top k largest elements
"""
# create a minimum heap with the first k elements
min_heap = nums[:k]
heapq.heapify(min_heap)
# iterate through the remaining elements
for num in nums[k:]:
# if the current number is greater than the element at the top of the heap
if num > min_heap[0]:
# Remove it and add this element
heapq.heappushpop(min_heap, num)
# return the top k elements
return min_heap
def k_largest(nums: List[int], k: int = 3) -> List[int]:
"""
Finds the k largest elements in a given list. K is defaulted to three, but can be used to tweak the k largest
elements in the array
Args:
nums(list): list of elements to check for
k(int): number of elements to check, defaulted to 3
Returns:
list: top k largest elements
"""
# input validation to ensure we don't get unexpected results
if not nums or k <= 0:
return []
# Adjust k if it exceeds the length of nums
k = min(k, len(nums))
# create a minimum heap with the first k elements
min_heap = nums[:k]
heapq.heapify(min_heap)
# iterate through the remaining elements
for num in nums[k:]:
# if the current number is greater than the element at the top of the heap
if num > min_heap[0]:
# Remove it and add this element
heapq.heappushpop(min_heap, num)
# return the top k elements
return min_heap
🤖 Prompt for AI Agents
In `@algorithms/heap/topklargest/__init__.py` around lines 5 - 27, k_largest lacks
input validation and can IndexError on empty inputs or invalid k; add guards at
the start of k_largest to validate params (nums and k) consistent with the other
variants: if not nums or k <= 0 or k > len(nums) raise a ValueError with a clear
message (e.g., "k must be between 1 and len(nums) and nums must be non-empty");
perform this validation before building min_heap and iterating.

Comment on lines +72 to +76
# Sort the list in place which incurs a time complexity cost of O(n log(n)). Space complexity is O(n) due to using
# an in-memory data structure to store the elements during sorting using timsort in Python
nums.sort(reverse=True)
# Return the kth largest element
return nums[k - 1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

In-place sort mutates the input list unexpectedly.

The nums.sort(reverse=True) call modifies the caller's list, which may be an unintended side effect. This could cause subtle bugs if the caller expects nums to remain unchanged.

Consider using sorted() to avoid mutation:

🛠️ Suggested fix
-    # Sort the list in place which incurs a time complexity cost of O(n log(n)). Space complexity is O(n) due to using
-    # an in-memory data structure to store the elements during sorting using timsort in Python
-    nums.sort(reverse=True)
+    # Sort the list which incurs a time complexity cost of O(n log(n)). Space complexity is O(n) due to creating
+    # a new sorted list
+    sorted_nums = sorted(nums, reverse=True)
     # Return the kth largest element
-    return nums[k - 1]
+    return sorted_nums[k - 1]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Sort the list in place which incurs a time complexity cost of O(n log(n)). Space complexity is O(n) due to using
# an in-memory data structure to store the elements during sorting using timsort in Python
nums.sort(reverse=True)
# Return the kth largest element
return nums[k - 1]
# Sort the list which incurs a time complexity cost of O(n log(n)). Space complexity is O(n) due to creating
# a new sorted list
sorted_nums = sorted(nums, reverse=True)
# Return the kth largest element
return sorted_nums[k - 1]
🤖 Prompt for AI Agents
In `@algorithms/heap/topklargest/__init__.py` around lines 72 - 76, The in-place
call nums.sort(reverse=True) mutates the caller's list; change it to operate on
a non-mutating sorted sequence (e.g., use sorted(nums, reverse=True) or sort a
shallow copy) and then return the k-th largest from that result (replace the
nums.sort and return nums[k - 1] usage with a non-mutating sorted result
variable and return that_result[k - 1]); ensure references to
nums.sort(reverse=True) and the subsequent return are updated accordingly.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@algorithms/heap/topkclosesttoorigin/__init__.py`:
- Around line 31-36: k_closest_to_origin_sorting currently mutates the input by
calling points.sort(); change it to avoid side effects by sorting a copy (use
sorted(points, key=...)) or by copying first (points_copy = list(points);
points_copy.sort(...)) and then return the first k from the sorted copy so the
original input remains unchanged, matching the non-mutating behavior of
k_closest_to_origin.

In `@algorithms/heap/topkclosesttoorigin/README.md`:
- Around line 6-8: The distance formula in the README is garbled; replace the
incorrect expression with correct mathematical notation: for distance from
origin use "sqrt(x² + y²)" (or "√(x² + y²)"), and for distance between two
points use "sqrt((x₁ - x₂)² + (y₁ - y₂)²)" (or "√((x₁ - x₂)² + (y₁ - y₂)²)"),
ensuring variable names match those used elsewhere in the doc (e.g., x,y and a,b
or x₁,y₁ and x₂,y₂) and remove the malformed `√(x1 - a2)2 + (y1 - b2)2` text.

In `@algorithms/heap/topkclosesttoorigin/test_top_k_closest_to_origin.py`:
- Around line 23-25: The test sorts results only by x-coordinate which can be
flaky when x values tie; update the sorting key in
test_top_k_closest_to_origin.py to use both coordinates (e.g., key that returns
a tuple of (x, y)) for sorted_expected and sorted_actual in the block using
sorted_expected/sorted_actual and apply the same change in
test_top_k_closest_to_origin_sorting so comparisons are deterministic.
🧹 Nitpick comments (1)
algorithms/heap/topkclosesttoorigin/__init__.py (1)

5-28: Consider adding edge case validation.

The heap-based implementation is correct and efficient. However, there's no validation for edge cases:

  • k <= 0 would return an empty list (acceptable but undocumented)
  • k > len(points) would return all points (acceptable)
  • Empty points list would return empty (acceptable)

These behaviors may be intentional, but documenting them or adding explicit validation would improve robustness.

🔧 Optional: Add input validation
 def k_closest_to_origin(points: List[List[int]], k: int) -> List[List[int]]:
+    if not points or k <= 0:
+        return []
+    k = min(k, len(points))
+
     # Max heap will store the top k points closest to the origin in the form (-distance, idx). The distance is negated
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f36bfc and 2bb4e1f.

⛔ Files ignored due to path filters (13)
  • algorithms/heap/topkclosesttoorigin/images/examples/top_k_closest_to_origin_example_1.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_0.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_1.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_10.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_11.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_2.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_3.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_4.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_5.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_6.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_7.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_8.png is excluded by !**/*.png
  • algorithms/heap/topkclosesttoorigin/images/solutions/top_k_closest_to_origin_solution_9.png is excluded by !**/*.png
📒 Files selected for processing (4)
  • DIRECTORY.md
  • algorithms/heap/topkclosesttoorigin/README.md
  • algorithms/heap/topkclosesttoorigin/__init__.py
  • algorithms/heap/topkclosesttoorigin/test_top_k_closest_to_origin.py
🧰 Additional context used
🧬 Code graph analysis (1)
algorithms/heap/topkclosesttoorigin/test_top_k_closest_to_origin.py (1)
algorithms/heap/topkclosesttoorigin/__init__.py (2)
  • k_closest_to_origin (5-28)
  • k_closest_to_origin_sorting (31-36)
🪛 LanguageTool
algorithms/heap/topkclosesttoorigin/README.md

[style] ~82-~82: Using many exclamation marks might seem excessive (in this case: 10 exclamation marks for a text that’s 3221 characters long)
Context: ...top_k_closest_to_origin_solution_0.png) Solution 1 Solution 2 Solution 3 Solution 4 Solution 5 Solution 6 Solution 7 Solution 8 Solution 9 ![Solution 10](./images/solutions/top_k_c...

(EN_EXCESSIVE_EXCLAMATION)

🪛 markdownlint-cli2 (0.18.1)
DIRECTORY.md

143-143: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


144-144: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)


145-145: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


146-146: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)

🔇 Additional comments (3)
DIRECTORY.md (1)

143-146: LGTM!

The new directory entries for Topkclosesttoorigin and Topklargest are correctly added and follow the existing indentation pattern used throughout the file. The static analysis hints about indentation (MD007) are false positives since the file consistently uses 4-space and 6-space indentation for nested list items.

algorithms/heap/topkclosesttoorigin/README.md (1)

56-104: LGTM!

The algorithm explanation and complexity analysis are well-written and accurate. The cross-reference to the related topklargest implementation provides helpful context.

algorithms/heap/topkclosesttoorigin/test_top_k_closest_to_origin.py (1)

9-13: Test coverage is adequate for core functionality.

The test cases cover the basic scenarios. Consider adding edge cases in future iterations:

  • k = 0
  • k > len(points)
  • Points with identical distances to origin

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +31 to +36
def k_closest_to_origin_sorting(points: List[List[int]], k: int) -> List[List[int]]:
# Sort the points by the distance from the origin. This incurs a cost of O(n log(n)) and space cost of O(n) due to
# timsort
points.sort(key=lambda p: p[0] ** 2 + p[1] ** 2)
# Retrieve the top k points closest to the origin
return points[:k]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Input list is mutated unexpectedly.

points.sort() modifies the input list in-place. This is a side effect that differs from k_closest_to_origin (which preserves the input). Callers may not expect their original list to be reordered after calling this function.

🐛 Proposed fix to avoid mutating input
 def k_closest_to_origin_sorting(points: List[List[int]], k: int) -> List[List[int]]:
     # Sort the points by the distance from the origin. This incurs a cost of O(n log(n)) and space cost of O(n) due to
     # timsort
-    points.sort(key=lambda p: p[0] ** 2 + p[1] ** 2)
-    # Retrieve the top k points closest to the origin
-    return points[:k]
+    sorted_points = sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)
+    # Retrieve the top k points closest to the origin
+    return sorted_points[:k]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def k_closest_to_origin_sorting(points: List[List[int]], k: int) -> List[List[int]]:
# Sort the points by the distance from the origin. This incurs a cost of O(n log(n)) and space cost of O(n) due to
# timsort
points.sort(key=lambda p: p[0] ** 2 + p[1] ** 2)
# Retrieve the top k points closest to the origin
return points[:k]
def k_closest_to_origin_sorting(points: List[List[int]], k: int) -> List[List[int]]:
# Sort the points by the distance from the origin. This incurs a cost of O(n log(n)) and space cost of O(n) due to
# timsort
sorted_points = sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)
# Retrieve the top k points closest to the origin
return sorted_points[:k]
🤖 Prompt for AI Agents
In `@algorithms/heap/topkclosesttoorigin/__init__.py` around lines 31 - 36,
k_closest_to_origin_sorting currently mutates the input by calling
points.sort(); change it to avoid side effects by sorting a copy (use
sorted(points, key=...)) or by copying first (points_copy = list(points);
points_copy.sort(...)) and then return the first k from the sorted copy so the
original input remains unchanged, matching the non-mutating behavior of
k_closest_to_origin.

Comment on lines +6 to +8
The distance between two points (x, y) and (a, b) is calculated using the formula:

√(x1 - a2)2 + (y1 - b2)2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Distance formula notation is incorrect.

The formula √(x1 - a2)2 + (y1 - b2)2 appears garbled. For distance from origin (0, 0), it should be:

√(x² + y²) or equivalently sqrt(x² + y²)

For general distance between two points (x₁, y₁) and (x₂, y₂):

√((x₁ - x₂)² + (y₁ - y₂)²)

📝 Suggested fix
 The distance between two points (x, y) and (a, b) is calculated using the formula:

-√(x1 - a2)2 + (y1 - b2)2
+√((x - a)² + (y - b)²)
+
+For distance from the origin (0, 0), this simplifies to: √(x² + y²)
🤖 Prompt for AI Agents
In `@algorithms/heap/topkclosesttoorigin/README.md` around lines 6 - 8, The
distance formula in the README is garbled; replace the incorrect expression with
correct mathematical notation: for distance from origin use "sqrt(x² + y²)" (or
"√(x² + y²)"), and for distance between two points use "sqrt((x₁ - x₂)² + (y₁ -
y₂)²)" (or "√((x₁ - x₂)² + (y₁ - y₂)²)"), ensuring variable names match those
used elsewhere in the doc (e.g., x,y and a,b or x₁,y₁ and x₂,y₂) and remove the
malformed `√(x1 - a2)2 + (y1 - b2)2` text.

Comment on lines +23 to +25
sorted_expected = sorted(expected, key=lambda x: x[0])
sorted_actual = sorted(actual, key=lambda x: x[0])
self.assertEqual(sorted_expected, sorted_actual)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Sorting key may cause flaky comparisons.

Sorting only by x[0] could produce non-deterministic results when two points share the same x-coordinate. For robust comparison, sort by both coordinates.

🐛 Proposed fix
-        sorted_expected = sorted(expected, key=lambda x: x[0])
-        sorted_actual = sorted(actual, key=lambda x: x[0])
+        sorted_expected = sorted(expected, key=lambda x: (x[0], x[1]))
+        sorted_actual = sorted(actual, key=lambda x: (x[0], x[1]))

The same fix should be applied to test_top_k_closest_to_origin_sorting at lines 33-34.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
sorted_expected = sorted(expected, key=lambda x: x[0])
sorted_actual = sorted(actual, key=lambda x: x[0])
self.assertEqual(sorted_expected, sorted_actual)
sorted_expected = sorted(expected, key=lambda x: (x[0], x[1]))
sorted_actual = sorted(actual, key=lambda x: (x[0], x[1]))
self.assertEqual(sorted_expected, sorted_actual)
🤖 Prompt for AI Agents
In `@algorithms/heap/topkclosesttoorigin/test_top_k_closest_to_origin.py` around
lines 23 - 25, The test sorts results only by x-coordinate which can be flaky
when x values tie; update the sorting key in test_top_k_closest_to_origin.py to
use both coordinates (e.g., key that returns a tuple of (x, y)) for
sorted_expected and sorted_actual in the block using
sorted_expected/sorted_actual and apply the same change in
test_top_k_closest_to_origin_sorting so comparisons are deterministic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Algorithm Algorithm Problem Array Array data structure Binary Tree Datastructures Datastructures Documentation Documentation Updates enhancement Heap Sorting Contains sorting in the algorithm

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants