Skip to content

973. K Closest Points to Origin (Medium)

Problem

Given an array points where points[i] = [xᵢ, yᵢ], return the k points closest to the origin (0, 0). The distance metric is Euclidean; squared distance is sufficient for ranking (no need for sqrt).

Example

  • points = [[1,3],[-2,2]], k = 1[[-2, 2]]
  • points = [[3,3],[5,-1],[-2,4]], k = 2[[3, 3], [-2, 4]]

LeetCode 973 · Link · Medium

Approach 1: Brute force, sort by squared distance

Sort all points; take the first k.

def k_closest(points, k):
return sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)[:k] # L1: O(n log n)

Where the time goes, line by line

Variables: n = number of points, k = number of closest points to return.

LinePer-call costTimes executedContribution
L1 (sort)O(n log n)1O(n log n) ← dominates

Complexity

  • Time: O(n log n), driven by L1.
  • Space: O(n).

Simple and often accepted.

Approach 2: Size-K max-heap

Keep a max-heap of size K. Each new point is pushed; if the heap exceeds K, pop the farthest.

import heapq
def k_closest(points, k):
# Max-heap via negated distance
heap = []
for x, y in points: # L1: iterate n points
d = -(x * x + y * y)
if len(heap) < k:
heapq.heappush(heap, (d, x, y)) # L2: O(log k) push
elif d > heap[0][0]:
heapq.heapreplace(heap, (d, x, y)) # L3: O(log k) replace farthest
return [[x, y] for _, x, y in heap]

Where the time goes, line by line

Variables: n = number of points, k = number of closest points to return.

LinePer-call costTimes executedContribution
L1 (loop)O(1)nO(n)
L2 (heappush)O(log k)up to kO(k log k)
L3 (heapreplace)O(log k)up to n - kO(n log k) ← dominates

For the first k points, each push costs O(log k). For the remaining n - k points, each potential replace also costs O(log k). Total: O(n log k).

Complexity

  • Time: O(n log k), driven by L2/L3.
  • Space: O(k).

Strictly better than Approach 1 when k ≪ n.

Approach 3: Quickselect (optimal average)

Partition the array in place so the first k elements are the k closest (unordered). Average O(n).

def k_closest(points, k):
def dist(p):
return p[0] ** 2 + p[1] ** 2
def partition(lo, hi): # L1: O(hi - lo) per call
pivot = dist(points[hi])
store = lo
for i in range(lo, hi):
if dist(points[i]) <= pivot:
points[store], points[i] = points[i], points[store]
store += 1
points[store], points[hi] = points[hi], points[store]
return store
def quickselect(lo, hi, k):
if lo >= hi: # L2: base case
return
p = partition(lo, hi) # L3: O(subarray size)
if p == k:
return
if p < k:
quickselect(p + 1, hi, k) # L4: recurse right
else:
quickselect(lo, p - 1, k) # L5: recurse left
quickselect(0, len(points) - 1, k)
return points[:k]

Where the time goes, line by line

Variables: n = number of points, k = number of closest points to return.

LinePer-call costTimes executedContribution
L1-L3 (partition)O(subarray size)log n avg roundsO(n) avg
L4 or L5 (recurse)O(1) per framelog n avgO(n) avg ← dominates

On average each partition halves the search space: n + n/2 + n/4 + … = 2n = O(n). Worst case O(n²) when pivot is always the extreme (use random pivot to mitigate).

Complexity

  • Time: O(n) average, O(n²) worst case (L1-L3 partition cost).
  • Space: O(log n) recursion depth.

Pick quickselect when you’re allowed to mutate the input and want the tightest average complexity.

Test cases

# Quick smoke tests, paste into a REPL or save as test_973.py and run.
# Uses the size-K max-heap approach (Approach 2).
import heapq
def k_closest(points, k):
heap = []
for x, y in points:
d = -(x * x + y * y)
if len(heap) < k:
heapq.heappush(heap, (d, x, y))
elif d > heap[0][0]:
heapq.heapreplace(heap, (d, x, y))
return [[x, y] for _, x, y in heap]
def _run_tests():
# Example 1: k=1, closest is [-2,2] (dist=8 vs 10)
result = k_closest([[1,3],[-2,2]], 1)
assert result == [[-2,2]], f"got {result}"
# Example 2: k=2, answer is [[3,3],[-2,4]] (order doesn't matter)
result = k_closest([[3,3],[5,-1],[-2,4]], 2)
assert sorted(result) == sorted([[3,3],[-2,4]]), f"got {result}"
# Single point, k=1
assert k_closest([[0,0]], 1) == [[0,0]]
# All equidistant: any k points valid
result = k_closest([[1,0],[-1,0],[0,1],[0,-1]], 2)
assert len(result) == 2
# k equals n: return all
pts = [[1,2],[3,4],[0,0]]
result = k_closest(pts, 3)
assert len(result) == 3
print("all tests pass")
if __name__ == "__main__":
_run_tests()

Summary

ApproachTimeSpace
SortO(n log n)O(n)
Size-K max-heapO(n log k)O(k)
QuickselectO(n) avg / O(n²) worstO(log n)

The heap version is the canonical interview answer. Quickselect is the “optimal-average” answer, know it for when the interviewer pushes on tighter bounds.