973. K Closest Points to Origin (Medium)
Problem
Given an array points where points[i] = [xᵢ, yᵢ], return the k points closest to the origin (0, 0). The distance metric is Euclidean; squared distance is sufficient for ranking (no need for sqrt).
Example
- points = [[1,3],[-2,2]], k = 1 → [[-2,2]]
- points = [[3,3],[5,-1],[-2,4]], k = 2 → [[3,3],[-2,4]]
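Why is squared distance enough? Because sqrt is monotonically increasing, ranking by squared distance and ranking by true Euclidean distance always agree. A quick standalone check (not part of the solutions below):

```python
import math

points = [[1, 3], [-2, 2], [3, 3], [5, -1]]

# Rank once by squared distance, once by true Euclidean distance.
by_squared = sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)
by_euclid = sorted(points, key=lambda p: math.hypot(p[0], p[1]))

assert by_squared == by_euclid  # identical ordering, no sqrt needed
```

Skipping sqrt also avoids floating-point entirely: squared distances of integer coordinates stay exact integers.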
Approach 1: Brute force, sort by squared distance
Sort all points; take the first k.
```python
def k_closest(points, k):
    return sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)[:k]  # L1: O(n log n)
```

Where the time goes, line by line
Variables: n = number of points, k = number of closest points to return.
| Line | Per-call cost | Times executed | Contribution |
|---|---|---|---|
| L1 (sort) | O(n log n) | 1 | O(n log n) ← dominates |
Complexity
- Time: O(n log n), driven by L1.
- Space: O(n).
Simple and often accepted.
Approach 2: Size-K max-heap
Keep a max-heap of size K. Each new point is pushed; if the heap exceeds K, pop the farthest.
```python
import heapq

def k_closest(points, k):
    # Max-heap via negated distance: heap[0] holds the farthest kept point
    heap = []
    for x, y in points:                          # L1: iterate n points
        d = -(x * x + y * y)
        if len(heap) < k:
            heapq.heappush(heap, (d, x, y))      # L2: O(log k) push
        elif d > heap[0][0]:
            heapq.heapreplace(heap, (d, x, y))   # L3: O(log k) replace farthest
    return [[x, y] for _, x, y in heap]
```

Where the time goes, line by line
Variables: n = number of points, k = number of closest points to return.
| Line | Per-call cost | Times executed | Contribution |
|---|---|---|---|
| L1 (loop) | O(1) | n | O(n) |
| L2 (heappush) | O(log k) | up to k | O(k log k) |
| L3 (heapreplace) | O(log k) | up to n - k | O(n log k) ← dominates |
For the first k points, each push costs O(log k). For the remaining n - k points, each potential replace also costs O(log k). Total: O(n log k).
Complexity
- Time: O(n log k), driven by L2/L3.
- Space: O(k).
Strictly better than Approach 1 when k ≪ n.
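In Python, this size-k heap strategy is also available off the shelf: `heapq.nsmallest` maintains a bounded heap internally and runs in the same O(n log k). A one-line alternative (the `k_closest_builtin` name is introduced here, not part of the original solution):

```python
import heapq

def k_closest_builtin(points, k):
    # nsmallest keeps an internal heap of size k, so no manual negation trick
    return heapq.nsmallest(k, points, key=lambda p: p[0] ** 2 + p[1] ** 2)
```

Unlike the hand-rolled version, `nsmallest` also returns the result sorted by distance, which can be convenient for debugging.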
Approach 3: Quickselect (optimal average)
Partition the array in place so the first k elements are the k closest (unordered). Average O(n).
```python
def k_closest(points, k):
    def dist(p):
        return p[0] ** 2 + p[1] ** 2

    def partition(lo, hi):                 # L1: O(hi - lo) per call
        pivot = dist(points[hi])
        store = lo
        for i in range(lo, hi):
            if dist(points[i]) <= pivot:
                points[store], points[i] = points[i], points[store]
                store += 1
        points[store], points[hi] = points[hi], points[store]
        return store

    def quickselect(lo, hi, k):
        if lo >= hi:                       # L2: base case
            return
        p = partition(lo, hi)              # L3: O(subarray size)
        if p == k:
            return
        if p < k:
            quickselect(p + 1, hi, k)      # L4: recurse right
        else:
            quickselect(lo, p - 1, k)      # L5: recurse left

    quickselect(0, len(points) - 1, k)
    return points[:k]
```

Where the time goes, line by line
Variables: n = number of points, k = number of closest points to return.
| Line | Per-call cost | Times executed | Contribution |
|---|---|---|---|
| L1-L3 (partition) | O(subarray size) | log n avg rounds | O(n) avg ← dominates |
| L4 or L5 (recurse) | O(1) per frame | log n avg | O(log n) avg |
On average each partition halves the search space: n + n/2 + n/4 + … = 2n = O(n). Worst case O(n²) when pivot is always the extreme (use random pivot to mitigate).
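The random-pivot mitigation can be sketched by swapping a uniformly random element into the pivot slot before the usual Lomuto pass. A standalone sketch (the `partition_random` name and explicit `points`/`dist` parameters are introduced here for illustration):

```python
import random

def partition_random(points, lo, hi, dist):
    # Swap a random element into the pivot position (hi), then run
    # the same Lomuto partition as in the main solution.
    r = random.randint(lo, hi)
    points[r], points[hi] = points[hi], points[r]
    pivot = dist(points[hi])
    store = lo
    for i in range(lo, hi):
        if dist(points[i]) <= pivot:
            points[store], points[i] = points[i], points[store]
            store += 1
    points[store], points[hi] = points[hi], points[store]
    return store
```

With a random pivot, the expected work is O(n) on any input, including adversarially ordered arrays that would drive the fixed-pivot version to O(n²).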
Complexity
- Time: O(n) average, O(n²) worst case (L1-L3 partition cost).
- Space: O(log n) recursion depth.
Pick quickselect when you’re allowed to mutate the input and want the tightest average complexity.
Test cases
```python
# Quick smoke tests; paste into a REPL or save as test_973.py and run.
# Uses the size-K max-heap approach (Approach 2).
import heapq

def k_closest(points, k):
    heap = []
    for x, y in points:
        d = -(x * x + y * y)
        if len(heap) < k:
            heapq.heappush(heap, (d, x, y))
        elif d > heap[0][0]:
            heapq.heapreplace(heap, (d, x, y))
    return [[x, y] for _, x, y in heap]

def _run_tests():
    # Example 1: k=1, closest is [-2,2] (dist=8 vs 10)
    result = k_closest([[1, 3], [-2, 2]], 1)
    assert result == [[-2, 2]], f"got {result}"

    # Example 2: k=2, answer is [[3,3],[-2,4]] (order doesn't matter)
    result = k_closest([[3, 3], [5, -1], [-2, 4]], 2)
    assert sorted(result) == sorted([[3, 3], [-2, 4]]), f"got {result}"

    # Single point, k=1
    assert k_closest([[0, 0]], 1) == [[0, 0]]

    # All equidistant: any k points valid
    result = k_closest([[1, 0], [-1, 0], [0, 1], [0, -1]], 2)
    assert len(result) == 2

    # k equals n: return all
    pts = [[1, 2], [3, 4], [0, 0]]
    result = k_closest(pts, 3)
    assert len(result) == 3

    print("all tests pass")

if __name__ == "__main__":
    _run_tests()
```

Summary
| Approach | Time | Space |
|---|---|---|
| Sort | O(n log n) | O(n) |
| Size-K max-heap | O(n log k) | O(k) |
| Quickselect | O(n) avg / O(n²) worst | O(log n) |
The heap version is the canonical interview answer. Quickselect is the "optimal-average" answer; know it for when the interviewer pushes on tighter bounds.
Related data structures
- Heaps / Priority Queues, size-K top/bottom heap template
- Arrays, in-place partitioning for quickselect