Binary search & binary search trees

Searching is one of computing’s oldest problems, and the trick to doing it well is almost always the same: exploit order. If data is arranged so that you can rule out large chunks with a single comparison, you can find things astonishingly fast. Binary search applies this idea to sorted arrays, and the binary search tree generalizes it into a living, mutable structure that stays ordered as data comes and goes.

Binary search: halving the haystack

Binary search works on a sorted array. To find a target, you compare it to the element in the middle. If the target is smaller, it can only be in the left half, so you throw the right half away — and vice versa. You repeat on the surviving half until you either find the target or run out of elements.

The power lies in what each comparison buys you. A naive linear search checks elements one at a time, needing up to n comparisons. Binary search eliminates half of the remaining candidates every step, so it needs only about log n comparisons. For an array of a billion elements, that is roughly 30 comparisons instead of a billion — the difference between instant and unusable.

The one catch: the data must be sorted by the key you are searching. Binary search cannot run on unordered data, because it has no way to know which half to discard.

From array to tree

Sorted arrays are fast to search but slow to update — inserting into the middle means shifting everything after it. The binary search tree (BST) keeps the logarithmic search speed while making insertion and deletion cheap.

A BST is a tree where each node holds a key and has up to two children. It obeys a single ordering invariant: for every node, all keys in its left subtree are smaller, and all keys in its right subtree are larger. This rule holds at every node, not just the root.

Searching mirrors binary search exactly. Start at the root and compare. Smaller? Go left. Larger? Go right. Equal? You found it. Each step descends one level, discarding an entire subtree, so a well-shaped tree is searched in O(log n).

The magic of in-order traversal

The ordering invariant gives the BST a delightful property. If you traverse it in-order — visit the left subtree, then the node itself, then the right subtree — you visit the keys in sorted ascending order, for free. This is why a BST can answer questions a hash table cannot: range queries, finding the next-largest key, or printing everything in order.

        8
       / \
      3   10
     / \    \
    1   6    14

An in-order walk of this tree yields 1, 3, 6, 8, 10, 14.

Balanced versus degenerate

Here is the crucial caveat. The O(log n) guarantee depends on the tree being balanced — its height staying close to log n because the two sides stay roughly equal in depth. If you insert keys in already-sorted order (1, then 2, then 3…), each new key hangs off the right of the previous one, and the tree collapses into a degenerate chain. That chain is just a linked list in disguise: its height is n, and search degrades all the way back to O(n).

This failure mode is common enough that it motivated an entire family of self-balancing trees — AVL trees and red-black trees being the best known. These structures detect imbalance during insertion and deletion and perform small local restructurings, called rotations, to keep the height near log n. The result is a tree that guarantees O(log n) operations regardless of insertion order.

Choosing a tree over a hash table

Hash tables also offer fast lookups — O(1) on average — so when should you prefer a balanced BST? The answer is whenever order matters. A BST keeps its keys sorted, which lets it efficiently support range queries (“give me every key between 10 and 50”), ordered iteration, and predecessor or successor lookups. A hash table scatters keys by their hash and can do none of these efficiently. The two structures are complementary tools: reach for the hash table when you only need point lookups, and the balanced tree when ordering is part of the problem.

Binary search: halving the haystack

From array to tree

The magic of in-order traversal

Balanced versus degenerate

Choosing a tree over a hash table

Sources