Searching, Sorting & Hashing

Binary Search

Halve a sorted list each step for fast searching — O(log n) instead of O(n). Iterative and recursive versions, trace tables, performance analysis, common bugs, and an interactive simulator showing how the range narrows with each comparison.

13.1 How Binary Search Works

A linear search has to look at every element one by one. For a list of a million items, that's potentially a million comparisons. Binary search finishes the same job in about twenty comparisons. The cost is that the list must already be sorted.

Suppose someone asks: "Is the number 23 in this sorted list?"

List:  [ 3, 7, 11, 15, 19, 23, 27, 31, 35 ]
Index:   0  1   2   3   4   5   6   7   8

# Step 1: mid = (0+8)//2 = 4 -> arr[4] = 19. 19 < 23, so discard left half.
#          low = 5, high = 8
# Step 2: mid = (5+8)//2 = 6 -> arr[6] = 27. 27 > 23, so discard right half.
#          low = 5, high = 5
# Step 3: mid = (5+5)//2 = 5 -> arr[5] = 23. 23 == 23 -> FOUND at index 5!
Key idea:
  • Binary search compares the middle element to the target.
  • If the middle is too small, discard the left half; if too big, discard the right half.
  • Repeat on the remaining half until found or the range is empty.
Your Turn — Describe the search in 4 steps [4 marks]
Describe, using four steps, how a binary search would find the value 11 in the sorted list [2, 5, 8, 11, 14, 17, 20]. Do not write pseudocode.
Hint:
  • Set Low = 0, High = 6
  • Mid = (0+6)//2 = 3, arr[3] = 11. 11 == 11, found!
  • Only takes 1 comparison — the target is at the middle
Your Turn — Find 14 in 4 steps [4 marks]
Describe, using four steps, how a binary search would find the value 14 in the sorted list [2, 5, 8, 11, 14, 17, 20]. Do not write pseudocode.
Hint:
  • First Mid = 3, arr[3] = 11. 11 < 14, go right
  • Low = 4, High = 6. Mid = 5, arr[5] = 17. 17 > 14, go left
  • Low = 4, High = 4. Mid = 4, arr[4] = 14. Match!

13.1B Interactive Simulator

Try binary search yourself. Change the sorted array and target, then click Start. Use Step for manual control or Play for auto-run. Watch how the range (low/high) narrows and elements get discarded (greyed out) with each comparison.

Interactive Simulator

Simulation

Target:23
3
0
7
1
11
2
15
3
19
4
23
5
27
6
31
7
35
8

Status: Click Start to begin.

Code

1low = 0
2high = len(data) - 1
3while low <= high:
4 mid = (low + high) // 2
5 if data[mid] == target:
6 return mid
7 elif data[mid] < target:
8 low = mid + 1
9 else:
10 high = mid - 1
11return -1
Speed:

13.2 Iterative Binary Search — Pseudocode & Python

Now we turn the description into code. The Cambridge pseudocode below is almost identical to the one in the published mark scheme for 9618/22 M/J 2025 Q7(a).

Cambridge pseudocode

FUNCTION BinarySearch(BYREF dataList : ARRAY OF INTEGER, target : INTEGER) RETURNS INTEGER
    DECLARE low, high, mid : INTEGER
    low <- 0
    high <- LENGTH(dataList) - 1

    WHILE low <= high
        mid <- (low + high) DIV 2
        IF dataList[mid] = target THEN
            RETURN mid
        ELSE
            IF dataList[mid] < target THEN
                low <- mid + 1
            ELSE
                high <- mid - 1
            ENDIF
        ENDIF
    ENDWHILE
    RETURN -1
ENDFUNCTION

Python

def binary_search(data_list, target):
    low = 0
    high = len(data_list) - 1

    while low <= high:
        mid = (low + high) // 2
        if data_list[mid] == target:
            return mid
        elif data_list[mid] < target:
            low = mid + 1
        else:
            high = mid - 1

    return -1  # not found
Key rules:
  • // is Python's integer division (same as DIV in pseudocode) — needed for a valid array index.
  • Use mid + 1 and mid - 1 (not just mid) to avoid infinite loops — mid has already been checked.
  • The loop condition is low <= high (not <) — when low == high, one element remains to check.
Your Turn — contains — return True/False
Write a Python function contains(numbers, n) that returns True if n is in the sorted list numbers, or False if not. Use binary search.
Hint:
  • Same structure as binary_search but return True/False
  • Return True when numbers[mid] == n
  • Return False after the loop (not found)

13.3 Tracing a Binary Search

A trace table tracks low, high, mid, and the comparison result for each iteration. This is what examiners ask you to fill in on Paper 4.

Iterationlowhighmidarr[mid]ComparisonAction
10631111 == 11Found! Return 3
Search complete
Exam tip:
  • In a trace table question, each row shows one iteration of the while loop.
  • Fill in low, high, mid (calculated as (low+high)//2), arr[mid], the comparison result, and the action taken (go left, go right, or found).
Your Turn — Trace table for target 25
Trace the binary search on B = [5, 10, 15, 20, 25, 30, 35, 40] looking for target 25. Fill in each row.
Hint:
  • Low=0, High=7. Mid=3, arr[3]=20. 20 < 25, go right
  • Low=4, High=7. Mid=5, arr[5]=30. 30 > 25, go left
  • Low=4, High=4. Mid=4, arr[4]=25. 25 == 25, found!
Your Turn — Trace for target not in list
Trace the binary search on B = [5, 10, 15, 20, 25, 30, 35, 40] looking for target 18 (which is not in the list).
Hint:
  • Low=0, High=7. Mid=3, arr[3]=20. 20 > 18, go left
  • Low=0, High=2. Mid=1, arr[1]=10. 10 < 18, go right
  • Low=2, High=2. Mid=2, arr[2]=15. 15 < 18, go right
  • Low=3, High=2. Low > High, loop ends. Return -1.

13.4 Recursive Binary Search

Recursion replaces the WHILE loop with the function calling itself, each time on a smaller range. Instead of moving Low and High with assignments, we pass new values as parameters.

Cambridge pseudocode (recursive)

FUNCTION RecursiveBinarySearch(
    BYREF dataList : ARRAY OF INTEGER,
    low : INTEGER, high : INTEGER,
    target : INTEGER) RETURNS INTEGER

    DECLARE mid : INTEGER

    IF low > high THEN
        RETURN -1  // base case: not found
    ENDIF

    mid <- (low + high) DIV 2

    IF dataList[mid] = target THEN
        RETURN mid  // base case: found
    ELSE
        IF dataList[mid] < target THEN
            RETURN RecursiveBinarySearch(
                dataList, mid + 1, high, target)
        ELSE
            RETURN RecursiveBinarySearch(
                dataList, low, mid - 1, target)
        ENDIF
    ENDIF
ENDFUNCTION

Python (recursive)

def recursive_binary_search(
        data_list, low, high, target):

    if low > high:  # base case: empty
        return -1

    mid = (low + high) // 2

    if data_list[mid] == target:
        return mid  # base case: found
    elif data_list[mid] < target:
        return recursive_binary_search(
            data_list, mid + 1, high, target)
    else:
        return recursive_binary_search(
            data_list, low, mid - 1, target)
Two base cases:
  • Found: data[mid] == target → return mid
  • Not found: low > high → return -1 (empty range)
Your Turn — Missing base case [3 marks]
A student writes a recursive binary search but forgets the IF low > high THEN RETURN -1 check. Describe what happens when the function is called on a list that does not contain the target.
Hint:
  • Without the base case, the function keeps recursing
  • The call stack fills up
  • Program crashes with stack overflow / RecursionError

13.5 Performance and Conditions

Each iteration of binary search halves the remaining range:

after 1 step: n / 2 elements left
after 2 steps: n / 4 elements left
after 3 steps: n / 8 elements left
after k steps: n / 2^k elements left

The loop ends when n / 2^k <= 1, so k ≈ log2(n).
List size nLinear search (worst)Binary search (worst)
10104
1,0001,00010
1,000,0001,000,00020
1,000,000,0001,000,000,00030
Conditions for binary search:
  • 1. The data must be sorted. Without sorting, the halving logic does not work.
  • 2. The data structure must support direct access by index. Arrays/lists work; linked lists do not (you cannot jump to the middle).
Your Turn — Doubling the list [3 marks]
A sorted array of 32 items is doubled to 64, then doubled again to 128. How many extra worst-case comparisons does binary search need at each doubling? Justify briefly.
Hint:
  • log2(32) = 5
  • log2(64) = 6
  • log2(128) = 7
  • Each doubling adds exactly 1 comparison

13.6 Debugging a Buggy Version

Study the pseudocode below. It is supposed to perform a binary search, but it contains four errors. Find each one, give the line number, explain what is wrong, and write the correction.

01  FUNCTION BinarySearch(BYREF B : ARRAY OF INTEGER, target : INTEGER) RETURNS INTEGER
02      DECLARE low, high, mid : INTEGER
03      low <- 0
04      high <- LENGTH(B)                    // ERROR 1
05      WHILE low < high                     // ERROR 2
06          mid <- (low + high) DIV 2
07          IF B[mid] = target THEN
08              RETURN mid
09          ELSE
10              IF B[mid] < target THEN
11                  low <- mid               // ERROR 3
12              ELSE
13                  high <- mid              // ERROR 4
14              ENDIF
15          ENDIF
16      ENDWHILE
17      RETURN -1
18  ENDFUNCTION
Your Turn — Find the 4 errors [4 marks]
Find all four errors in the buggy binary search pseudocode. For each: give the line number, explain what is wrong, and write the correction.
Hint:
  • Line 4: high should be LENGTH(B) - 1 (last valid index)
  • Line 5: should be WHILE low <= high (not <)
  • Line 11: should be low = mid + 1 (not mid, to avoid infinite loop)
  • Line 13: should be high = mid - 1 (not mid, to avoid infinite loop)
Your Turn — / vs DIV [3 marks]
A student replaces line 6 with mid <- (low + high) / 2 (using / instead of DIV). Explain what would go wrong, and state the correction.
Hint:
  • / produces a REAL (float), e.g. 3.0
  • Array indices must be INTEGER
  • B[3.0] would cause a type error / crash

13.7 Full Exam-Style Question

A school's library system stores 10,000 book records in a sorted 1D array Catalogue indexed from 0 to 9999. Each element holds a unique 5-digit ISBN (integer). The system uses binary search to find a book by its ISBN.

(a) — Two conditions [2 marks] State the two conditions that must hold for binary search to be used on Catalogue.
Your Turn — (a) Model answer
State the two conditions for binary search. [2 marks]
Hint:
  • The data must be sorted
  • The data structure must support direct/random access by index
(b) — FindBook function [6 marks] Write pseudocode for a function FindBook(target : INTEGER) RETURNS INTEGER that performs an iterative binary search on the global array Catalogue and returns the index where target is found, or -1 if not present.
Your Turn — (b) Model answer
Write pseudocode for FindBook(target) that performs binary search on global Catalogue. [6 marks]
Hint:
  • low = 0, high = LENGTH(Catalogue) - 1
  • WHILE low <= high: mid = (low+high) DIV 2
  • Compare Catalogue[mid] to target
  • low = mid+1 or high = mid-1
  • Return mid if found, -1 after loop
(c) — Max comparisons [2 marks] State the maximum number of comparisons needed in the worst case. Justify your answer.
Your Turn — (c) Model answer
State the maximum comparisons for 10,000 items. Justify. [2 marks]
Hint:
  • log2(10000) ≈ 13.29
  • So at most 14 comparisons
  • Binary search is O(log n)
(d) — Binary vs linear [2 marks] Give one reason why binary search is preferable for this catalogue, and one situation in which linear search would still be the right choice.
Your Turn — (d) Model answer
Give one reason binary search is preferable, and one situation where linear search is better. [2 marks]
Hint:
  • Binary: much faster for large sorted lists (O(log n) vs O(n))
  • Linear: better for unsorted data or very small lists

Key Points Summary

Binary search requires SORTED data and direct index access.
It compares the MIDDLE element to the target and discards half the list each step.
mid = (low + high) // 2 — use integer division for a valid index.
If arr[mid] < target: low = mid + 1 (search right half).
If arr[mid] > target: high = mid - 1 (search left half).
Use mid + 1 and mid - 1 (NOT mid) to avoid infinite loops.
Loop condition: while low <= high (not <, to check the last element).
Return mid if found; return -1 if the loop ends (not found).
Recursive version has two base cases: found at mid, or low > high (return -1).
Without the base case, recursive binary search crashes with stack overflow.
Big O: O(log n) — each step halves the range.
Doubling the list adds exactly 1 extra comparison (property of O(log n)).
For 1,000,000 items: ~20 comparisons. For 10,000,000: ~24.
Four common bugs: high=LENGTH (should be -1), low<high (should be <=), low=mid (should be +1), high=mid (should be -1).

13.8 Practice Tasks

Fifteen exam-style binary search tasks. Each shows only the question — click Hint for the thought process, or Help for the worked solution.

Exam tip:
  • Correct initialisation (low=0, high=len-1) and loop condition (low <= high).
  • Correct mid calculation with integer division: (low+high)//2.
  • Correct comparison and range update: mid+1 / mid-1.
  • Return mid if found, -1 if not found.
  • The most common mark loss is using mid instead of mid+1/mid-1.
1Practice Task — Iterative binary search [6 marks]
Write a Python function binary_search(data, target) that returns the index of target in a sorted list, or -1 if not found.
2Practice Task — contains — True/False [4 marks]
Write a function contains(numbers, n) that returns True if n is in the sorted list, False if not. Use binary search.
3Practice Task — Recursive binary search [6 marks]
Write a Python recursive function recursive_binary_search(data, low, high, target) that returns the index or -1.
4Practice Task — Describe in 4 steps [4 marks]
Describe, using four steps, how binary search finds 23 in [3, 7, 11, 15, 19, 23, 27, 31, 35]. Do not write code.
5Practice Task — Trace table [6 marks]
Trace binary search on B = [2, 4, 6, 8, 10, 12, 14] for target 10. Show low, high, mid, arr[mid] for each iteration.
6Practice Task — Trace — not found [6 marks]
Trace binary search on B = [5, 10, 15, 20, 25, 30, 35, 40] for target 18 (not in list).
7Practice Task — Big O [2 marks]
What is the Big O of binary search, and why?
8Practice Task — Conditions for binary search [2 marks]
State the two conditions that must hold for binary search to be used.
9Practice Task — Doubling [3 marks]
A sorted array of 32 items is doubled to 64, then to 128. How many extra worst-case comparisons at each doubling? Justify.
10Practice Task — Find 4 bugs [4 marks]
The buggy pseudocode has 4 errors. Line 4: high=LENGTH(B). Line 5: WHILE low<high. Line 11: low=mid. Line 13: high=mid. Give the correction for each.
11Practice Task — / vs // [3 marks]
A student uses / instead of // for mid. Explain what goes wrong and state the correction.
12Practice Task — Missing base case [3 marks]
A recursive binary search forgets the IF low > high THEN RETURN -1 check. What happens when the target is not in the list?
13Practice Task — Pseudocode FindBook [6 marks]
Write pseudocode for FindBook(target) that performs binary search on global array Catalogue. Returns index or -1.
14Practice Task — Max comparisons [2 marks]
For 10,000 sorted items, what is the maximum number of comparisons in the worst case? Justify.
15Practice Task — Binary vs linear [2 marks]
Give one reason binary search is preferable for a large sorted catalogue, and one situation where linear search is better.

Question Bank

Answer all questions, then press Submit Quiz to see your score.

0/12 answered

Question 1Multiple Choice

What is the key requirement for binary search?

Question 2True / False

Binary search on 1,000,000 items needs at most ~20 comparisons.

Question 3Multiple Choice

What is the mid formula?

Question 4Multiple Choice

What is the loop condition?

Question 5True / False

When arr[mid] < target, set low = mid + 1 to search the right half.

Question 6Multiple Choice

What is the Big O of binary search?

Question 7Multiple Choice

What are the two conditions for binary search?

Question 8True / False

Using low = mid (instead of mid + 1) can cause an infinite loop.

Question 9Multiple Choice

What does recursive binary search return when not found?

Question 10Multiple Choice

Doubling the list from 32 to 64 adds how many extra comparisons?

Question 11True / False

Binary search is always better than linear search.

Question 12Multiple Choice

What is wrong with high = LENGTH(B) instead of LENGTH(B) - 1?

Answer all 12 questions to enable submission.