Abstract Data Types Using OOP

ADT Dictionary

Key–value storage for quick lookups and flexible data — a dictionary (associative array / map) stores pairs where each unique key maps to one value, and you fetch by key, not by position. Covers the built-in Python dictionary, wrapping it as a clean ADT (add/get/remove/contains), building one from two parallel arrays (linear search), how a hash function turns a key into an index (with collisions), and when a dictionary beats a list.

30.1 Introduction: The Dictionary ADT

Every structure you have met so far is reached by a number: arrays by index, stacks by position, trees by following pointers. A dictionary drops the number. Instead, each item is stored under a key — a label you choose — and you fetch the item by handing back that same key. The thing stored is the value. Together they form a key–value pair.

Think of student records: you rarely want “the 7th student”, you want “the student whose ID is S1042”. The ID is the key; the student's record is the value. Because keys are unique, looking one up returns exactly one value — and with hashing behind the scenes, it returns it almost instantly, no matter how many students are stored.

By the end of this page you will use the built-in dictionary, wrap it as a tidy ADT, build one yourself from plain arrays, and explain how a hash function turns a key into a storage position.

What this page covers
  • A dictionary is also called an associative array or a map.
  • The efficiency theory (why a good hash gives near-constant look-up) connects to the searching chapter.
TermMeaning
keyThe unique label used to store and find a value. No two pairs may share a key.
valueThe data stored against a key. Values may repeat; keys may not.
key–value pairOne entry in the dictionary: a key together with the value it maps to.
associative arrayAnother name for a dictionary — you associate each value with a key rather than an index.
look-upRetrieving the value for a given key. The core operation a dictionary is built to do well.
hash functionA calculation that turns a key into an array index, so the value can be stored and found directly.
hash tableThe underlying array that a hashed dictionary stores its pairs in.
collisionWhen two different keys hash to the same index. The dictionary must have a plan to handle this.

30.2 Keys, Values & Associative Arrays

Why are we doing this?
  • Before any code, you must be able to read a dictionary diagram and say which value a key maps to.
  • Cambridge questions frequently show a small set of pairs and ask you to state a value, or to explain why a key must be unique.
  • Get the picture right and the code is straightforward.

Below, Musarrat stores the marks of her revision group. Each name is a key; each mark is the value it maps to. To read it you never count positions — you simply follow the arrow from a key to its value.

A dictionary maps each unique key (amber) to one value (green)

"Aymaan"
"Mehrin"
"Nazifa"
"Naib"
72
85
64
85

Notice "Mehrin" and "Naib" both map to 85 — values may repeat, keys may not.

Key rule
  • A key is unique and maps to exactly one value.
  • Storing a pair whose key already exists does not add a second entry — it overwrites the old value.
  • Values, by contrast, may appear any number of times.
Exam tip: If a question asks you to “justify using a dictionary” rather than a list, the marking point is almost always direct look-up by a meaningful key — you do not have to search through every element to find the one you want.

30.3 The Built-in Python Dictionary

Why are we doing this?
  • Python gives you a dictionary built in, written with curly braces.
  • Knowing the five everyday operations — add, look up, update, delete and test membership — means you can answer most “use a dictionary” questions immediately, and it is the base you will wrap into an ADT next.
marks = {"Aymaan": 72, "Mehrin": 85}   # create with two pairs
marks["Nazifa"] = 64                    # ADD a new pair
print(marks["Mehrin"])                  # LOOK UP a value -> 85
marks["Aymaan"] = 90                    # UPDATE (key exists -> overwrite)
del marks["Nazifa"]                     # DELETE a pair
if "Mehrin" in marks:                   # MEMBERSHIP test -> True
    print("found")

Notice that add and update use exactly the same line, marks[key] = value. Python decides which it is by whether the key already exists. That is the rule from Section 2 in action: a repeated key overwrites rather than duplicating.

Task — Worked Example — Trace dictionary operations
Starting from {"Aymaan":72, "Mehrin":85}, trace: d["Naib"]=64; d["Mehrin"]=90; del d["Aymaan"].
Hint:
  • Line 1: add Naib:64 (new key). Line 2: key exists → update Mehrin to 90. Line 3: remove Aymaan.
Your Turn — Your Turn — Trace dictionary operations
Starting from {"Tarnima":50, "Mohar":50}, trace: d["Adri"]=70; d["Mohar"]=60; then look up d["Tarnima"].
Hint:
  • The second line updates, it does not add.
Key rule
  • Looking up a key that is not present raises an error.
  • Guard a look-up with a membership test (if key in d) or use d.get(key), which returns None instead of crashing.

30.4 Wrapping it as a Dictionary ADT

Why are we doing this?
  • Wrapping the built-in dictionary in a class gives it clear, named operations — add, get, remove, contains — and hides the raw {} behind them.
  • This is the “encapsulation” point: code outside the class never touches the data directly, it asks through methods.
class Dictionary:
    def __init__(self):
        self.items = {}                 # the private store

    def add(self, key, value):
        self.items[key] = value         # add or overwrite

    def get(self, key):
        if key in self.items:
            return self.items[key]      # found
        return None                     # not present -> safe answer

    def remove(self, key):
        if key in self.items:
            del self.items[key]

    def contains(self, key):
        return key in self.items        # True / False

Each method is one obvious job. get guards the look-up so it returns None instead of crashing — the safer behaviour. Because everything goes through these four methods, you could later swap the inner {} for your own hash table without changing any code that uses the dictionary.

Key rule — one ADT from another
  • This Dictionary is built on top of Python's built-in dictionary type.
  • That is a legitimate “implement an ADT from a built-in type” answer — but examiners often want the deeper version: building it from arrays (Section 5) or a hash table (Section 6).

30.5 A Dictionary from Two Parallel Arrays

Why are we doing this?
  • Mark scheme: Cambridge frequently asks you to implement one ADT “from another ADT or from built-in types”.
  • For a dictionary, the standard answer is two parallel arrays — one for keys, one for values — where position i in each lines up.
  • Look-up then becomes a linear search of the keys array.

The idea: Keys[i] and Values[i] belong together. To find a value you search Keys for the target; its index then points to the matching value.

IndexKeysValues
0"Aymaan"72
1"Mehrin"85
2"Nazifa"64

Search Keys for “Mehrin” → index 1 → Values[1] = 85.

def lookup(self, target):
    for i in range(len(self.keys)):    # linear search
        if self.keys[i] == target:
            return self.values[i]      # lined-up value
    return None                        # key not found

This works, but every look-up may scan the whole keys array — slow once there are thousands of pairs. That weakness is exactly the problem hashing solves in the next section.

Task — Worked Example — Look up 'Nazifa'
Keys = ["Aymaan","Mehrin","Nazifa"], Values = [72,85,64]. Look up "Nazifa".
Hint:
  • Linear search Keys; return Values at the same index.
Your Turn — Your Turn — Look up 'Naib'
Using the same arrays, look up "Naib". What does lookup return, and how many comparisons happen?
Hint:
  • The loop only stops early on a match.

30.6 Hashing — Turning a Key into an Index

Why are we doing this?
  • Linear search is slow.
  • A hash function calculates the storage index directly from the key, so look-up jumps straight to the right slot instead of scanning.
  • This is the idea that makes a real dictionary fast, and the syllabus asks you to “show understanding of how a hashing algorithm is used to locate an item”.

A hash function takes a key and returns an array index. A simple one for numeric IDs is index = key MOD tableSize. Below, IDs are stored in a table of size 7, so ID 52 lands at 52 MOD 7 = 3.

Hash function key MOD 7 sends ID 52 straight to slot 3 — no searching needed

52
→ 52 MOD 7 = 3 →
0
1
2
52
4
5
6
def hash_index(self, key):
    return key % self.size            # key MOD tableSize

def add(self, key, value):
    i = self.hash_index(key)
    self.table[i] = (key, value)      # store the pair at its slot

The catch: two different keys can hash to the same slot — a collision. For example 52 MOD 7 = 3 and 59 MOD 7 = 3 both want slot 3. A dictionary must handle this, commonly by open addressing (try the next free slot) or chaining (each slot holds a small list of pairs).

Task — Worked Example — Hash ID 40
Table size 7, hash = key MOD 7. Where does ID 40 go?
Hint:
  • Compute 40 MOD 7 (7×5 = 35, remainder ?).
Your Turn — Your Turn — Hash ID 59 (collision)
Same table. Compute the slot for ID 59, then say which earlier ID it collides with.
Hint:
  • 7 × 8 = 56, so 59 MOD 7 = ?
Key rule
  • A good hash function spreads keys evenly so collisions are rare; then look-up is close to instant.
  • A bad one piles many keys into the same slot, and performance collapses back toward a linear search.
Exam tip:
  • If asked what happens “if the hash returns a slot already in use”, the two accepted answers are store in the next free slot (open addressing) or store in a list at that slot (chaining).
  • Naming either, plus the word “collision”, secures the marks.

30.7 Dictionary vs List — When to Use Which

Why are we doing this?
  • “Evaluate” and “justify” questions reward a clear comparison.
  • The deciding question is simple: do you reach your data by a meaningful label or by position / order?
  • Knowing which structure fits which job is a frequent 3-mark discussion point.
Use a dictionary when…Use a list/array when…
You look items up by a unique key (ID, name, code).Order or position matters, or you process items in sequence.
Look-up speed matters more than ordering.You need the data sorted, or to keep duplicates.
Each key maps to exactly one value.The same value may legitimately appear many times in order.
Key rule — one ADT from another
  • A dictionary can be implemented from built-in types (Python's {}), from parallel arrays with linear search, or from a hash table.
  • The hash table is what makes look-up fast and is the answer examiners most want to see.

30.8 Full Exam-Style Question

Paper 4 · ADT dictionary + hashing · ~12 marks
  • moshikur.com stores student records in a dictionary, using each student's unique numeric ID as the key and their record as the value.
  • The dictionary is implemented using a hash table of size 11 with the hash function index = ID MOD 11.

(a) [2] State two features of a dictionary ADT. (AO1)

(b) [2] Explain why the ID, rather than the student's name, is a good choice of key. (AO2)

(c) [3] The IDs 23, 45, 34 are added. Calculate the slot each is stored in and state whether a collision occurs. (AO2)

(d) [3] Complete the pseudocode for Lookup(ID) that returns the record, or a null pointer if the slot is empty, assuming no collisions. (AO2)

FUNCTION Lookup(ID) RETURNS Record
    Index <- ....................
    IF HashTable[Index] = .................... THEN
        RETURN NullRecord
    ELSE
        RETURN ....................
    ENDIF
ENDFUNCTION

(e) [2] Explain what a collision is and state one way the dictionary could handle it. (AO3)

Lab Task — Show mark scheme
Reveal the mark scheme for parts (a)–(e).
Hint:
  • (a) 2 from: key–value pairs; unique keys; retrieved by key not position; unordered.
  • (b) ID is unique (name could be shared); numeric so it hashes directly.
  • (c) 23 MOD 11 = 1; 45 MOD 11 = 1 (collides with 23); 34 MOD 11 = 1 (also collides — all three in slot 1).
  • (d) Index <- ID MOD 11; IF HashTable[Index] = NullRecord; RETURN HashTable[Index].
  • (e) Collision = two keys hash to same index; handle by open addressing (next free slot) or chaining (list at slot).

Key Points Summary

A dictionary stores key–value pairs. Each unique key maps to exactly one value. You fetch items by their key (a label), not by a position.
A key is unique — storing a pair whose key already exists overwrites the old value. Values may repeat; keys may not.
A dictionary is also called an associative array or a map — you associate each value with a key rather than an index.
The five built-in operations: d[key]=value (add/update), d[key] (look up), del d[key] (delete), key in d (membership), d.get(key) (safe look-up).
Looking up a missing key with d[key] raises an error. Guard with if key in d, or use d.get(key) which returns None instead of crashing.
Add and update use the same line: d[key] = value. Python decides which by whether the key already exists (a repeated key overwrites).
Wrapping the built-in {} in a class (add/get/remove/contains) is encapsulation — code outside never touches the data directly, it asks through methods.
The Dictionary ADT class: add(key,value) stores; get(key) returns the value or None; remove(key) deletes; contains(key) returns True/False.
A dictionary can be implemented from two parallel arrays: Keys[] and Values[]. Look-up is a linear search of Keys; return Values at the same index.
The two-array model is slow — every look-up may scan the whole keys array. A missing key always costs a full scan (worst case for linear search).
A hash function turns a key into an array index directly: index = key MOD tableSize. Look-up jumps straight to the right slot — no scanning.
A collision is when two different keys hash to the same index. E.g. 52 MOD 7 = 3 and 59 MOD 7 = 3 both want slot 3.
Two ways to handle a collision: open addressing (try the next free slot) or chaining (each slot holds a small list of pairs).
A good hash function spreads keys evenly so collisions are rare; then look-up is close to instant. A bad one collapses performance toward linear search.
Use a dictionary when you look items up by a unique key and look-up speed matters. Use a list when order/position matters or you need the data sorted.
A dictionary can be implemented from built-in types ({}), parallel arrays (linear search), or a hash table. The hash table is what makes look-up fast.

30.9 Practice Tasks

Fifteen exam-style tasks. Click Hint for bullet-point guidance, then Help to reveal a worked Python/pseudocode solution.

1Practice Task — Describe a dictionary [2 marks]
State two features of a dictionary ADT.
2Practice Task — Built-in dictionary operations [5 marks]
Given d = {"Aymaan":72, "Mehrin":85}, write lines to: add "Nazifa":64, look up "Mehrin", update "Aymaan" to 90, delete "Nazifa", test if "Mehrin" is in d.
3Practice Task — Trace dictionary operations [3 marks]
Starting from {"Tarnima":50, "Mohar":50}, trace: d["Adri"]=70; d["Mohar"]=60; d["Tarnima"]. What is the final state and what does d["Tarnima"] return?
4Practice Task — Dictionary ADT class [6 marks]
Write a Dictionary class with __init__ (items={}), add(key,value), get(key) returning None if absent, remove(key), and contains(key) returning True/False.
5Practice Task — Two-array look-up [4 marks]
Write a lookup(self, target) method for a dictionary built from self.keys and self.values arrays. Linear search keys; return the value at the same index, or None.
6Practice Task — Trace two-array look-up [2 marks]
Keys = ["Aymaan","Mehrin","Nazifa"], Values = [72,85,64]. Look up "Nazifa". What is returned, and how many comparisons?
7Practice Task — Missing key in two-array model [2 marks]
Same arrays. Look up "Naib". What does lookup return, and how many comparisons?
8Practice Task — Hash function [3 marks]
Write a hash_index(self, key) method that returns key MOD self.size, and an add(self, key, value) that stores the pair at the hashed slot.
9Practice Task — Compute hash index [2 marks]
Table size 7, hash = key MOD 7. Where does ID 40 go?
10Practice Task — Hash collision [3 marks]
Table size 7, hash = key MOD 7. ID 52 is at slot 3. Compute the slot for ID 59 and say which earlier ID it collides with.
11Practice Task — What is a collision? [2 marks]
Explain what a collision is and state one way the dictionary could handle it.
12Practice Task — Dictionary vs list [2 marks]
State when you should use a dictionary instead of a list.
13Practice Task — Why ID not name as key? [2 marks]
Explain why a student's unique ID, rather than their name, is a good choice of key.
14Practice Task — Pseudocode Lookup [3 marks]
Write Cambridge pseudocode for FUNCTION Lookup(ID) RETURNS Record using a hash table (index = ID MOD 11), returning NullRecord if the slot is empty.
15Practice Task — Full exam-style: student hash table [12 marks]
Student records in a hash table (size 11, index = ID MOD 11). (a) State 2 features of a dictionary [2]. (b) Why ID not name as key [2]. (c) IDs 23,45,34 added — calculate slots + collisions [3]. (d) Complete Lookup(ID) pseudocode [3]. (e) What is a collision + one way to handle it [2].

Question Bank

Answer all questions, then press Submit Quiz to see your score.

0/12 answered

Question 1Multiple Choice

A dictionary stores:

Question 2Multiple Choice

What happens if you store a pair whose key already exists?

Question 3True / False

Keys in a dictionary must be unique.

Question 4Multiple Choice

How do you add/update a pair in a Python dictionary?

Question 5Multiple Choice

What does d.get(key) return if the key is not present?

Question 6Multiple Choice

In the two-array model, how does look-up work?

Question 7Multiple Choice

What does a hash function do?

Question 8True / False

A collision is when two different keys hash to the same index.

Question 9Multiple Choice

Table size 7, hash = key MOD 7. Where does ID 40 go?

Question 10Multiple Choice

What are the two common ways to handle a collision?

Question 11Multiple Choice

When should you use a dictionary instead of a list?

Question 12Multiple Choice

A dictionary can be implemented from:

Answer all 12 questions to enable submission.