Trie (Prefix Tree)
A Trie (pronounced "try", from the middle letters of "retrieval") is a tree data structure designed to efficiently store and search strings. It is also called a prefix tree because its structure naturally groups words by their shared prefixes. Unlike binary search trees, which store complete values in each node, a trie encodes strings as paths: each node represents a single character, and following a path from root to a marked node spells out a complete word.
Why Use a Trie?#
Imagine you're building an autocomplete feature for a search engine. When a user types "app", you need to quickly find all words starting with "app" (apple, application, apply, etc.). A trie makes this incredibly efficient:
| Operation | Trie | Hash Table | Binary Search Tree |
|---|---|---|---|
| Search word | O(m) | O(m) | O(m log n) |
| Insert word | O(m) | O(m) | O(m log n) |
| Find all words with prefix | O(p + k) | O(n) | O(n) |
| Spell check suggestions | O(p + k) | O(n) | O(n) |
Where m = word length, n = total number of words in the data structure, p = prefix length, k = number of matching words.
The key advantage is prefix-aware search. Hash tables can look up an exact word in O(m), but finding every word that shares a given prefix requires scanning all stored entries — O(n). A trie navigates directly to the prefix endpoint in O(p) time and then only visits the matching subtree.
Trie Structure#
Each character of a stored word occupies one level in the trie. Words that share a common prefix follow the same path from the root, which is why tries are space-efficient for related word sets such as a dictionary or a domain-specific vocabulary.
Green nodes mark the end of a valid word; uncolored nodes are intermediate characters that form part of a prefix but are not words on their own. The path root → a → p → p spells "app", and root → c → a → t spells "cat". Notice how "apple" and "apply" share the path a → p → p → l before diverging at their final character — this shared structure is what makes a trie memory-efficient for similar words.
Trie Node
Each node contains a map of children (character to child node) and a boolean flag marking whether a complete word ends here. The flag is essential: without it, there is no way to distinguish a stored word from a prefix of a longer word that happens to follow the same path.
Insert Operation#
To insert a word, traverse the trie character by character. At each step, if the required child node does not yet exist, create it. After processing all characters, mark the final node as end-of-word. Crucially, existing nodes along shared prefixes are never duplicated — inserting "apple" after "app" reuses the root → a → p → p path and only adds new nodes for l and e.
In the step diagrams above, yellow highlights the newly created end-of-word node — the last character of the word just inserted. Green marks end-of-word nodes from previous insertions. Intermediate new nodes are shown without color.
Trie Insert
Insert a word into the trie by creating nodes for each character. m = length of the word being inserted.
Search Operation#
To search for a word, traverse the trie following each character in sequence. If at any point the required character is absent from the current node's children, the word is not in the trie and we return false immediately. Only if every character is traversed successfully and the final node is marked as end-of-word do we confirm the word exists.
"appl" is not found because the node 'l' is not marked as end-of-word. The path exists — it is part of the word "apple" — but "appl" itself was never inserted. This illustrates the essential role of the end-of-word flag: a trie stores paths, and the flag is the only thing that distinguishes a complete word from an unfinished prefix along those paths.
Trie Search
Search for an exact word match in the trie. m = length of the word being searched.
Prefix Search#
Check if any word in the trie begins with a given prefix. The traversal is identical to search, with one difference: we skip the end-of-word check at the end. If we can successfully traverse every character of the prefix, the path exists — and any path in a trie represents at least one stored word — so we return true immediately.
Prefix Search
Check if any word in the trie starts with the given prefix. p = length of the prefix.
Get All Words with Prefix#
A powerful trie operation is finding every word that begins with a given prefix — the foundation of autocomplete. The algorithm works in two phases: first, navigate to the end of the prefix (identical to starts_with); then, run a depth-first search (DFS) from that node to collect every word in the subtree below it.
In the diagram, yellow nodes trace the prefix path (a → p); green highlights the entire subtree visited by the DFS starting from the prefix endpoint. Every node marked ✓ within that subtree contributes a word to the result.
Result: ["app", "apple", "apply"]
Autocomplete - Get Words with Prefix
Find all words in the trie that start with a given prefix. p = prefix length, k = total characters across all matching words, m = maximum word length (recursion depth).
Delete Operation#
Deleting from a trie is more complex than insertion because removing a word must not disturb other words that share its prefix. There are two goals:
- Unmark the end-of-word flag at the deleted word's final node
- Remove nodes that are no longer needed — those with no remaining children that are not the endpoint of any other word
Node 'e' is removed because it has no children once "apple" is deleted, making it unreachable from any remaining word. Node 'l' is kept because it still has 'y' as a child, which is required by the word "apply". The algorithm recurses up the path after deletion, pruning each node that is now both childless and not the endpoint of another word — preventing memory leaks from dangling, unreachable nodes.
Trie Delete
Delete a word from the trie, removing unnecessary nodes. m = length of the word being deleted.
Count Words with Prefix#
A useful operation is counting how many words share a given prefix — without the overhead of collecting those words. This is more efficient than get_words_with_prefix when you need only the count, such as displaying "23 results" in a search interface before the user requests the full list.
Count Words with Prefix
Count the number of words that start with a given prefix. p = prefix length, n = number of nodes in the subtree, h = subtree height (recursion depth).
Complete Implementation#
Complete Trie Class
Full implementation with all common operations.
Applications#
| Application | How Trie Helps | Example |
|---|---|---|
| Autocomplete | Fast prefix matching for suggestions | Search engines, IDEs, messaging apps |
| Spell Checker | Find similar words, suggest corrections | Word processors, browsers |
| IP Routing | Longest prefix matching for routing tables | Network routers |
| Word Games | Validate words, find possible moves | Scrabble, Wordle solvers |
| DNA Sequencing | Pattern matching in genetic data | Bioinformatics |
| Phone Directory | Contact search by name prefix | Mobile phone apps |
| Text Prediction | Predict next word based on prefix | Keyboard apps |
| Command History | Search previous commands by prefix | Terminal shells |
Autocomplete System Example#
Autocomplete with Frequency
Trie with word frequency tracking for ranked suggestions.
Complexity Analysis#
| Operation | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Insert | O(m) | O(m) | m = word length |
| Search | O(m) | O(1) | No extra space needed |
| Delete | O(m) | O(m) | Recursion stack |
| Prefix search | O(p) | O(1) | p = prefix length |
| Get all with prefix | O(p + k) | O(m) | k = total chars in results |
| Count with prefix | O(p + n) | O(h) | n = nodes in subtree |
Time complexity is dominated by the word or prefix length — tries never examine unrelated branches. The O(m) space for Delete and the O(h) space for Count reflect the depth of the recursion call stack, not any additional data structures.
Space Complexity of Trie#
| Factor | Impact | Optimization |
|---|---|---|
| Alphabet size | More children per node = more space | Use hash map instead of array |
| Number of words | More words = more nodes | Use compressed trie |
| Common prefixes | Shared prefixes save space | Tries excel when words share prefixes |
| Word length | Longer words = deeper tree | Consider suffix compression |
A compressed trie (also called a Radix tree) merges chains of single-child nodes into one node that stores the full substring, reducing the total node count for sparse word sets. This is an optimization worth knowing, though a standard trie is the right starting point.
Reviewing AI-Generated Code#
Trie operations have two recurring failure points that AI assistants regularly get wrong: the end-of-word flag and the character-to-node mapping. Focusing your review on these areas will catch most bugs before they reach production.
| Issue | What to Check | Common AI Mistake |
|---|---|---|
| End-of-word flag | Mark nodes that complete a word | Forgetting to set or check the flag |
| Search vs startsWith | Search requires end-of-word, startsWith doesn't | Confusing the two operations |
| Delete cleanup | Remove unused nodes after delete | Leaving orphan nodes |
| Character mapping | Use dict/map for flexibility | Fixed-size array breaks on special chars |
| Empty string | Handle empty string edge case | Crashing or wrong result for empty input |
Spotting AI Trie Bugs
Common errors in AI-generated trie code
Questions to Ask AI About Trie Code#
- "What's the difference between search and startsWith?" - search requires end-of-word check; startsWith does not
- "What character set does this support?" - fixed arrays limit to specific character ranges; use dict/map for flexibility
- "What happens after deleting a word?" - unused nodes should be cleaned up to free memory
- "How does it handle 'app' vs 'apple'?" - both can coexist; the end-of-word flag distinguishes complete words from prefixes
Summary#
| Concept | Key Point |
|---|---|
| Structure | Tree where each path from root spells a word |
| Node contents | Children map (char → node) + end-of-word flag |
| Key advantage | O(p) prefix operations, which hash tables cannot efficiently support |
| Space trade-off | More space than hash table, but enables prefix queries |
| Best for | Autocomplete, spell check, IP routing, word games |
| Optimization | Compressed trie reduces space for sparse data |
| Common operations | Insert O(m), Search O(m), StartsWith O(p), Delete O(m) |