Lesson 36 of 73 8 min

Trie (Prefix Tree) in Java: The Ultimate Data Structure for String Problems

Master the Trie (Prefix Tree) data structure in Java. Learn its intuition, implementation, dry runs, and time complexity analysis for autocomplete, spell checkers, and FAANG interviews.

Deep Dive: Trie Implementation

When you type a query into Google, how does it instantly suggest the rest of your sentence? When you type a misspelled word, how does your phone know it's wrong in milliseconds? The answer is often a Trie (pronounced "try").

A Trie, also known as a Prefix Tree, is a specialized tree-like data structure used to efficiently store and retrieve keys in a dataset of strings. Unlike binary search trees, nodes in a Trie do not store their associated key. Instead, a node's position in the tree defines the key with which it is associated.

In top tech interviews, Tries are the secret weapon for solving complex string problems that would otherwise timeout using standard HashMaps or Arrays.

When Should You Think About a Trie?

Consider using a Trie when:

  • You are dealing with a dictionary of words or a large set of strings.
  • The problem involves prefix matching (e.g., "find all words starting with 'auto'").
  • You need to implement autocomplete or a spell checker.
  • You are solving problems related to word search on a grid (like Boggle).
  • You need to find the longest common prefix among a set of strings.

Core Concept of a Trie

Imagine storing the words "cat", "car", "cap", and "dog".

In a Trie, the root node represents an empty string. The children of the root represent the first letters of all stored words ('c' and 'd'). The path from the root down to a specific node represents a prefix or a complete word.

Key properties:

  1. Nodes: Each node typically contains an array of pointers (or a HashMap) to its children, representing the alphabet (e.g., an array of size 26 for lowercase English letters).
  2. End of Word Marker: Because "car" is a word, the node representing 'r' must have a boolean flag isEndOfWord = true. This distinguishes a complete word from a mere prefix (like "ca").

Implementing a Trie in Java

Let's build a standard Trie that supports inserting a word, searching for a complete word, and checking if a prefix exists.

Step 1: The TrieNode Class

class TrieNode {
    // Array to hold children. Size 26 for 'a' to 'z'
    TrieNode[] children;
    // Flag to mark the end of a valid word
    boolean isEndOfWord;

    public TrieNode() {
        children = new TrieNode[26];
        isEndOfWord = false;
    }
}

Step 2: The Trie Class

public class Trie {
    private TrieNode root;

    public Trie() {
        root = new TrieNode();
    }

    // Inserts a word into the trie.
    public void insert(String word) {
        TrieNode current = root;
        for (int i = 0; i < word.length(); i++) {
            char ch = word.charAt(i);
            int index = ch - 'a'; // Calculate index (0 to 25)
            
            // If the character doesn't exist, create a new node
            if (current.children[index] == null) {
                current.children[index] = new TrieNode();
            }
            // Move down the tree
            current = current.children[index];
        }
        // Mark the final node as the end of a word
        current.isEndOfWord = true;
    }

    // Returns true if the word is in the trie.
    public boolean search(String word) {
        TrieNode current = root;
        for (int i = 0; i < word.length(); i++) {
            char ch = word.charAt(i);
            int index = ch - 'a';
            
            // If the path breaks, the word doesn't exist
            if (current.children[index] == null) {
                return false;
            }
            current = current.children[index];
        }
        // It's only a valid word if the final node is marked
        return current.isEndOfWord;
    }

    // Returns true if there is any word in the trie that starts with the given prefix.
    public boolean startsWith(String prefix) {
        TrieNode current = root;
        for (int i = 0; i < prefix.length(); i++) {
            char ch = prefix.charAt(i);
            int index = ch - 'a';
            
            if (current.children[index] == null) {
                return false;
            }
            current = current.children[index];
        }
        // If we successfully navigated the prefix, it exists
        return true;
    }
}

Complexity Analysis

  • Time Complexity:
    • Insert: $O(L)$, where $L$ is the length of the word. We do exactly $L$ operations.
    • Search/StartsWith: $O(L)$, where $L$ is the length of the word/prefix. This is blazingly fast and independent of how many words are stored in the Trie!
  • Space Complexity:
    • $O(N \times L \times 26)$ in the absolute worst case (where $N$ is the number of words, and there are no shared prefixes). However, Tries are highly space-efficient when words share common prefixes (which is typical for dictionaries).

Dry Run: Building a Trie

Let's insert "app" and "apple".

  1. Initialize: Create empty root.
  2. Insert "app":
    • 'a': root.children[0] is null. Create new node. Move to it.
    • 'p': current.children[15] is null. Create new node. Move to it.
    • 'p': current.children[15] is null. Create new node. Move to it.
    • End of string. Set current.isEndOfWord = true.
  3. Insert "apple":
    • 'a': root.children[0] exists! Move to it. (Shared prefix)
    • 'p': current.children[15] exists! Move to it. (Shared prefix)
    • 'p': current.children[15] exists! Move to it. (Shared prefix)
    • 'l': current.children[11] is null. Create new node. Move to it.
    • 'e': current.children[4] is null. Create new node. Move to it.
    • End of string. Set current.isEndOfWord = true.

Notice how "app" and "apple" share the first three nodes. This is the magic of the Prefix Tree.

Trie vs. HashMap

Why use a Trie when a HashMap<String, Boolean> offers $O(1)$ lookups?

  1. Prefix Searching: A HashMap cannot easily find all words starting with "auto-". You would have to iterate through all keys. A Trie does this in $O(L)$ time.
  2. Memory Efficiency for Prefixes: If you store 10,000 words starting with "pre", a HashMap stores the string "pre" 10,000 times in memory. A Trie stores the nodes 'p' $\rightarrow$ 'r' $\rightarrow$ 'e' exactly once.
  3. No Hash Collisions: Tries don't suffer from hash collisions, resulting in consistent worst-case performance.

Advanced Pattern: Trie + Backtracking (Word Search II)

The true power of Tries in interviews is combining them with Backtracking (DFS).

In problems like Word Search II (given a 2D board of characters and a list of words, find all words on the board), checking every path against a HashMap is too slow.

The Trie Optimization:

  1. Insert all the target words into a Trie.
  2. Perform a DFS from every cell on the board.
  3. Instead of passing a string down the DFS, pass the TrieNode.
  4. If at any point node.children[board[row][col] - 'a'] == null, you can immediately prune the DFS branch! This massive optimization is what gets you the "Hire" recommendation.

Common Mistakes

Mistake 1: Forgetting isEndOfWord

If you search for "cap", but you only inserted "capital", the traversal will successfully reach 'p'. If you don't check current.isEndOfWord, you will incorrectly return true.

Mistake 2: Hardcoding array size to 26

While a size 26 array works for lowercase English letters, some problems include uppercase, digits, or ASCII characters. In those cases, use TrieNode[] children = new TrieNode[128] (or 256), or use a HashMap<Character, TrieNode> children.

Example 2: Design Add and Search Words Data Structure

Support adding new words and searching for a string that may contain dots . where a dot can match any letter.

public boolean search(String word) {
    return searchInNode(word, 0, root);
}
private boolean searchInNode(String word, int index, TrieNode node) {
    if (index == word.length()) return node.isEndOfWord;
    char ch = word.charAt(index);
    if (ch != '.') {
        TrieNode child = node.children[ch - 'a'];
        return child != null && searchInNode(word, index + 1, child);
    } else {
        for (TrieNode child : node.children) {
            if (child != null && searchInNode(word, index + 1, child)) return true;
        }
    }
    return false;
}

Example 3: Longest Common Prefix

public String longestCommonPrefix(String[] strs) {
    if (strs == null || strs.length == 0) return "";
    Trie trie = new Trie();
    for (String s : strs) trie.insert(s);
    // Traverse trie until a node has >1 child or isEndOfWord
}

Practice Problems

  1. Implement Trie (Prefix Tree): (Medium)
  2. Design Add and Search Words Data Structure: (Medium)
  3. Word Search II: (Hard)
  4. Replace Words: (Medium)
  5. Maximum XOR of Two Numbers in an Array: (Medium)
  6. Longest Common Prefix: (Easy)
  7. Search Suggestions System: (Medium)
  8. Map Sum Pairs: (Medium)
  9. Stream of Characters: (Hard)
  10. Palindrome Pairs: (Hard)

Interview Script You Can Reuse

"Since we need to perform efficient prefix matching and lookups against a large dictionary of words, a Trie (Prefix Tree) is the ideal data structure. I'll build a Trie where each node contains an array of child pointers and a boolean flag to mark the end of valid words. This allows us to insert and search for words in O(L) time, where L is the length of the word, completely independent of the total number of words in our dictionary. It also saves space by sharing common prefixes."

Final Takeaways

  • Tries are trees specialized for strings and prefixes.
  • They excel at autocomplete, spell checking, and dictionary lookups.
  • Insertion and Search are strictly $O(L)$ time.
  • Combining a Trie with DFS/Backtracking is a master-tier optimization for matrix search problems.

Want to track your progress?

Sign in to save your progress, track completed lessons, and pick up where you left off.