Lesson 29 of 66 · 6 min

Problem: Lowest Common Ancestor of a Binary Tree

Master the logic of result propagation in recursion. Learn how to identify the "Split Point" where the paths to two nodes diverge in a tree, in O(N) time.

1. Problem Statement

Given a binary tree, find the lowest common ancestor (LCA) of two given nodes in the tree.

According to the definition of LCA on Wikipedia: “The lowest common ancestor is defined between two nodes p and q as the lowest node in T that has both p and q as descendants (where we allow a node to be a descendant of itself).”

Input: root = [3,5,1,6,2,0,8,null,null,7,4], p = 5, q = 1
Output: 3 (Node 3 is the parent of both 5 and 1).

2. The Mental Model: The "Result Propagation" Intuition

Imagine you are looking for two children, p and q, in a large multi-story building (the tree).

  1. You (the current node) ask your Left and Right sub-branches: "Did you find p or q?"
  2. If both branches say "Yes," you have found the Split Point. You are the Lowest Common Ancestor.
  3. If only one branch says "Yes," you pass that result up to your parent.
  4. If you are p or q yourself, you report "Found!" to your parent immediately.

3. Visual Execution (The Convergence)

graph TD
    Root((3)) --> L((5))
    Root --> R((1))
    L --> LL((6))
    L --> LR((2))
    
    subgraph "Recursive Bubble Up"
        L -- found 5 --> Root
        R -- found 1 --> Root
        Root -- Both Sides Found! --> LCA[Return 3]
    end

Note that because node 5 is itself a target, the base case fires there immediately and its subtree (nodes 6, 2, 7, 4) is never explored; node 1 short-circuits the same way. The root then sees non-null answers from both sides and identifies itself as the Split Point.

4. Java Implementation (Optimal O(N))

public TreeNode lowestCommonAncestor(TreeNode root, TreeNode p, TreeNode q) {
    // 1. Base Case: If we hit a null or find one of the targets
    if (root == null || root == p || root == q) {
        return root;
    }

    // 2. Recurse: Ask children for results
    TreeNode left = lowestCommonAncestor(root.left, p, q);
    TreeNode right = lowestCommonAncestor(root.right, p, q);

    // 3. The "Aha!" Moment: 
    // If left is not null AND right is not null, this node is the LCA
    if (left != null && right != null) {
        return root;
    }

    // 4. Bubble Up: If only one side found a target, return that side
    return (left != null) ? left : right;
}
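To sanity-check the propagation logic, here is a minimal, self-contained harness. The bare-bones TreeNode class and the class name LcaDemo are illustrative choices, not part of the problem statement; the harness builds the sample tree [3,5,1,6,2,0,8,null,null,7,4]:

```java
// Minimal sketch: a bare-bones TreeNode plus a harness for the sample tree.
// The TreeNode shape (val/left/right) follows the usual LeetCode definition.
class TreeNode {
    int val;
    TreeNode left, right;
    TreeNode(int val) { this.val = val; }
}

public class LcaDemo {
    public static TreeNode lowestCommonAncestor(TreeNode root, TreeNode p, TreeNode q) {
        if (root == null || root == p || root == q) return root;
        TreeNode left = lowestCommonAncestor(root.left, p, q);
        TreeNode right = lowestCommonAncestor(root.right, p, q);
        if (left != null && right != null) return root; // split point
        return (left != null) ? left : right;           // bubble up the found side
    }

    public static void main(String[] args) {
        // Build [3,5,1,6,2,0,8,null,null,7,4]
        TreeNode n3 = new TreeNode(3), n5 = new TreeNode(5), n1 = new TreeNode(1);
        TreeNode n6 = new TreeNode(6), n2 = new TreeNode(2), n0 = new TreeNode(0);
        TreeNode n8 = new TreeNode(8), n7 = new TreeNode(7), n4 = new TreeNode(4);
        n3.left = n5; n3.right = n1;
        n5.left = n6; n5.right = n2;
        n1.left = n0; n1.right = n8;
        n2.left = n7; n2.right = n4;

        System.out.println(lowestCommonAncestor(n3, n5, n1).val); // 3: split point
        System.out.println(lowestCommonAncestor(n3, n5, n4).val); // 5: p is q's ancestor
    }
}
```

The second call exercises the descendant case from the script above: the recursion returns node 5 without ever descending to node 4.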

5. Verbal Interview Script (Staff Tier)

Interviewer: "What is the time and space complexity of this recursive solution?"

You: "The time complexity is $O(N)$ because, in the worst case, we must visit every node in the tree to locate p and q. The space complexity is $O(H)$, where $H$ is the tree height, due to the recursive call stack. In a balanced tree, this is $O(\log N)$, but for a skewed tree, it can degrade to $O(N)$. A key technical nuance to mention is the Base Case Short-Circuiting: by returning root as soon as we find p or q, we avoid exploring any subtrees below the first target we find. This is correct because if q were a descendant of p, then p would naturally be the LCA anyway."

6. Staff-Level Follow-Ups

Follow-up 1: "What if the nodes p and q are not guaranteed to exist in the tree?"

  • The Answer: "The current implementation would return whatever it found (e.g., if only p exists, it returns p). To handle this, I would count how many of the two targets the traversal actually encounters and return null if that count is less than 2. One subtlety: the short-circuiting base case must be relaxed so we keep recursing below a matched node, otherwise the other target could be hiding underneath it and never be counted."
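One way to realize the counter idea is sketched below. The class and field names (VerifiedLca, found) are my own, and note the deliberately relaxed base case:

```java
// Sketch of the "verify both nodes exist" variant. Unlike the optimal
// version, we must keep recursing below a match so a target hidden
// underneath it can still be counted.
class TreeNode {
    int val;
    TreeNode left, right;
    TreeNode(int val) { this.val = val; }
}

public class VerifiedLca {
    private int found = 0; // how many of {p, q} the traversal has actually seen

    public TreeNode lowestCommonAncestor(TreeNode root, TreeNode p, TreeNode q) {
        TreeNode candidate = lca(root, p, q);
        return (found == 2) ? candidate : null; // reject if either node is missing
    }

    private TreeNode lca(TreeNode root, TreeNode p, TreeNode q) {
        if (root == null) return null;
        // Recurse first: no short-circuit, the whole subtree is examined.
        TreeNode left = lca(root.left, p, q);
        TreeNode right = lca(root.right, p, q);
        if (root == p || root == q) {
            found++;
            return root;
        }
        if (left != null && right != null) return root;
        return (left != null) ? left : right;
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode(3);
        TreeNode p = new TreeNode(5);
        root.left = p;
        // q is not in the tree, so the verified variant reports null.
        System.out.println(new VerifiedLca().lowestCommonAncestor(root, p, new TreeNode(99)));
    }
}
```

Because the counter is mutable instance state, a fresh VerifiedLca instance should be used per query; the trade-off versus the optimal version is that we always visit all N nodes.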

Follow-up 2: "How do you solve this in a Binary Search Tree (BST)?"

  • The Answer: "In a BST, we can solve this in $O(H)$ time and $O(1)$ space without a stack. We just compare the values of p and q to the current root. If both are smaller, move left. If both are larger, move right. The first node we hit that is between the values of p and q is the LCA."
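The BST walk described above can be sketched as follows, assuming (per the classic problem statement) that both p and q are present in the tree:

```java
// Sketch of the O(H)-time, O(1)-space BST walk: no recursion, no stack.
class TreeNode {
    int val;
    TreeNode left, right;
    TreeNode(int val) { this.val = val; }
}

public class BstLca {
    public static TreeNode lowestCommonAncestor(TreeNode root, TreeNode p, TreeNode q) {
        TreeNode cur = root;
        while (cur != null) {
            if (p.val < cur.val && q.val < cur.val) {
                cur = cur.left;      // both targets lie in the left subtree
            } else if (p.val > cur.val && q.val > cur.val) {
                cur = cur.right;     // both targets lie in the right subtree
            } else {
                return cur;          // cur sits between p and q (inclusive): the LCA
            }
        }
        return null; // unreachable when p and q are guaranteed to exist
    }

    public static void main(String[] args) {
        // BST:      6
        //         /   \
        //        2     8
        //       / \
        //      0   4
        TreeNode root = new TreeNode(6);
        root.left = new TreeNode(2); root.right = new TreeNode(8);
        root.left.left = new TreeNode(0); root.left.right = new TreeNode(4);
        System.out.println(lowestCommonAncestor(root, root.left.left, root.left.right).val); // 2
    }
}
```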

7. Performance Nuances (The Java Perspective)

  1. Reference Equality: Note the use of root == p. In Java, this is a reference equality check (comparing memory addresses). This is extremely fast and is the correct way to identify specific node objects in a tree.
  2. Stack Overhead: While recursion is elegant, a deeply skewed tree (say, a degenerate chain of a million nodes) would exhaust the default JVM thread stack. In a production system, I'd consider an iterative BFS that records Parent Pointers in a Map, then find the first intersection of the two paths from p and q back to the root.
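The parent-pointer alternative mentioned in point 2 could look like this (class names are illustrative; the BFS stops as soon as both targets have parent entries):

```java
// Sketch of the iterative parent-pointer approach: BFS builds a
// child -> parent map, then we intersect the two root-ward paths.
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class TreeNode {
    int val;
    TreeNode left, right;
    TreeNode(int val) { this.val = val; }
}

public class ParentPointerLca {
    public static TreeNode lowestCommonAncestor(TreeNode root, TreeNode p, TreeNode q) {
        Map<TreeNode, TreeNode> parent = new HashMap<>();
        Deque<TreeNode> queue = new ArrayDeque<>();
        parent.put(root, null);
        queue.add(root);
        // BFS until both targets have parent pointers recorded
        // (assumes p and q exist, so the queue never runs dry first).
        while (!parent.containsKey(p) || !parent.containsKey(q)) {
            TreeNode node = queue.poll();
            if (node.left != null)  { parent.put(node.left, node);  queue.add(node.left); }
            if (node.right != null) { parent.put(node.right, node); queue.add(node.right); }
        }
        // Collect p's ancestors (including p itself), then walk up from q:
        // the first shared node is the LCA.
        Set<TreeNode> ancestors = new HashSet<>();
        for (TreeNode n = p; n != null; n = parent.get(n)) ancestors.add(n);
        for (TreeNode n = q; n != null; n = parent.get(n)) {
            if (ancestors.contains(n)) return n;
        }
        return null;
    }
}
```

The heap-allocated map and queue trade O(N) extra memory for immunity to StackOverflowError on skewed inputs.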

8. Staff-Level Verbal Masterclass (Communication)

Interviewer: "How would you defend this specific implementation in a production review?"

You: "In a mission-critical environment, I prioritize the Big-O efficiency of the primary data path, but I also focus on the Predictability of the system. This implementation is a plain recursion with no auxiliary data structures: beyond the stack frames themselves, it allocates nothing per call, which keeps garbage-collection pressure, and the Stop-the-world pauses that typically plague high-throughput Java applications, near zero. The trade-off is stack depth: the recursive form is more readable, but I would strictly monitor recursion depth, and if this had to handle skewed inputs, I would transition to an explicit stack on the heap to avoid a StackOverflowError."

9. Global Scale & Distributed Pivot

When a problem like this is moved from a single machine to a global distributed architecture, the constraints change fundamentally.

  1. Data Partitioning: We would shard the input space using Consistent Hashing, so that even as the dataset grows to petabytes, any single query touches only a small, predictable subset of the cluster.
  2. State Consistency: For problems involving shared state updates (such as DP tables or caches), we would use a Distributed Consensus protocol like Raft or Paxos so that all replicas agree on the final state, even in the event of a network partition (the P in the CAP theorem).

10. Performance Nuances (The Staff Perspective)

  1. Cache Locality: Accessing a 2D matrix in row-major order (reading [i][j] then [i][j+1]) is significantly faster than column-major order in modern CPUs due to L1/L2 cache pre-fetching. I always structure my loops to align with how the memory is physically laid out.
  2. Autoboxing and Generics: In Java, using List<Integer> instead of int[] can be several times slower due to object headers, pointer chasing, and constant boxing/unboxing. For the most performance-sensitive sections of this algorithm, I advocate for primitive-specialized structures.
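As a small illustration of both points, here is a row-major traversal over a primitive int[][] (the class name and matrix contents are arbitrary examples):

```java
// Illustrative sketch: row-major iteration over a primitive matrix is both
// cache-friendly and free of autoboxing overhead.
public class RowMajorSum {
    public static long sum(int[][] matrix) {
        long total = 0;
        // Outer loop over rows, inner over columns: consecutive [i][j], [i][j+1]
        // accesses touch contiguous memory, which L1/L2 prefetching rewards.
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix[i].length; j++) {
                total += matrix[i][j]; // primitive add, no Integer wrapping
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[][]{{1, 2}, {3, 4}})); // 10
    }
}
```

Swapping the loop order would compute the same sum but stride across rows, defeating the prefetcher on large matrices.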
