Lesson 1 of 47 3 min

MANG Problem #8: Alien Dictionary (Hard)

Learn how to reconstruct the character order of an unknown language using Graph TopoSort and adjacency lists.

1. Problem Statement

There is a new alien language that uses the English alphabet. However, the order of the letters is unknown to you. You are given a list of strings words from the alien language dictionary, where the strings in words are sorted lexicographically by the rules of this new language.

Return a string of the unique letters in the new alien language sorted in lexicographical increasing order. If there is no solution, return "".

Input: words = ["wrt","wrf","er","ett","rftt"]
Output: "wertf"

2. Approach: Graph Construction + Kahn's Algorithm

This is a Dependency Ordering problem. If "abc" comes before "abd", it means 'c' comes before 'd'. This is a directed edge: c -> d.

  1. Extract Dependencies: Compare adjacent words. Find the first mismatch character. That mismatch defines an edge in our graph.
  2. Handle Edge Cases: If "apple" comes before "app", it's invalid (prefix cannot be longer than the word). Return "".
  3. Topological Sort: Use Kahn's Algorithm (BFS) to find the order.
  4. Detect Cycles: If the result length is less than the number of unique characters, there was a cycle. Return "".

3. Java Implementation

public String alienOrder(String[] words) {
    Map<Character, Set<Character>> adj = new HashMap<>();
    Map<Character, Integer> inDegree = new HashMap<>();
    for (String w : words) {
        for (char c : w.toCharArray()) inDegree.put(c, 0);
    }

    // 1. Build Graph
    for (int i = 0; i < words.length - 1; i++) {
        String w1 = words[i], w2 = words[i+1];
        if (w1.length() > w2.length() && w1.startsWith(w2)) return "";
        for (int j = 0; j < Math.min(w1.length(), w2.length()); j++) {
            char c1 = w1.charAt(j), c2 = w2.charAt(j);
            if (c1 != c2) {
                if (!adj.getOrDefault(c1, new HashSet<>()).contains(c2)) {
                    adj.computeIfAbsent(c1, k -> new HashSet<>()).add(c2);
                    inDegree.put(c2, inDegree.get(c2) + 1);
                }
                break;
            }
        }
    }

    // 2. BFS (Kahn's)
    StringBuilder sb = new StringBuilder();
    Queue<Character> q = new LinkedList<>();
    for (char c : inDegree.keySet()) if (inDegree.get(c) == 0) q.offer(c);

    while (!q.isEmpty()) {
        char curr = q.poll();
        sb.append(curr);
        if (adj.containsKey(curr)) {
            for (char next : adj.get(curr)) {
                inDegree.put(next, inDegree.get(next) - 1);
                if (inDegree.get(next) == 0) q.offer(next);
            }
        }
    }

    return sb.length() == inDegree.size() ? sb.toString() : "";
}

4. 5-Minute "Video-Style" Walkthrough

  1. The "Aha!" Moment: Lexicographical order is just a sequence of "before/after" constraints. In computer science, "before/after" constraints are edges in a Directed Acyclic Graph (DAG).
  2. The Comparison: We only get information from the first character that differs. Comparing "wrt" and "wrf" only tells us 't' comes before 'f'. It tells us nothing about 'r' or 'w'.
  3. The Catch: If the dictionary is ["abc", "ab"], it's an impossible ordering. In any sorted dictionary, a prefix must come before its longer version.

5. Interview Discussion

  • Interviewer: "What if there are multiple valid orders?"
  • You: "The problem asks for any one of them. Topological sort naturally provides one valid linear ordering."
  • Interviewer: "Complexity?"
  • You: "O(C) where C is the total number of characters in all words. We look at every character at least once during word comparison and graph building."

Want to track your progress?

Sign in to save your progress, track completed lessons, and pick up where you left off.