1. Problem Statement
There is a new alien language that uses the English alphabet. However, the order of the letters is unknown to you. You are given a list of strings words from the alien language dictionary, where the strings in words are sorted lexicographically by the rules of this new language.
Return a string of the unique letters in the new alien language sorted in lexicographical increasing order. If there is no solution, return "".
Input: words = ["wrt","wrf","er","ett","rftt"]
Output: "wertf"
2. Approach: Graph Construction + Kahn's Algorithm
This is a Dependency Ordering problem. If "abc" comes before "abd", it means 'c' comes before 'd'. This is a directed edge: c -> d.
- Extract Dependencies: Compare adjacent words. Find the first mismatch character. That mismatch defines an edge in our graph.
- Handle Edge Cases: If "apple" comes before "app", it's invalid (prefix cannot be longer than the word). Return
"". - Topological Sort: Use Kahn's Algorithm (BFS) to find the order.
- Detect Cycles: If the result length is less than the number of unique characters, there was a cycle. Return
"".
3. Java Implementation
public String alienOrder(String[] words) {
Map<Character, Set<Character>> adj = new HashMap<>();
Map<Character, Integer> inDegree = new HashMap<>();
for (String w : words) {
for (char c : w.toCharArray()) inDegree.put(c, 0);
}
// 1. Build Graph
for (int i = 0; i < words.length - 1; i++) {
String w1 = words[i], w2 = words[i+1];
if (w1.length() > w2.length() && w1.startsWith(w2)) return "";
for (int j = 0; j < Math.min(w1.length(), w2.length()); j++) {
char c1 = w1.charAt(j), c2 = w2.charAt(j);
if (c1 != c2) {
if (!adj.getOrDefault(c1, new HashSet<>()).contains(c2)) {
adj.computeIfAbsent(c1, k -> new HashSet<>()).add(c2);
inDegree.put(c2, inDegree.get(c2) + 1);
}
break;
}
}
}
// 2. BFS (Kahn's)
StringBuilder sb = new StringBuilder();
Queue<Character> q = new LinkedList<>();
for (char c : inDegree.keySet()) if (inDegree.get(c) == 0) q.offer(c);
while (!q.isEmpty()) {
char curr = q.poll();
sb.append(curr);
if (adj.containsKey(curr)) {
for (char next : adj.get(curr)) {
inDegree.put(next, inDegree.get(next) - 1);
if (inDegree.get(next) == 0) q.offer(next);
}
}
}
return sb.length() == inDegree.size() ? sb.toString() : "";
}
4. 5-Minute "Video-Style" Walkthrough
- The "Aha!" Moment: Lexicographical order is just a sequence of "before/after" constraints. In computer science, "before/after" constraints are edges in a Directed Acyclic Graph (DAG).
- The Comparison: We only get information from the first character that differs. Comparing "wrt" and "wrf" only tells us 't' comes before 'f'. It tells us nothing about 'r' or 'w'.
- The Catch: If the dictionary is
["abc", "ab"], it's an impossible ordering. In any sorted dictionary, a prefix must come before its longer version.
5. Interview Discussion
- Interviewer: "What if there are multiple valid orders?"
- You: "The problem asks for any one of them. Topological sort naturally provides one valid linear ordering."
- Interviewer: "Complexity?"
- You: "O(C) where C is the total number of characters in all words. We look at every character at least once during word comparison and graph building."