Wordle-solving state of the art: all optimality results so far

2022-01-26 (last updated 2022-11-17)

Almost all mathematical questions one could have about Wordle are settled by now. I summarize here what is known so far.

tl;dr

If we exploit the 2309-words solution list:
- Wordle can be solved in 3.4201 guesses on average, with 5 guesses at worst (i.e., 100% of the time comfortably within the allowed 6 guesses).
- The corresponding decision tree is optimal both in terms of average, and in the worst case: It was proven that it is not possible to get an average lower than 3.4201, nor is it possible to have a worst case of 4 guesses, even independently.
- In hard mode, Wordle can be solved in 3.5076 guesses on average (with 6 guesses at worst, i.e. 100% of the time). Or, with a different decision tree, it can be solved with a slightly worse average, but always within 5 guesses. Again, both trees are optimal for their respective objective functions.
- Instead of using a decision tree, we can also consider the problem of solving the game starting with a fixed set of guesses independent of the game’s answers. This can be done with 7 total guesses (6 fixed guesses plus the winning guess), even if we only allow guess words from the 2309-words solution list. This result is optimal: it is impossible to play 5 fixed guesses, then always win Wordle with the sixth.
For the full 12972-words dictionary:
- Wordle can be solved in 6 guesses at worst (i.e. 100% of the time just within the allowed 6 guesses), with 4.07771 guesses on average.
- In terms of the worst case, the above result is optimal: It was proven that it is not possible to find a decision tree whose worst case is 5 guesses or less.
- In terms of the average, 4.07771 is not proven optimal, but there is strong evidence suggesting that it probably is.
- In hard mode, Wordle can be solved in 7 guesses worst, 4.52629 guesses on average. The worst case is optimal: Hard mode Wordle cannot be solved 100% of the time within 6 guesses. Moreover, the 4.52629 average cannot be improved while maintaining the 7-guesses worst case.
- Alternatively, we can play 10 fixed guesses, then always win with the 11th. This is optimal.
In terms of computational complexity:
- If we abstract away the dictionary and the alphabet (our input), finding an optimal strategy is NP-hard.
- However, if we fix the alphabet size, it can be done in polynomial time.

Note that the New York Times bought Wordle and tweaked the word lists on two occasions. Ultimately (since 2022-03-16), their changes amount to removing 6 words from the original solution list (now 2309 words), while retaining them as allowed guess words. The results above were updated for the latest word lists.

Update 2022-08-20: The NYT has added two words to the list of allowed words. While this should have limited impact on the results below, none of them have been updated for the change.

Game setup

First, let’s clarify a few things about the game:

Wordle comes with a dictionary of 12972 words that the player is allowed to use as guesses. They are essentially all 5-letter combinations one could reasonably argue are English words. The “secret” word that the player has to discover is also always in that dictionary.

It is so trivially easy to cheat at Wordle that there is no point to it: The list of secret words is right there in the code of the game. The daily secret words are picked sequentially from that list, so they are all known since 2021-06-19 (“cigar”) and until ~~2027-10-20~~ 2027-10-14, for a total of ~~2315~~ 2309 scheduled secret words.

One could always look up that list for the current date and solve the game immediately at the first guess. However, this is not much fun (mathematically), so we need to define the game somehow. There are two obvious options:
- “Full dictionary”: we consider that the secret word could be any word from the full 12972-word list.
- “Solution list”: we know that not every word can be a secret word, and take advantage of the knowledge of the list of 2309 scheduled secret word (but we ignore its order, which would trivially allow us to win every game in one guess).
While many focused on finding the best “first word” to play as a guess, the first guess is just a tiny part of what we really want: A decision tree. A decision tree not only tells you which guess to play first, but accompanies you as you play, telling you which guesses to play subsequently in function of the information given you by the game as you play (i.e. all the letter colors 🟩🟨⬜ so far). I only cover here fixed/deterministic decision trees, which will always reach a given secret word in the same way, because it is straightforward to give formal evaluations of how good they are. I will also briefly touch on approaches to solve the game without decision trees, with a fixed set of guesses.
The word optimal has taken quite a mauling lately in Wordle-related writings. Just to be to be absolutely clear, a solution is optimal only if no better solution can possibly exist. In particular, there is no such thing as a “more optimal” solution.
Now, regarding what we optimize, there are two natural choices:
- The worst-case number of guesses. That is, given a decision tree, what is the maximum number of guesses necessary to correctly guess any secret word? In other words, after how many guesses does the decision tree guarantee that we win the game?
- The average number of guesses. What is the average, over all possible secret words, of the number of guesses a decision tree will take to get a win?
As we will see, those are competing objectives, sometimes irreconcilably, sometimes not so.
Buried in the options, Wordle features a hard mode. The game is different in hard mode, in that only words that satisfy the hints given earlier are allowed as player guesses. (For example, if a letter is colored green 🟩 after one guess, that letter must be present in the same position in all subsequent guesses.)

There are subtleties about the rules regarding words with repeated letters, and this particularly affects hard mode. However, it seems that by now everyone agrees what the rules are.

Solution list: 2309 possible secret words

In this section, we assume that we know the list of 2309 scheduled secret words, but we ignore the list’s order (it would be trivially easy otherwise).

For the original word list, Cyrus Freshman set up a nice automated leader board with a standardized input format. There, we can directly find that fourteen distinct people have found decision trees that solve Wordle in 3.4212 average guesses and (simultaneously) 5 worst-case guesses (i.e., 100% of the time, never even using the 6th guess), including Jonathan Olson, Peter Tseng and Alex Selby. With the updated word list, we get an average of 3.4201 guesses, still with 5 guesses in the worst case.

While I suspect all employed similar general approaches (recursive enumeration / backtracking with tree pruning and some form of caching), I will now mainly defer to Alex Selby’s nice writeup, because it is the most thorough and because the corresponding code can answer more of our questions. Alex and Peter both implemented their algorithms in such a way that not only they get good decision trees, but they can also subsequently perform an accelerated complete enumeration and prove optimality (no tree has an average of less than 3.4212, and no tree wins all games in 4 guesses at most). Alex indicates that the latter is by far the most difficult part computationally: finding good trees (that happen to be optimal) is quick, proving that they are indeed optimal is much more expensive.

For hard mode, both Alex and Peter found a decision tree with in 3.5085 guesses on average and 6 guesses at worst (i.e., 100% wins in 6 guesses), with the original word list. Hard mode is interesting in that, if we are willing to sacrifice a bit on the average, we can improve the worst case: Alex found a different decision tree with 3.5448 guesses on average, but a worst case of 5 guesses! Again, Alex reports optimality on both fronts. With the updated word list, the average becomes 3.5076 with a worst case of 6, while again a worst case of 5 is possible.

Full dictionary: 12972 possible secret words

In this section, we assume that the game can pick secret words from its whole 12972-word dictionary. The number of potential secret words is now about 5.6x as large as in the previous case, dramatically increasing the size of an already large enumeration space.

Still, the same type of emumeration / backtracking tricks work even in this case, and with a careful implementation I found a decision tree with 6 guesses in the worst case. Even with the full dictionary of 12972 possible secret words, it is thus still possible to win every Wordle game within the 6 allowed guesses! I did not try to optimize for the average number of guesses, but it can still be measured: 4.2308 guesses on average over the 12972 words. I was not able to perform complete enumeration for 5 guesses, so there was no optimality guarantee. However, complete enumeration did rule out the existence of trees with 4 guesses in the worst case, leaving a frustrating gap between 4 and 6.

The frustration was short-lived, fortunately, as Alex Peattie proved that it is impossible to build a decision tree with 5 guesses in the worst case, making the result sharp. The proof is much more interesting than complete enumeration: it relies on cleverly finding provably-difficult-to-disambiguate word subsets and it can be checked directly by inspection without any computational tools (although Alex did double-check it with OR-Tools’ constraint programming solver).

Thanks of Alex’s proof, we now have an optimality guarantee in terms of the worst case at 6 guesses. It is still an open question whether we can find a decision tree that is optimal in terms of its average number of guesses.

Update 2022-02-06 Mark Fisher found a different decision tree with 6 guesses in the worst case. His approach is much less brute-force than mine and as a result he has nice insights about which words are difficult to disambiguate, and how to tackle such word groups.

Update 2022-04-08 Alex Selby improved his code (by two orders of magnitude!) and applied it to the full dictionary as well. This let him find a tree that solves wordle in 4.07771 average guesses (6 in the worst case). While there is no proof that 4.07771 is optimal, there is a good argument to be made that it probably is: Alex managed to show that if we fix the first guess word, this tree is the best one in terms of average number of guesses. Furthermore, this first guess word was deemed the most promising one, far ahead of the others, by a heuristic method (incomplete search) that reliably gave optimal trees in other cases.

For hard mode, Alex found a tree that solves wordle in 7 guesses in the worst case and proved by complete enumeration that 6 worst-case guesses is impossible! To give you an idea of the achievement here, my own code could only find a tree with 14 guesses in the worst case, let alone any form of proof. In addition, Alex proved that among trees with 7 guesses in the worst case, the best possible average number of guesses is 4.52629 (only leaving open the possibility of improving the average by sacrificing the worst case).

Fixed guesses

Added 2022-03-02. Instead of using a decision tree, one could consider the following problem. Let us assume that we will play, at first, a fixed set of guesses, blindly, without observing the game’s answers. Could we find such a set of guesses such that, afterwards, the answers immediately let us infer the secret word? This can be seen as an “offline” or “blindfold” variant of the solution approach.

Exploiting the 2309-words solution list, Alexandre Salle showed that it can be done with 8 fixed guesses plus the winning guess, so 9 guesses in total. It is unknown whether this is optimal. In particular, it is an open question whether a winning purely-offline strategy can exist within 6 guesses.

Update 2022-04-08: Virgile Andreani found a set of 6 fixed guesses that lets you win with the seventh, thus very close to systematically winning Wordle in 6 turns with fixed guesses!

Update 2022-05-12: David Rusin considered the problem of picking fixed guesses exclusively from the 2309-word solution list. Remarkably, despite the much more constrained setting, he also found a set of 6 fixed guesses.

Update 2022-05-18: Thanks to David’s work, we now know that the above results with 6 fixed guesses are optimal. It is thus impossible to solve Wordle within 6 guesses with the fixed-guesses approach (5 fixed guesses + 1 winning guess).

Update 2022-08-13: It turns out that David’s set of 6 words is the unique solution to the problem of solving Wordle after 6 fixed guesses from the 2309-word solution list. Using a MIP formulation, he showed that it is impossible to find such a set if any single one of those 6 words is disallowed as a guess.

Update 2022-08-20: If we do not exploit the 2309-words solution list, we need 10 fixed guesses plus the winning one, so 11 guesses in total. This is optimal.

Hybrid approaches

Added 2022-11-17. We know that it is not possible to single out the secret word after 5 fixed guesses. However, one may attempt to play fewer than 5 fixed guesses, then allow small decision trees afterwards to try and win Wordle within 6 guesses. That is exactly what Iouri Khramtsov proposes, with the four fixed guesses slant, price, dough, balmy, which allows us to always win Wordle by subsequently following small depth-2 decision trees. Another such option is bawdy, flung, porch, smite.

David Rusin subsequently improved on those fixed guesses in two distinct ways. First, he found that in this context, 3 fixed guesses are enough! Specifically, it is possible to start with 3 fixed guesses, then follow depth-2 decision trees, to always win Wordle within 5 total guesses. Through exhaustive search, he even determined that exactly 261 such sets of 3 fixed guesses exist. It is remarkable that 5 total guesses is achievable in such a restricted context, given that even using full decision trees, 5 guesses is optimal for Wordle in terms of tree depth.

The second improvement is more subtle. The depth-2 decision trees above can have an arbitrary number of nodes and can involve arbitrary guesses. In particular, such tree may even require playing guess words that are incompatible with the previous answers. Things would be easier for a human player if they could greedily play any of the remaining possible solutions (i.e., any word compatible with the previous answers), and be guaranteed to find the secret word within 6 guesses. David found that if we start with the 4 fixed guesses carve, downy, plumb, sight, it is indeed possible to win Wordle in all cases within 6 guesses by subsequently playing any of the remaining possible secret words. He also proved that this is unachievable with 3 fixed guesses.

Note that both improvements take place in a restricted context in which only potential secret words are allowed as guesses (there are only 2309 such words, as opposed to the full 12972-word dictionary of 5-letter combinations normally allowed as guesses).

Computational complexity

Added 2022-04-08 At FUN 2022, Daniel Lokshtanov and Bernardo Subercaseaux will present their paper on the computational complexity of Wordle.

In order to study the complexity of Wordle, we need to formalize the problem. In particular, we need to specify its input. Indeed, as envisioned by its creator, all the parameters of the game are fixed: The alphabet Σ contains 26 letters, the number of letters k in a word is 5, and the dictionary D contains 12972 words. We need to consider some or all of those as our input. Otherwise, from a complexity perspective, all the methods deployed above to find optimal decision trees are constant-time, O(1) algorithms. In particular |Σ| and k should not both be constant since |D| is upper-bounded by |Σ| to the power k.

Daniel and Bernardo show that even considering just Σ and D as our input (leaving k fixed to 5), as well as an integer ℓ, it is NP-hard to decide whether a decision tree with ℓ worst-case guesses exists. In that context, in particular, finding an optimal decision tree (one corresponding to the smallest value of ℓ for which the answer is yes) is NP-hard.

On the other hand, they also show that if |Σ| is constant, one can solve the above decision problem in polynomial time. They first construct a strategy that can always solve the game in |Σ| guesses. This handles all the cases in which ℓ ≥ |Σ|: the answer is always yes. Then, if ℓ < |Σ|, we have that ℓ is upper-bounded by a constant, essentially removing the exponential character of decision tree enumeration.

Laurent's notes