Laurent's notes

Wordle-solving state of the art: all optimality results so far

Almost all mathematical questions one could have about Wordle are settled by now. I summarize here what is known so far.

tl;dr

Note that the New York Times bought Wordle and tweaked the word lists on two occasions. Ultimately (since 2022-03-16), their changes amount to removing 6 words from the original solution list (now 2309 words), while retaining them as allowed guess words. The results above were updated for the latest word lists.

Update 2022-08-20: The NYT has added two words to the list of allowed words. While this should have limited impact on the results below, none of them have been updated for the change.

Game setup

First, let’s clarify a few things about the game:

Solution list: 2309 possible secret words

In this section, we assume that we know the list of 2309 scheduled secret words, but we ignore the list’s order (it would be trivially easy otherwise).

For the original word list, Cyrus Freshman set up a nice automated leader board with a standardized input format. There, we can directly find that fourteen distinct people have found decision trees that solve Wordle in 3.4212 average guesses and (simultaneously) 5 worst-case guesses (i.e., 100% of the time, never even using the 6th guess), including Jonathan Olson, Peter Tseng and Alex Selby. With the updated word list, we get an average of 3.4201 guesses, still with 5 guesses in the worst case.

While I suspect all employed similar general approaches (recursive enumeration / backtracking with tree pruning and some form of caching), I will now mainly defer to Alex Selby’s nice writeup, because it is the most thorough and because the corresponding code can answer more of our questions. Alex and Peter both implemented their algorithms in such a way that not only they get good decision trees, but they can also subsequently perform an accelerated complete enumeration and prove optimality (no tree has an average of less than 3.4212, and no tree wins all games in 4 guesses at most). Alex indicates that the latter is by far the most difficult part computationally: finding good trees (that happen to be optimal) is quick, proving that they are indeed optimal is much more expensive.

For hard mode, both Alex and Peter found a decision tree with in 3.5085 guesses on average and 6 guesses at worst (i.e., 100% wins in 6 guesses), with the original word list. Hard mode is interesting in that, if we are willing to sacrifice a bit on the average, we can improve the worst case: Alex found a different decision tree with 3.5448 guesses on average, but a worst case of 5 guesses! Again, Alex reports optimality on both fronts. With the updated word list, the average becomes 3.5076 with a worst case of 6, while again a worst case of 5 is possible.

Full dictionary: 12972 possible secret words

In this section, we assume that the game can pick secret words from its whole 12972-word dictionary. The number of potential secret words is now about 5.6x as large as in the previous case, dramatically increasing the size of an already large enumeration space.

Still, the same type of emumeration / backtracking tricks work even in this case, and with a careful implementation I found a decision tree with 6 guesses in the worst case. Even with the full dictionary of 12972 possible secret words, it is thus still possible to win every Wordle game within the 6 allowed guesses! I did not try to optimize for the average number of guesses, but it can still be measured: 4.2308 guesses on average over the 12972 words. I was not able to perform complete enumeration for 5 guesses, so there was no optimality guarantee. However, complete enumeration did rule out the existence of trees with 4 guesses in the worst case, leaving a frustrating gap between 4 and 6.

The frustration was short-lived, fortunately, as Alex Peattie proved that it is impossible to build a decision tree with 5 guesses in the worst case, making the result sharp. The proof is much more interesting than complete enumeration: it relies on cleverly finding provably-difficult-to-disambiguate word subsets and it can be checked directly by inspection without any computational tools (although Alex did double-check it with OR-Tools’ constraint programming solver).

Thanks of Alex’s proof, we now have an optimality guarantee in terms of the worst case at 6 guesses. It is still an open question whether we can find a decision tree that is optimal in terms of its average number of guesses.

Update 2022-02-06 Mark Fisher found a different decision tree with 6 guesses in the worst case. His approach is much less brute-force than mine and as a result he has nice insights about which words are difficult to disambiguate, and how to tackle such word groups.

Update 2022-04-08 Alex Selby improved his code (by two orders of magnitude!) and applied it to the full dictionary as well. This let him find a tree that solves wordle in 4.07771 average guesses (6 in the worst case). While there is no proof that 4.07771 is optimal, there is a good argument to be made that it probably is: Alex managed to show that if we fix the first guess word, this tree is the best one in terms of average number of guesses. Furthermore, this first guess word was deemed the most promising one, far ahead of the others, by a heuristic method (incomplete search) that reliably gave optimal trees in other cases.

For hard mode, Alex found a tree that solves wordle in 7 guesses in the worst case and proved by complete enumeration that 6 worst-case guesses is impossible! To give you an idea of the achievement here, my own code could only find a tree with 14 guesses in the worst case, let alone any form of proof. In addition, Alex proved that among trees with 7 guesses in the worst case, the best possible average number of guesses is 4.52629 (only leaving open the possibility of improving the average by sacrificing the worst case).

Fixed guesses

Added 2022-03-02. Instead of using a decision tree, one could consider the following problem. Let us assume that we will play, at first, a fixed set of guesses, blindly, without observing the game’s answers. Could we find such a set of guesses such that, afterwards, the answers immediately let us infer the secret word? This can be seen as an “offline” or “blindfold” variant of the solution approach.

Exploiting the 2309-words solution list, Alexandre Salle showed that it can be done with 8 fixed guesses plus the winning guess, so 9 guesses in total. It is unknown whether this is optimal. In particular, it is an open question whether a winning purely-offline strategy can exist within 6 guesses.

Update 2022-04-08: Virgile Andreani found a set of 6 fixed guesses that lets you win with the seventh, thus very close to systematically winning Wordle in 6 turns with fixed guesses!

Update 2022-05-12: David Rusin considered the problem of picking fixed guesses exclusively from the 2309-word solution list. Remarkably, despite the much more constrained setting, he also found a set of 6 fixed guesses.

Update 2022-05-18: Thanks to David’s work, we now know that the above results with 6 fixed guesses are optimal. It is thus impossible to solve Wordle within 6 guesses with the fixed-guesses approach (5 fixed guesses + 1 winning guess).

The next best thing, however, is still possible: win 100% of games with a hybrid approach of 4 initial fixed guesses, then tiny 2-deep decision trees. For that, Iouri Khramtsov proposes slant, price, dough, balmy while David Rusin found carve, sight, downy, plumb.

Update 2022-08-13: It turns out that David’s set of 6 words is the unique solution to the problem of solving Wordle after 6 fixed guesses from the 2309-word solution list. Using a MIP formulation, he showed that it is impossible to find such a set if any single one of those 6 words is disallowed as a guess.

Update 2022-08-20: If we do not exploit the 2309-words solution list, we need 10 fixed guesses plus the winning one, so 11 guesses in total. This is optimal.

Computational complexity

Added 2022-04-08 At FUN 2022, Daniel Lokshtanov and Bernardo Subercaseaux will present their paper on the computational complexity of Wordle.

In order to study the complexity of Wordle, we need to formalize the problem. In particular, we need to specify its input. Indeed, as envisioned by its creator, all the parameters of the game are fixed: The alphabet Σ contains 26 letters, the number of letters k in a word is 5, and the dictionary D contains 12972 words. We need to consider some or all of those as our input. Otherwise, from a complexity perspective, all the methods deployed above to find optimal decision trees are constant-time, O(1) algorithms. In particular |Σ| and k should not both be constant since |D| is upper-bounded by |Σ| to the power k.

Daniel and Bernardo show that even considering just Σ and D as our input (leaving k fixed to 5), as well as an integer , it is NP-hard to decide whether a decision tree with worst-case guesses exists. In that context, in particular, finding an optimal decision tree (one corresponding to the smallest value of for which the answer is yes) is NP-hard.

On the other hand, they also show that if |Σ| is constant, one can solve the above decision problem in polynomial time. They first construct a strategy that can always solve the game in |Σ| guesses. This handles all the cases in which ℓ ≥ |Σ|: the answer is always yes. Then, if ℓ < |Σ|, we have that is upper-bounded by a constant, essentially removing the exponential character of decision tree enumeration.