Whether your resolution was to go to the gym three times a week or write a blog post once per week, they usually all end in the same:

Broken.

It’s not that I haven’t been coding or continuing to learn. I’ve done a lot of coding at work these days using Phaser and solving puzzles over at CodinGame and FreeCodeCamp. Writing doesn’t quite ignite my soul the way solving a problem does.

I’ve taken a breather from React at the moment to talk about one of the most recent problems I solved over at CodinGame: Scrabble. Yup, that Scrabble. The crux of the problem is that you’re given seven letters and you need to find the best word you can make that’s present in the given dictionary. “Best” is defined here is the word worth the most points. How does a trie come into play here? First, what is a trie, specifically an R-way trie.

A trie is a tree where the pointers to other nodes are string/character-based. Certain nodes have a value. In the context of the Scrabble problem, when a node has a value then you’ve found a legal word. Otherwise the value is null.

Tries have typical operations such as put/insert, get, and delete. We’ll use insert to build up our structure of valid words.

First, the above is what a node in a trie looks like, nothing special. keys is just an Object. A trie is just as simple. Before we actually populate it with nodes, it’s just a root node.

The magic happens when we build our trie using put/insert. (I chose to call it insert.)

Hopefully my comments are pretty explanatory as to how the code works. To visualize this, watch this video. Watch the whole video if you can. It’s kinda dry but informative. Certainly better at explaining than I am.

TIL the name trie comes from the word retrieval but is pronounced “try” to not confuse it with trees.

The last operation we’ll need to be able to perform on our trie is to retrieve the last node for a given string. I called this getNode(). So, given the call getNode(‘code’), assuming this word exists in the trie, it should return the node that ‘e’ points too. Whether this has a value or not is a different concern.

Referring back to the Scrabble problem, after building up our trie with the provided valid words, we’ll use our new getNode function to help build up a list of possible words we can make from the seven letters in our hand. I called this function findWordsAndScore.

We have to provide the function with:

  • char: the current letter we’re checking for,
  • allLetters: the seven letters we were given,
  • trie: our trie of valid words,
  • curNode: the current node we’re at,
  • builtWrod: the current word we’ll creating during our recursion,
  • foundWords: an object to hold the valid words we’ve found and that words score.

Phew.

For our initial call, char and builtWord are both empty strings, and curNode is null or it could be trie.root. The function then loops through each letter and starts a search from the current node for the next letter. It only starts a new recursive call if the current node has a pointer to the next letter’s node. If at any point we hit a node where its value isn’t null, then we add it to the foundWords object and store the score. Finally, with all possible words found, we just find the “best”. This means the word with the highest score, and if two words have the same score, then the word that appears earlier in the provided dictionary.

Neat, right? But why is this better in terms of time than going through every word in our dictionary and seeing if the letters we have can make that word?

Disclaimer: I’m still learning to evaluate algorithms in terms of time and space and I’m not a mathematician. Explanation guaranteed to not be scientific.

A naïve approach would be to test our letters against every word in the dictionary. If there’s n words in the dictionary and seven letters with which to form words then for each word in the dictionary we’d have to loop through our letters to see if we’re able to make the word. We can write this in Big-O notation as O(7Cn). C is some constant number of operations for checking if one of our letters exists in the word. It’s most likely an indexOf call followed by a splice if the letter exists. Either way, it’s basically linear time meaning as our dictionary of words gets larger so does our time to figure out what words we can make. (If it’s not linear time then please correct me!)

By utilizing a trie, we’re guaranteed to take no longer than O(7!). This worst case scenario would only happen if we could form words with every single remaining letter after we recurse. Even if somehow that happened, it would only take us 5040. In a normal Scrabble dictionary, there’s over 100,000 words. So we can see pretty easily how much time we’re saving. Even better, if we just want to check if a word is valid, our worst-case becomes O(m), where m is the length of the word. Imagine having to loop through the entire dictionary each time you wanted to validate a word…

I’ve only gone over the time complexities and haven’t covered the space complexities. I’ll save that for another day.

Until next time!