Advent of Code 2020 - Day 15

Rambunctious Recitation is number game that follows the rules of the Van Eck sequence. Here’s a great video from the Numberphile channel on YouTube about it. And here start our troubles because it seems no one has yet found an efficient algorithm for generating this sequence. We will really need to brute force it.

How can we resolve this predicament with the tools available to us?

If you want to follow along please download AdventOfCode2020.pbit and check Day 1 post for instructions.

Part 1

To start we need to find the 2020th number of the sequence. The rules are quite simple, starting from the last number we need to check if it was the first time the number was spoken and if it was the next number is zero, otherwise it will be the difference between the current turn and the last time it was spoken.

Turn 1: 0
Turn 2: 3
Turn 3: 6       <- start position
Turn 4: 0       <- first time 6 was spoken
Turn 5: 4-1 = 3 <- difference between last two turns where 0 was spoken
Turn 6: 5-2 = 3 <- difference between last two turns 3 was spoken
Turn 7: 6-5 = 1 <- difference between last two turns 3 was spoken
Turn 8: 0       <- first time 1 was spoken
....

DAX isn’t a solution here, we could easily generate the numbers but we wouldn’t be able to update the last positions and recursion it’s not available in the language.

We already know that tail recursion will be a problem if we try to the @function trick with Power Query. We are left with the trusty List.Generate.

To this be efficient we need to think about the way we are going to store our numbers. In each iteration we will need to check if it was the first time our last number was spoken and if not we also need to check the last two times it was spoken. A list seems the best option, we use the positions as the spoken numbers and their values will store the turns. Looking again at our example:

//Start
{1,  ,  , 2,  ,  , 3} <- list of turns
 0, 1, 2, 3, 4, 5, 6  <- index

This seems more efficient, if the index of the number is empty then we know it’s the first time it was spoken. If it isn’t empty we just need to subtract the current turn with the value on that index.

Translating this to code:

Init = {2, 0, 6, 12, 1, 3},
//Loop to rotate the initial list
l = List.Repeat({null}, List.Max(Init)),
loop1 =
    List.Generate(
        () => 0,
        each _ <= List.Max(Init),
        each _ + 1,
        each if List.PositionOf(Init,_) <> -1
            then List.PositionOf(Init,_) + 1
            else null),
//Loop that iterates through the turns
ll = loop1 & List.Repeat({null}, 2020 - List.Count(loop1)),
  acc0 = List.Generate(
      () => [n=List.Count(Init), lp = List.Last(Init), lst=List.Buffer(ll)],
      each [n] <= 2020,
      each
      [
          n = [n] + 1,
          lp = let lpp = [lst]{[lp]} in if lpp <> null then [n] - lpp else 0,
          lst = List.ReplaceRange([lst], [lp], 1, {[n]})
      ]
      , each [[lp]]
  )
//Get the answer
res = List.Last(ll[lp])

Seems more complicated than it is. the first loop is just to exchange the numbers spoken with the index of the list and the second one is the one that calculates our final value.

In each step we check if the current position is null, first time the number is spoken and in that case we do a replace in the lst changing that index to zero. In case the number was previously said we update the position with the difference between the current turn and the value of the last turn it was spoken.

Then we obtain the last position of the record as our answer.

Part 2

The 30000000th position? Are they kidding? I’ve tried the same code of before but after 10 minutes I gave up on waiting. How can you solve this now without having to wait an eternity?

It seems it’s time to unleash one of the two secret weapons of Power Query. The ability of running Python and R scripts.

Even though we have a performance penalty by using an external language, I am sure that this will be faster than generating thirty million lists of thirty million records just to get the last position.

I chose Python to solve this just because it was already installed in my system. If you don’t have it here’s a nice guide to setup it with Power BI: https://carldesouza.com/installing-setting-up-python-for-power-bi/.

To calculate the result we will translate the same idea to Python:

# 'dataset' holds the input data for this script
import pandas as pd

dataset['idx'] = range(0, len(dataset))

dict = dataset.set_index('Number')['idx'].to_dict()

lastPosition = dataset['Number'].iloc[-1]

for i in range(len(dataset), 30000000):
    if lastPosition in dict:
        nextNumber = (i-1) - dict[lastPosition]
    else:
        nextNumber = 0

    dict[lastPosition] = i-1
    lastPosition = nextNumber

dataset = pd.DataFrame()
dataset['LastNumber'] = [lastPosition]

Power BI exposes the last step of Power Query to Python in a object called \(dataset\).

This will have our data that we transform in a dictionary where the key is the spoken numbers and the value the turn.

Following the same strategy as before, while looping we check if the number exists in the dictionary, if it does we assign the difference between the current turn and the last turn it was spoken otherwise we create an entry for the number with the current turn value.

Then we clean the \(dataset\) object and assign it the last number calculated.

Conclusion

While we were able to abuse Power Query by making it do things for that it wasn’t designed in this series, the second part of this puzzle showed us it’s limits. While it’s great for what it was designed, building queries that pull, mash up and transform data. If we we need to write back to the data sources, create files on the system, send emails or do something like any other typical general purpose language we are going to get stuck.

It’s functional nature that forbids us of reassigning values to the variables kills it here. To get to the last value we can’t modify and pass around the list with the positions like we did with the dictionary in each iteration. List.Generate will create a new list on each loop.

I’m glad they let us have an escape for these scenarios where we can use other language to do our biding.

There’s another way you can use Python and R in Power BI. You can build custom visuals for your data. Maybe I can write about it someday.

Happy Easter!!

 Share!