Advent of Code 2022 in pure TensorFlow - Day 11


In this article, we demonstrate how to solve problem 11 of the Advent of Code 2022 using pure TensorFlow. While TensorFlow is primarily known for its applications in deep learning and neural networks, it offers powerful and flexible tools for working with tensors and performing computations on them. The problem at hand revolves around the interactions of multiple monkeys inspecting items, making decisions based on their worry levels, and following a set of rules. By leveraging TensorFlow’s features, such as tf.TensorArray, tf.data.Dataset.scan, and tf.function, we can implement an efficient and elegant solution to this challenging puzzle. We will delve into the code, analyze different sections, and explain the rationale behind the various techniques used. Furthermore, this article provides insights into how TensorFlow can be employed for solving complex problems beyond its traditional applications in machine learning.

Day 11: Monkey in the Middle

In the Advent of Code 2022 problem 11, you are tasked with predicting the behavior of monkeys who have stolen your items. The monkeys operate based on your worry level for each item. The input consists of a description of multiple monkeys, including their starting items (represented by worry levels), an operation to modify the worry level, a test to decide where to throw the item, and the destination monkeys for both true and false test outcomes. The monkeys take turns inspecting and throwing items in rounds.

The input dataset looks like

Monkey 0:
  Starting items: 79, 98
  Operation: new = old * 19
  Test: divisible by 23
    If true: throw to monkey 2
    If false: throw to monkey 3

Monkey 1:
  Starting items: 54, 65, 75, 74
  Operation: new = old + 6
  Test: divisible by 19
    If true: throw to monkey 2
    If false: throw to monkey 0

[...]

Each monkey has several attributes: Starting items, Operation, Test, If true, and If false. In the example above, Monkey 0 starts with items that have worry levels of 79 and 98. The monkey modifies the worry level by multiplying it by 19. It then checks if the modified worry level is divisible by 23. If it is, Monkey 0 throws the item to Monkey 2; otherwise, it throws the item to Monkey 3.

The monkeys take turns inspecting and throwing items in rounds. In the first round, Monkey 0 inspects the first item with a worry level of 79. So a simulation of the first round for the Monkey 0 looks like

Monkey inspects an item with a worry level of 79.
Worry level is multiplied by 19 to 1501.
Monkey gets bored with item. Worry level is divided by 3 to 500.
Current worry level is not divisible by 23.
Item with worry level 500 is thrown to monkey 3.

The goal is to find the two most active monkeys after 20 rounds for part 1 and 20000 for part 2.

The level of monkey business is determined by multiplying the total number of times these two most active monkeys inspected items over the rounds.

Parsing the input

To begin, we read the input data using TensorFlow’s tf.data.TextLineDataset and concatenate it with an empty line. This way, we can detect when the dataset ends and reset the monkey state accordingly. We then create a dataset using the scan method to extract information about the monkeys and their operations. The init function is used to process all the input dataset and forwarding a state on each iteration, being used inside the scan method. We use the pos variable to to keep track of the read line, knowing that in the dataset every monkey information is separated by an empty line from the next monkey info.

dataset = tf.data.TextLineDataset(input_path.as_posix())
dataset = dataset.concatenate(tf.data.Dataset.from_tensors([""]))

monkey = tf.Variable(["", "", "", "", "", ""], dtype=tf.string)
monkey_id = tf.Variable(-1)
pos = tf.Variable(0)

initial_state = tf.constant(["", "", "", "", "", ""])

def init(old_state, line):

    if tf.equal(line, ""):
        monkey.assign(old_state, use_locking=True)
        pos.assign(0)
        return initial_state, True

    if tf.strings.regex_full_match(line, r"^Monkey \d*:$"):
        items = tf.strings.split(tf.strings.split([line], " ")[0][1], ":")[0]
        updates = [items]
    elif tf.equal(pos, 1):
        items = tf.strings.strip(tf.strings.split([line], ":")[0][1])
        updates = [items]
    elif tf.equal(pos, 2):
        op = tf.strings.strip(tf.strings.split([line], "="))[0][1]
        updates = [op]
    elif tf.equal(pos, 3):
        divisible_by = tf.strings.strip(tf.strings.split([line], " "))[0][-1]
        updates = [divisible_by]
    else:  # if tf.reduce_any([tf.equal(pos, 4), tf.equal(pos, 5)]):
        monkey_dest = tf.strings.strip(tf.strings.split([line], " "))[0][-1]
        updates = [monkey_dest]

    indices = tf.reshape(pos, (1, 1))
    new_state = tf.tensor_scatter_nd_update(old_state, indices, updates)
    pos.assign_add(1)

    return new_state, False

dataset = dataset.scan(initial_state, init)

Applying Operations

We define the apply_operation function to perform the specified operation on the worry level according to the monkey’s rules. This function takes the current worry level and the operation as inputs and returns the updated worry level after applying the operation.

@tf.function
def apply_operation(worry_level, op):
    op = tf.strings.split([op], " ")[0]  # lhs, op, rhs
    ret = tf.constant(0, tf.int64)
    # lhs always = "old"
    if tf.strings.regex_full_match(op[2], r"^\d*$"):
        val = tf.strings.to_number(op[2], tf.int64)
    else:
        val = worry_level
    if tf.equal(op[1], "+"):
        ret = worry_level + val
    if tf.equal(op[1], "*"):
        ret = worry_level * val

    return ret

Finding the Monkey Business

We create a function called monkey_play to simulate the monkeys’ actions for a given number of rounds. Inside this function, we loop through the dataset and perform the required operations for each monkey based on their rules. We also keep track of the number of items inspected by each monkey in the inspected_count variable.

The problem is the very same for both parts, the first time it asks to simulate the monkey behavior a small number of times, while the second part asks us to simulate it for 10000. This makes the “trivial” implementation computationally impossible, and we need to use a mathematical trick to find a way to make this problem solvable.

We use a tf.Variable to switch between the part 1 and part 2 inside the monkey_play function.

part = tf.Variable(1)

@tf.function
def monkey_play(rounds):
    items = tf.TensorArray(tf.int64, size=1, dynamic_size=True)
    operation = tf.TensorArray(tf.string, size=1, dynamic_size=True)
    divisible_test = tf.TensorArray(tf.int64, size=1, dynamic_size=True)
    throw_if_true = tf.TensorArray(tf.int32, size=1, dynamic_size=True)
    throw_if_false = tf.TensorArray(tf.int32, size=1, dynamic_size=True)

    for monkey_ready in dataset:
        if monkey_ready:
            idx = tf.strings.to_number(monkey[0], tf.int32)
            items = items.write(
                idx,
                tf.strings.to_number(tf.strings.split(monkey[1], ","), tf.int64),
            )
            operation = operation.write(idx, monkey[2])
            divisible_test = divisible_test.write(
                idx, tf.strings.to_number(monkey[3], tf.int64)
            )
            throw_if_true = throw_if_true.write(
                idx, tf.strings.to_number(monkey[4], tf.int32)
            )
            throw_if_false = throw_if_false.write(
                idx, tf.strings.to_number(monkey[5], tf.int32)
            )

    if tf.equal(part, 1):
        divisor = tf.constant(3, tf.int64)
    else:
        divisor = tf.reduce_prod(divisible_test.stack())

    for r in tf.range(rounds):
        # Now items contains all the starting items for every monkey
        # Let's play
        for m in tf.range(monkey_count):
            m_items = items.read(m)
            op = operation.read(m)
            test = divisible_test.read(m)

            for i in tf.range(tf.shape(m_items)[0]):
                worry_level = apply_operation(m_items[i], op)
                if tf.equal(part, 1):
                    worry_level //= divisor
                else:
                    worry_level = tf.math.mod(worry_level, divisor)

                if tf.equal(tf.math.mod(worry_level, test), 0):
                    dest = throw_if_true.read(m)
                else:
                    dest = throw_if_false.read(m)

                items = items.write(
                    dest,
                    tf.concat(
                        [items.read(dest), tf.expand_dims(worry_level, axis=0)],
                        axis=0,
                    ),
                )

                update = tf.tensor_scatter_nd_add(
                    inspected_count,
                    [[tf.cast(m, tf.int64)]],
                    [tf.constant(1, tf.int64)],
                )
                inspected_count.assign(update)

            items = items.write(m, [])

This function “simply” simulates the monkey business. Ignoring the separation between part 1 and part 2 (for now) let’s only focus on the intensive use of tf.TensorArray done inside this function.

Using tf.TensorArray

In our solution, we made use of tf.TensorArray, which is a mutable and dynamically-sized container for tensors. This data structure is useful when working with sequences of tensors that can change in size during runtime.

The reason we initialized the tf.TensorArray variables inside the @tf.function-decorated function instead of outside is related to the way TensorFlow traces and optimizes functions decorated with @tf.function. When a function is decorated with @tf.function, TensorFlow traces the computation defined by the function and generates a computation graph to optimize its execution. Initializing the tf.TensorArray variables outside the function would cause the traced computation to depend on external state, which can lead to unexpected behavior and difficulties in optimization.

By initializing the tf.TensorArray variables inside the decorated function, we ensure that the computation is self-contained, making it easier for TensorFlow to trace and optimize the computation. Additionally, this helps prevent potential issues that might arise due to external state changes during the lifetime of the TensorArray objects.

In summary, initializing tf.TensorArray variables inside the @tf.function-decorated function ensures a clean separation between the function’s internal computation and external state, allowing TensorFlow to effectively trace, optimize, and execute the computation.

Let’s go back to the solution…

monkey_play(20)
top_values, _ = tf.math.top_k(inspected_count, k=2)
monkey_business = tf.reduce_prod(top_values)
tf.print("Part 1: ", monkey_business)

After running the monkey_play function for 20 rounds, we find the top 2 values of the inspected_count variable and calculate their product, which gives us the answer to part 1 of the problem.

Part 2: Adapting for Larger Iterations

Honestly, I’ve spent some hour reasoning on how to make the second part to not computationally explode - I give up after a bit and I found a spoiler on reddit that clarified pretty well how to reason. I won’t report the spoiler here, but it’s worth reading to learn something new and well explained.

Anyway, the code of the monkey_play function already contains the implementation of the spoiler suggestion, therefore for us it’s just a matter of chaning the variable part and run the code for 20000 iterations as requested. Of course, since we have tf.Variable declared (correctly) outside of the @tf.function-decorated functions, we need to reset the state before proceeding (the first line of the following snipped does it).

inspected_count.assign(tf.zeros_like(inspected_count))
part.assign(2)
monkey_play(10000)
top_values, _ = tf.math.top_k(inspected_count, k=2)
monkey_business = tf.reduce_prod(top_values)
tf.print("Part 2: ", monkey_business)

Conclusion

You can the solution in folder 11 in the dedicated GitHub repository (in the 2022 folder): https://github.com/galeone/tf-aoc.

In this article, we demonstrated how to solve problem 11 of the Advent of Code 2022 using pure TensorFlow. By leveraging TensorFlow’s powerful features and its ability to work with tensors, we were able to efficiently solve both parts of the problem. This unconventional approach showcases the versatility of TensorFlow beyond its typical use in machine learning and deep learning applications. By exploring different techniques and libraries, we can develop creative and efficient solutions to various computational problems.

If you missed the article about the previous days’ solutions, here’s a handy list

For any feedback or comment, please use the Disqus form below - thanks!

Don't you want to miss the next article? Do you want to be kept updated?
Subscribe to the newsletter!

Related Posts

Advent of Code 2022 in pure TensorFlow - Day 12

Solving problem 12 of the AoC 2022 in pure TensorFlow is a great exercise in graph theory and more specifically in using the Breadth-First Search (BFS) algorithm. This problem requires working with a grid of characters representing a graph, and the BFS algorithm allows us to traverse the graph in the most efficient way to solve the problem.

Advent of Code 2022 in pure TensorFlow - Day 10

Solving problem 10 of the AoC 2022 in pure TensorFlow is an interesting challenge. This problem involves simulating a clock signal with varying frequencies and tracking the state of a signal-strength variable. TensorFlow's ability to handle complex data manipulations, control structures, and its @tf.function decorator for efficient execution makes it a fitting choice for tackling this problem. By utilizing TensorFlow's features such as Dataset transformations, efficient filtering, and tensor operations, we can create a clean and efficient solution to this intriguing puzzle.

Advent of Code 2022 in pure TensorFlow - Day 9

In this article, we'll show two different solutions to the Advent of Code 2022 day 9 problem. Both of them are purely TensorFlow solutions. The first one, more traditional, just implement a solution algorithm using only TensorFlow's primitive operations - of course, due to some TensorFlow limitations this solution will contain some details worth reading (e.g. using a pairing function for being able to use n-dimensional tf.Tensor as keys for a mutable hashmap). The second one, instead, demonstrates how a different interpretation of the problem paves the way to completely different solutions. In particular, this solution is Keras based and uses a multi-layer convolutional model for modeling the rope movements.

Advent of Code 2022 in pure TensorFlow - Day 8

Solving problem 8 of the AoC 2022 in pure TensorFlow is straightforward. After all, this problem requires working on a bi-dimensional grid and evaluating conditions by rows or columns. TensorFlow is perfectly suited for this kind of task thanks to its native support for reduction operators (tf.reduce) which are the natural choice for solving problems of this type.

Advent of Code 2022 in pure TensorFlow - Day 7

Solving problem 7 of the AoC 2022 in pure TensorFlow allows us to understand certain limitations of the framework. This problem requires a lot of string manipulation, and TensorFlow (especially in graph mode) is not only not easy to use when working with this data type, but also it has a set of limitations I'll present in the article. Additionally, the strings to work with in problem 7 are (Unix) paths. TensorFlow has zero support for working with paths, and thus for simplifying a part of the solution, I resorted to the pathlib Python module, thus not designing a completely pure TensorFlow solution.

Advent of Code 2022 in pure TensorFlow - Day 6

Solving problem 6 of the AoC 2022 in pure TensorFlow allows us to understand how powerful this framework can be. In particular, problem 6 can be solved with a highly efficient and parallel solution, using just a single feature of tf.data.Dataset: interleave.

Advent of Code 2022 in pure TensorFlow - Day 5

In the first part of the article, I'll explain the solution that solves completely both parts of the puzzle. As usual, focusing on the TensorFlow features used during the solution and all the various technical details worth explaining. In the second part, instead, I'll propose a potential alternative solution to the problem that uses a tf.Variable with an undefined shape. This is a feature of tf.Variable that's not clearly documented and, thus, widely used. So, at the end of this article, we'll understand how to solve the day 5 problem in pure TensorFlow and also have an idea of how to re-design the solution using a tf.Variable with the validate_shape argument set to False.

Advent of Code 2022 in pure TensorFlow - Days 3 & 4

The solutions in pure TensorFlow I designed for days 3 and 4 are both completely based upon the tf.data.Dataset object. In fact, both problems can be seen as the streaming manipulation of the data that's being read from an input dataset.

Advent of Code 2022 in pure TensorFlow - Days 1 & 2

Let's start a tradition. This is the second year in a row I try to solve the Advent of Code (AoC) puzzles using only TensorFlow. This article contains the description of the solutions of the Advent of Code puzzles 1 and 2, in pure TensorFlow.

Integrating third-party libraries as Unreal Engine plugins: ABI compatibility and Linux toolchain

The Unreal Build Tool (UBT) official documentation explains how to integrate a third-party library into Unreal Engine projects in a very broad way without focusing on the real problems that are (very) likely to occur while integrating the library. In particular, when the third-party library is a pre-built binary there are low-level details that must be known and that are likely to cause troubles during the integration - or even make it impossible!