Advent of Code 2022 in pure TensorFlow - Day 11


In this article, we demonstrate how to solve problem 11 of the Advent of Code 2022 using pure TensorFlow. While TensorFlow is primarily known for its applications in deep learning and neural networks, it offers powerful and flexible tools for working with tensors and performing computations on them. The problem at hand revolves around the interactions of multiple monkeys inspecting items, making decisions based on their worry levels, and following a set of rules. By leveraging TensorFlow’s features, such as tf.TensorArray, tf.data.Dataset.scan, and tf.function, we can implement an efficient and elegant solution to this challenging puzzle. We will delve into the code, analyze different sections, and explain the rationale behind the various techniques used. Furthermore, this article provides insights into how TensorFlow can be employed for solving complex problems beyond its traditional applications in machine learning.

Day 11: Monkey in the Middle

In the Advent of Code 2022 problem 11, you are tasked with predicting the behavior of monkeys who have stolen your items. The monkeys operate based on your worry level for each item. The input consists of a description of multiple monkeys, including their starting items (represented by worry levels), an operation to modify the worry level, a test to decide where to throw the item, and the destination monkeys for both true and false test outcomes. The monkeys take turns inspecting and throwing items in rounds.

The input dataset looks like

Monkey 0:
  Starting items: 79, 98
  Operation: new = old * 19
  Test: divisible by 23
    If true: throw to monkey 2
    If false: throw to monkey 3

Monkey 1:
  Starting items: 54, 65, 75, 74
  Operation: new = old + 6
  Test: divisible by 19
    If true: throw to monkey 2
    If false: throw to monkey 0

[...]

Each monkey has several attributes: Starting items, Operation, Test, If true, and If false. In the example above, Monkey 0 starts with items that have worry levels of 79 and 98. The monkey modifies the worry level by multiplying it by 19. It then checks if the modified worry level is divisible by 23. If it is, Monkey 0 throws the item to Monkey 2; otherwise, it throws the item to Monkey 3.

The monkeys take turns inspecting and throwing items in rounds. In the first round, Monkey 0 inspects the first item with a worry level of 79. So a simulation of the first round for Monkey 0 looks like

Monkey inspects an item with a worry level of 79.
Worry level is multiplied by 19 to 1501.
Monkey gets bored with item. Worry level is divided by 3 to 500.
Current worry level is not divisible by 23.
Item with worry level 500 is thrown to monkey 3.

The goal is to find the two most active monkeys after 20 rounds for part 1 and 10000 rounds for part 2.

The level of monkey business is the product of the total number of items inspected by the two most active monkeys over the rounds. For instance, in the example input, after 20 rounds the two most active monkeys have inspected items 101 and 105 times, so the level of monkey business is 101 * 105 = 10605.

Parsing the input

To begin, we read the input data using TensorFlow’s tf.data.TextLineDataset and concatenate it with an empty line. This way, we can detect when the dataset ends and reset the monkey state accordingly. We then create a dataset using the scan method to extract information about the monkeys and their operations. The init function, used inside scan, processes the whole input dataset and forwards a state on each iteration. We use the pos variable to keep track of the line being read, knowing that in the dataset every monkey description is separated from the next one by an empty line.

dataset = tf.data.TextLineDataset(input_path.as_posix())  # input_path: pathlib.Path to the puzzle input
# Append a final empty line so the last monkey description is also flushed
dataset = dataset.concatenate(tf.data.Dataset.from_tensor_slices([""]))

# A monkey is described by 6 string fields:
# [id, starting items, operation, divisor, if-true dest, if-false dest]
monkey = tf.Variable(["", "", "", "", "", ""], dtype=tf.string)
monkey_id = tf.Variable(-1)
pos = tf.Variable(0)

initial_state = tf.constant(["", "", "", "", "", ""])

def init(old_state, line):
    # An empty line closes a monkey description: save the accumulated state
    # into the monkey variable, reset the line counter, and signal "ready"
    if tf.equal(line, ""):
        monkey.assign(old_state, use_locking=True)
        pos.assign(0)
        return initial_state, True

    if tf.strings.regex_full_match(line, r"^Monkey \d*:$"):
        # pos == 0: the "Monkey N:" header, extract the monkey id
        items = tf.strings.split(tf.strings.split([line], " ")[0][1], ":")[0]
        updates = [items]
    elif tf.equal(pos, 1):
        # pos == 1: the comma-separated starting items (worry levels)
        items = tf.strings.strip(tf.strings.split([line], ":")[0][1])
        updates = [items]
    elif tf.equal(pos, 2):
        # pos == 2: the right-hand side of the operation, e.g. "old * 19"
        op = tf.strings.strip(tf.strings.split([line], "="))[0][1]
        updates = [op]
    elif tf.equal(pos, 3):
        # pos == 3: the divisor used by the test
        divisible_by = tf.strings.strip(tf.strings.split([line], " "))[0][-1]
        updates = [divisible_by]
    else:  # if tf.reduce_any([tf.equal(pos, 4), tf.equal(pos, 5)]):
        # pos == 4 / 5: the destination monkeys for the true/false outcomes
        monkey_dest = tf.strings.strip(tf.strings.split([line], " "))[0][-1]
        updates = [monkey_dest]

    # Store the extracted field at position pos of the 6-element state
    indices = tf.reshape(pos, (1, 1))
    new_state = tf.tensor_scatter_nd_update(old_state, indices, updates)
    pos.assign_add(1)

    return new_state, False

dataset = dataset.scan(initial_state, init)
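
Each element of the scanned dataset is the boolean returned by init: it is True only on the empty line that closes a monkey description, and at that point the monkey variable holds the six parsed fields. As a hypothetical peek at the parsing result (the same pattern monkey_play uses later), we could loop over the dataset and print every monkey as soon as it is complete:

# Hypothetical peek: print every monkey as soon as the scan signals
# that its description is complete
for monkey_ready in dataset:
    if monkey_ready:
        tf.print(monkey)  # [id, items, operation, divisor, if-true dest, if-false dest]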

Applying Operations

We define the apply_operation function to perform the specified operation on the worry level according to the monkey’s rules. This function takes the current worry level and the operation as inputs and returns the updated worry level after applying the operation.

@tf.function
def apply_operation(worry_level, op):
    op = tf.strings.split([op], " ")[0]  # lhs, op, rhs
    ret = tf.constant(0, tf.int64)
    # lhs always = "old"
    if tf.strings.regex_full_match(op[2], r"^\d*$"):
        val = tf.strings.to_number(op[2], tf.int64)
    else:
        val = worry_level
    if tf.equal(op[1], "+"):
        ret = worry_level + val
    if tf.equal(op[1], "*"):
        ret = worry_level * val

    return ret
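
As a quick sanity check (not part of the original solution), we can verify that the function reproduces the first inspection of the example round shown above:

# Hypothetical check: Monkey 0 inspects the item with worry level 79 and
# applies "old * 19", yielding 1501 as in the example round
tf.print(apply_operation(tf.constant(79, tf.int64), tf.constant("old * 19")))  # 1501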

Finding the Monkey Business

We create a function called monkey_play to simulate the monkeys’ actions for a given number of rounds. Inside this function, we loop through the dataset and perform the required operations for each monkey based on their rules. We also keep track of the number of items inspected by each monkey in the inspected_count variable.
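
Note that monkey_play also references two values whose definition is not shown here: monkey_count, the number of monkeys in the input, and inspected_count, the per-monkey inspection counters. A plausible definition, inferred from how they are used (the monkey count is hard-coded as an assumption: 8 in the full puzzle input, 4 in the example above), could be:

# Assumed definitions, inferred from how the values are used below
monkey_count = 8  # number of monkeys in the puzzle input (assumption)
inspected_count = tf.Variable(tf.zeros([monkey_count], dtype=tf.int64))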

The problem is the very same for both parts: part 1 asks to simulate the monkey behavior for a small number of rounds (20), while part 2 asks to simulate it for 10000 rounds. Since in part 2 the worry levels are no longer divided by 3, they grow without bound, which makes the “trivial” implementation computationally impossible, and we need a mathematical trick to keep the problem solvable.

We use a tf.Variable to switch between part 1 and part 2 inside the monkey_play function.

part = tf.Variable(1)

@tf.function
def monkey_play(rounds):
    items = tf.TensorArray(tf.int64, size=1, dynamic_size=True)
    operation = tf.TensorArray(tf.string, size=1, dynamic_size=True)
    divisible_test = tf.TensorArray(tf.int64, size=1, dynamic_size=True)
    throw_if_true = tf.TensorArray(tf.int32, size=1, dynamic_size=True)
    throw_if_false = tf.TensorArray(tf.int32, size=1, dynamic_size=True)

    # Consume the parsed dataset: monkey_ready is True when the monkey
    # variable holds a complete description
    # [id, items, operation, divisor, if-true dest, if-false dest]
    for monkey_ready in dataset:
        if monkey_ready:
            idx = tf.strings.to_number(monkey[0], tf.int32)
            items = items.write(
                idx,
                tf.strings.to_number(tf.strings.split(monkey[1], ","), tf.int64),
            )
            operation = operation.write(idx, monkey[2])
            divisible_test = divisible_test.write(
                idx, tf.strings.to_number(monkey[3], tf.int64)
            )
            throw_if_true = throw_if_true.write(
                idx, tf.strings.to_number(monkey[4], tf.int32)
            )
            throw_if_false = throw_if_false.write(
                idx, tf.strings.to_number(monkey[5], tf.int32)
            )

    # Part 1: the worry level is divided by 3 after every inspection.
    # Part 2: keep the worry level bounded by reducing it modulo the
    # product of all the divisibility-test values.
    if tf.equal(part, 1):
        divisor = tf.constant(3, tf.int64)
    else:
        divisor = tf.reduce_prod(divisible_test.stack())

    for r in tf.range(rounds):
        # Now items contains all the starting items for every monkey
        # Let's play
        for m in tf.range(monkey_count):
            m_items = items.read(m)
            op = operation.read(m)
            test = divisible_test.read(m)

            for i in tf.range(tf.shape(m_items)[0]):
                worry_level = apply_operation(m_items[i], op)
                if tf.equal(part, 1):
                    worry_level //= divisor
                else:
                    worry_level = tf.math.mod(worry_level, divisor)

                if tf.equal(tf.math.mod(worry_level, test), 0):
                    dest = throw_if_true.read(m)
                else:
                    dest = throw_if_false.read(m)

                items = items.write(
                    dest,
                    tf.concat(
                        [items.read(dest), tf.expand_dims(worry_level, axis=0)],
                        axis=0,
                    ),
                )

                update = tf.tensor_scatter_nd_add(
                    inspected_count,
                    [[tf.cast(m, tf.int64)]],
                    [tf.constant(1, tf.int64)],
                )
                inspected_count.assign(update)

            # The monkey has thrown all of its items: empty its list
            items = items.write(m, [])

This function “simply” simulates the monkey business. Ignoring the separation between part 1 and part 2 for now, let’s focus only on the intensive use of tf.TensorArray inside this function.

Using tf.TensorArray

In our solution, we made use of tf.TensorArray, which is a mutable and dynamically-sized container for tensors. This data structure is useful when working with sequences of tensors that can change in size during runtime.

The reason we create the tf.TensorArray objects inside the @tf.function-decorated function, instead of outside, is related to the way TensorFlow traces and optimizes functions decorated with @tf.function. When a function is decorated with @tf.function, TensorFlow traces the computation it defines and generates a computation graph to optimize its execution. Creating the tf.TensorArray objects outside the function would make the traced computation depend on external state, which can lead to unexpected behavior and difficulties in optimization.

By creating the tf.TensorArray objects inside the decorated function, we ensure that the computation is self-contained, making it easier for TensorFlow to trace and optimize. Additionally, this helps prevent potential issues that might arise from external state changes during the lifetime of the TensorArray objects.

In summary, creating tf.TensorArray objects inside the @tf.function-decorated function ensures a clean separation between the function’s internal computation and external state, allowing TensorFlow to effectively trace, optimize, and execute the computation.
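
As a minimal standalone sketch (not part of the solution), this is the typical usage pattern: the array is created inside the traced function, grows dynamically, and every write returns the updated array, which must be re-assigned:

import tensorflow as tf

@tf.function
def squares(n):
    # Created inside the traced function, the array grows as needed
    ta = tf.TensorArray(tf.int32, size=0, dynamic_size=True)
    for i in tf.range(n):
        # write() returns the updated array: re-assigning it is mandatory in graph mode
        ta = ta.write(i, i * i)
    return ta.stack()

tf.print(squares(tf.constant(5)))  # [0 1 4 9 16]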

Let’s go back to the solution…

monkey_play(20)
top_values, _ = tf.math.top_k(inspected_count, k=2)
monkey_business = tf.reduce_prod(top_values)
tf.print("Part 1: ", monkey_business)

After running the monkey_play function for 20 rounds, we find the top 2 values of the inspected_count variable and calculate their product, which gives us the answer to part 1 of the problem.

Part 2: Adapting for Larger Iterations

Honestly, I spent some hours reasoning about how to keep the second part from exploding computationally. I gave up after a bit and found a spoiler on Reddit that clarified the reasoning pretty well. I won’t report the spoiler here, but it’s worth reading to learn something new and well explained.

Anyway, the code of the monkey_play function already contains the implementation of the spoiler suggestion, therefore for us it’s just a matter of changing the part variable and running the code for the 10000 iterations requested. Of course, since the tf.Variable objects are declared (correctly) outside of the @tf.function-decorated functions, we need to reset the state before proceeding (the first line of the following snippet does it).

inspected_count.assign(tf.zeros_like(inspected_count))
part.assign(2)
monkey_play(10000)
top_values, _ = tf.math.top_k(inspected_count, k=2)
monkey_business = tf.reduce_prod(top_values)
tf.print("Part 2: ", monkey_business)

Conclusion

You can find the solution in folder 11 of the dedicated GitHub repository (in the 2022 folder): https://github.com/galeone/tf-aoc.

In this article, we demonstrated how to solve problem 11 of the Advent of Code 2022 using pure TensorFlow. By leveraging TensorFlow’s powerful features and its ability to work with tensors, we were able to efficiently solve both parts of the problem. This unconventional approach showcases the versatility of TensorFlow beyond its typical use in machine learning and deep learning applications. By exploring different techniques and libraries, we can develop creative and efficient solutions to various computational problems.

If you missed the articles about the previous days’ solutions, here’s a handy list

For any feedback or comment, please use the Disqus form below - thanks!

Don't want to miss the next article? Do you want to be kept updated?
Subscribe to the newsletter!
