Advent of Code 2021 in pure TensorFlow - day 11


The Day 11 problem has lots in common with Day 9. In fact, will re-use some computer vision concepts like the pixel neighborhood, and we’ll be able to solve both parts in pure TensorFlow by using only a tf.queue as a support data structure.

Day 11: Dumbo Octopus

You can click on the title above to read the full text of the puzzle. The TLDR version is:

Our dataset is a grid of numbers. Every number represents an energy level of an octopus. On every time step, every energy level increases by 1 unit, and when the energy level goes beyond 9, the octopus flashes. When an octopus flashes it increases the energy level of all (in every direction) the octopus in the neighborhood. This may trigger a cascade of flashes. Finally, any octopus that flashed during this time step, rests its energy level to 0.

For example:

Before any steps:
11111
19991
19191
19991
11111

After step 1:
34543
40004
50005
40004
34543

After step 2:
45654
51115
61116
51115
45654

An octopus is highlighted when it flashed during the given time step. Part one asks us to find out how many total flashes are there after 100 steps?

Design phase: part one

The problem requires modeling the increment on every time step, and the propagation of the increment to the neighborhood when an energy level goes beyond 9. Thus, considering the input dataset as a 2D grayscale image, we can treat every energy level as a pixel. Thus, we need:

  1. Define a function that given a pixel returns the 8-neighborhood coordinates. Differently from the solution presented in day 9: design phase, where the problem required us to only look at the 4-neighborhood, this time we need to consider also the diagonal pixels. Moreover, we can’t simplify in the same way the problem padding the input image, since every update may trigger a cascade of updates that may involve the padding pixel introduced, and we don’t want it.
  2. Given that every flash may trigger a cascade of flash, we need to increment by 1 every value in the neighborhood of a flashing pixel and keep track of all the neighbors that exceeded the value of 9. Pushing the coordinates into a queue, and repeating the process for every pixel in the queue until the propagation is not complete.

Input pipeline

We create a tf.data.Dataset object for reading the text file line-by-line as usual. Since we want to work with all the pixels at once, we convert the dataset in a 2D tf.Tensor - this is our octopus population.

population = tf.convert_to_tensor(
    list(
        tf.data.TextLineDataset("fake")
        .map(tf.strings.bytes_split)
        .map(lambda string: tf.strings.to_number(string, out_type=tf.int64))
    )
)

We can now start implementing our TensorFlow program for solving part 1. We can start by the definition of the _neighs function.

The 8-neighborhood

The first point of the design phase explains why we need to take care of the 8-neighborhood without padding the input. Hence, our neighborhood will contain 8 pixels if we are inside the image. It will contain 3 or 5 pixels of we are on a corner or on the side of the image, respectively.

We need to take care of these conditions manually.

@staticmethod
@tf.function
def _neighs(grid: tf.Tensor, center: tf.Tensor) -> Tuple[tf.Tensor, tf.Tensor]:
    y, x = center[0], center[1]

    shape = tf.shape(grid, tf.int64) - 1

    if tf.logical_and(tf.less(y, 1), tf.less(x, 1)):  # 0,0
        mask = tf.constant([(1, 0), (0, 1), (1, 1)])
    elif tf.logical_and(tf.equal(y, shape[0]), tf.equal(x, shape[1])):  # h,w
        mask = tf.constant([(-1, 0), (0, -1), (-1, -1)])
    elif tf.logical_and(tf.less(y, 1), tf.equal(x, shape[1])):  # top right
        mask = tf.constant([(0, -1), (1, 0), (1, -1)])
    elif tf.logical_and(tf.less(x, 1), tf.equal(y, shape[0])):  # bottom left
        mask = tf.constant([(-1, 0), (-1, 1), (0, 1)])
    elif tf.less(x, 1):  # left
        mask = tf.constant([(1, 0), (-1, 0), (-1, 1), (0, 1), (1, 1)])
    elif tf.equal(x, shape[0]):  # right
        mask = tf.constant([(-1, 0), (1, 0), (0, -1), (-1, -1), (1, -1)])
    elif tf.less(y, 1):  # top
        mask = tf.constant([(0, -1), (0, 1), (1, 0), (1, -1), (1, 1)])
    elif tf.equal(y, shape[1]):  # bottom
        mask = tf.constant([(0, -1), (0, 1), (-1, 0), (-1, -1), (-1, 1)])
    else:  # generic
        mask = tf.constant(
            [(-1, 0), (0, -1), (1, 0), (0, 1), (-1, -1), (1, 1), (-1, 1), (1, -1)]
        )

    coords = center + tf.cast(mask, tf.int64)
    neighborhood = tf.gather_nd(grid, coords)
    return neighborhood, coords

The _neihgs function accepts the grid (2D tf.Tensor) and the coordinate of a pixel, that’s the center of the neighborhood.

Depending on the center coordinate respect to the grid, the function builds a different mask used to create the pixel coordinate and gather the value in that coordinates.

Note that TensorFlow coordinates, by convention, are in y,x format and not in x,y format. We are now ready for implementing the algorithm.

Simulating the population behavior

The puzzle asks us to simulate 100 steps, and count how many flashes happened during them. Thus, we need to completely simulate the octopus population behavior over time.

We already have our data organized in a grid, and from the second point of the design phase, we know that the unique data structure needed for model the propagation is a tf.queue.

def __init__(self, population, steps):
    super().__init__()

    self._steps = steps
    self._population = tf.Variable(population, dtype=tf.int64)
    self._counter = tf.Variable(0, dtype=tf.int64)

    self._zero = tf.constant(0, dtype=tf.int64)
    self._one = tf.constant(1, dtype=tf.int64)
    self._nine = tf.constant(9, tf.int64)
    self._ten = tf.constant(10, dtype=tf.int64)

    self._queue = tf.queue.FIFOQueue(-1, [tf.int64])

    self._flashmap = tf.Variable(tf.zeros_like(self._population))

The constructor requires the population and the number of steps to simulate. The population input variable is then used to initialize a tf.Variable that we’ll use to store the population status after every time step.

The _counter variable will be our puzzle output, and the _flashmap is a helper variable with the same shape of the population that we use to keep track of the flashing pixels during every time step - so we can know what pixel flashed and keep track of them.

Other than the _queue used for modeling the flash propagation, we also define some constant with the dtype tf.int64 that we’ll later use inside the TensorFlow program. This is a good practice to follow since we avoid the creation of several different (and useless) constants in the graph, and we’ll always refer to the same constants instead.

Modeling the population behavior is straightforward - it’s just a matter of converting in TensorFlow code the puzzle instructions.

@tf.function
def __call__(self):
    for step in tf.range(self._steps):
        # First, the energy level of each octopus increases by 1.
        self._population.assign_add(tf.ones_like(self._population))

        # Then, any octopus with an energy level greater than 9 flashes.
        flashing_coords = tf.where(tf.greater(self._population, self._nine))
        self._queue.enqueue_many(flashing_coords)

        # This increases the energy level of all adjacent octopuses by 1, including octopuses that are diagonally adjacent.
        # If this causes an octopus to have an energy level greater than 9, it also flashes.
        # This process continues as long as new octopuses keep having their energy level increased beyond 9.
        # (An octopus can only flash at most once per step.)
        while tf.greater(self._queue.size(), 0):
            p = self._queue.dequeue()
            if tf.greater(self._flashmap[p[0], p[1]], 0):
                continue
            self._flashmap.scatter_nd_update([p], [1])

            _, neighs_coords = self._neighs(self._population, p)
            updates = tf.repeat(
                self._one,
                tf.shape(neighs_coords, tf.int64)[0],
            )
            self._population.scatter_nd_add(neighs_coords, updates)
            flashing_coords = tf.where(tf.greater(self._population, self._nine))
            self._queue.enqueue_many(flashing_coords)

        # Finally, any octopus that flashed during this step has its energy level set to 0, as it used all of its energy to flash.
        indices = tf.where(tf.equal(self._flashmap, self._one))
        if tf.greater(tf.size(indices), 0):
            shape = tf.shape(indices, tf.int64)
            updates = tf.repeat(self._zero, shape[0])
            self._counter.assign_add(shape[0])
            self._population.scatter_nd_update(indices, updates)

        self._flashmap.assign(tf.zeros_like(self._flashmap))

        # tf.print(step, self._population, summarize=-1)
    return self._counter

The code is the precise implementation of the puzzle instructions, the comments are part of the original puzzle text and after every comment, there’s the equivalent TensorFlow implementation.

Execution

Here we go!

steps = tf.constant(100, tf.int64)
flash_counter = FlashCounter(population, steps)
tf.print("Part one: ", flash_counter())

Part one is solved! Let’s see what part two is about.

Design phase: part 2

The propagation of the flashes causes a nice phenomenon: the octopuses are synchronizing! For example


After step 194:
6988888888
9988888888
8888888888
8888888888
8888888888
8888888888
8888888888
8888888888
8888888888
8888888888

After step 195:
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000

Part two asks us to determine the first step during which all the octopuses flash.

Part two implementation

Our algorithm already models the population behavior, hence we already have implemented all we need to solve part two.

In fact, part two solution is identical to part 1 solution with the for loop replaced with a while. In part one, we have a fixed number of steps, instead this time we re-use the counter variable to keep track of how many steps we performed until the while condition holds.

@tf.function
def find_sync_step(self):
    # use count as step
    self._counter.assign(0)
    while tf.logical_not(tf.reduce_all(tf.equal(self._population, self._zero))):
         self._counter.assign_add(1)
         # First, the energy level of each octopus increases by 1.
         # [ very same code of previous solution, omitted!]
         # Full version:
         # https://github.com/galeone/tf-aoc/blob/main/11/main.py#L66

Since we re-used the same internal status of the FlashCounter class, we have two options. Create a new object for invoking the find_sync_step or speeding up the computation knowing that our search starts from step 100 used to solve part 1.

tf.print("Part two: ", steps + flash_counter.find_sync_step())

Here we go! Challenge 11 is solved in pure TensorFlow pretty easily.

Conclusion

You can see the complete solution in folder 11 on the dedicated Github repository: https://github.com/galeone/tf-aoc.

Solving this problem has, once again, demonstrated how TensorFlow can be used to solve any kind of programming problem, and how we can think of TensorFlow as a different programming language, especially if we are able to write pure TensorFlow programs (hence tf.function graph-converted functions).

If you missed the articles about the previous days’ solutions, here’s a handy list:

The next article will be about my solution to Day 12 problem. In that article I’ll show how to work with graphs in the “traditional” way: no neural networks, only an adjacency matrix, and a search algorithm implemented using recursion. However, tf.function can’t be used to write recursive functions :( hence I end up writing a pure TensorFlow eager solution.

For any feedback or comment, please use the Disqus form below - thanks!

Don't you want to miss the next article? Do you want to be kept updated?
Subscribe to the newsletter!

Related Posts

Fixing the code signing and notarization issues of Unreal Engine (5.3+) projects

Starting from Unreal Engine 5.3, Epic Games added support for the so-called modern Xcode workflow. This workflow allows the Unreal Build Tool (UBT) to be more consistent with the standard Xcode app projects, and to be compliant with the Apple requirements for distributing applications... In theory! 😅 In practice this workflow is flawed: both the code signing and the framework supports are not correctly implemented, making the creation of working apps and their distribution impossible. In this article, we'll go through the problems faced during the packaging, code signing, and notarization of an Unreal Engine application on macOS and end up with the step-by-step process to solve them all.

The (Hidden?) Costs of Vertex AI Resource Pools: A Cautionary Tale

In the article "Custom model training & deployment on Google Cloud using Vertex AI in Go" we explored how to leverage Go to create a resource pool and train a machine learning model using Vertex AI's allocated resources. While this approach offers flexibility, there's a crucial aspect to consider: the cost implications of resource pools. This article details my experience with a sudden price increase in Vertex AI and the hidden culprit – a seemingly innocuous resource pool.

Building a RAG for tabular data in Go with PostgreSQL & Gemini

In this article we explore how to combine a large language model (LLM) with a relational database to allow users to ask questions about their data in a natural way. It demonstrates a Retrieval-Augmented Generation (RAG) system built with Go that utilizes PostgreSQL and pgvector for data storage and retrieval. The provided code showcases the core functionalities. This is an overview of how the "chat with your data" feature of fitsleepinsights.app is being developed.

Using Gemini in a Go application: limits and details

This article explores using Gemini within Go applications via Vertex AI. We'll delve into the limitations encountered, including the model's context window size and regional restrictions. We'll also explore various methods for feeding data to Gemini, highlighting the challenges faced due to these limitations. Finally, we'll briefly introduce RAG (Retrieval-Augmented Generation) as a potential solution, but leave its implementation details for future exploration.

Custom model training & deployment on Google Cloud using Vertex AI in Go

This article shows a different approach to solving the same problem presented in the article AutoML pipeline for tabular data on VertexAI in Go. This time, instead of relying on AutoML we will define the model and the training job ourselves. This is a more advanced usage that allows the experienced machine learning practitioner to have full control on the pipeline from the model definition to the hardware to use for training and deploying. At the end of the article, we will also see how to use the deployed model. All of this, in Go and with the help of Python and Docker for the custom training job definition.

Integrating third-party libraries as Unreal Engine plugins: solving the ABI compatibility issues on Linux when the source code is available

In this article, we will discuss the challenges and potential issues that may arise during the integration process of a third-party library when the source code is available. It will provide guidance on how to handle the compilation and linking of the third-party library, manage dependencies, and resolve compatibility issues. We'll realize a plugin for redis plus plus as a real use case scenario, and we'll see how tough can it be to correctly compile the library for Unreal Engine - we'll solve every problem step by step.

AutoML pipeline for tabular data on VertexAI in Go

In this article, we delve into the development and deployment of tabular models using VertexAI and AutoML with Go, showcasing the actual Go code and sharing insights gained through trial & error and extensive Google research to overcome documentation limitations.

Advent of Code 2022 in pure TensorFlow - Day 12

Solving problem 12 of the AoC 2022 in pure TensorFlow is a great exercise in graph theory and more specifically in using the Breadth-First Search (BFS) algorithm. This problem requires working with a grid of characters representing a graph, and the BFS algorithm allows us to traverse the graph in the most efficient way to solve the problem.

Advent of Code 2022 in pure TensorFlow - Day 11

In this article, we'll show how to solve problem 11 from the Advent of Code 2022 (AoC 2022) using TensorFlow. We'll first introduce the problem and then provide a detailed explanation of our TensorFlow solution. The problem at hand revolves around the interactions of multiple monkeys inspecting items, making decisions based on their worry levels, and following a set of rules.

Advent of Code 2022 in pure TensorFlow - Day 10

Solving problem 10 of the AoC 2022 in pure TensorFlow is an interesting challenge. This problem involves simulating a clock signal with varying frequencies and tracking the state of a signal-strength variable. TensorFlow's ability to handle complex data manipulations, control structures, and its @tf.function decorator for efficient execution makes it a fitting choice for tackling this problem. By utilizing TensorFlow's features such as Dataset transformations, efficient filtering, and tensor operations, we can create a clean and efficient solution to this intriguing puzzle.