Advent of Code 2021 in pure TensorFlow - day 5


The day 5 challenge is easily solvable in pure TensorFlow thanks to its support for various distance functions and the power of the tf.math package. The problem only requires some basic math knowledge to be completely solved - and a little bit of computer vision experience doesn’t hurt.

Day 5: Hydrothermal Venture

You can click on the title above to read the full text of the puzzle. The TLDR version is:

The puzzle input contains segments coordinates, in the format x1,y1 -> x2,y2, like

0,9 -> 5,9
8,0 -> 0,8
9,4 -> 3,4
2,2 -> 2,1
7,0 -> 7,4

A segment is not only start and end points, but it is made of all the points that connect start and end.

  • An entry like 1,1 -> 1,3 covers points 1,1, 1,2, and 1,3.
  • An entry like 9,7 -> 7,7 covers points 9,7, 8,7, and 7,7.

The puzzle asks us to “draw” all the lines and count the number of points where at least two lines overlap.

Part 1 asks us to focus only on the horizontal and vertical lines (x1 = x2 or y1 = y2), while (spoiler) part 2 asks to consider also diagonal segments at exactly 45 degrees.

We’ll design a solution for both parts, that will compute the correct output depending on a boolean flag.

Design phase

The problem is pretty trivial if one is used to working with pixels. In fact, the problem can be seen as the implementation of a segment drawing function over 2D gray-scale image.

  • The 2D image resolution is given by the minimum bounding box containing all the points.
  • The segment orientation can be easily found by using the atan2 on the direction vector.
  • The pixel coordinates to draw are the result of the linear interpolation between the start and end points.
  • The pixel coordinates must be integers since it makes no sense to draw in positions like (0.8, 1.5). This constraint is automatically solved by the problem constraints: in 2D grid segments at 45 degrees, horizontal, and vertical all of them have integer coordinates.
  • The number of pixels produced by the interpolation should be precisely the Chebyshev distance (also called \(L_\infty\) distance or chessboard distance) between the start and end points.

The last point is the most important one since it constrains the number of pixel coordinates generated by the linear interpolation and makes the coordinates always integers.

Input pipeline

We create a tf.data.Dataset object for reading the text file line-by-line as usual. All we need to do is to choose a handy representation for the elements produced at every iteration. I choose the traditional representation of a line using pairs of coordinates.

def _get_segment(
    line: tf.Tensor,
) -> Tuple[tf.Tensor, tf.Tensor]:
    points = tf.strings.split(line, " -> ")
    p1 = tf.strings.split(points[0], ",")
    p2 = tf.strings.split(points[1], ",")

    x1 = tf.strings.to_number(p1[0], tf.int64)
    y1 = tf.strings.to_number(p1[1], tf.int64)
    x2 = tf.strings.to_number(p2[0], tf.int64)
    y2 = tf.strings.to_number(p2[1], tf.int64)
    return tf.convert_to_tensor((x1, y1)), tf.convert_to_tensor((x2, y2))

dataset = tf.data.TextLineDataset("input").map(_get_segment)

The _get_segment function (that it’s executed in graph mode even if not decorated explicitly with @tf.function) strips the -> and creates the pair of points: ((x1, y1), (x2, y2)).

Creating the Grid

A 2D image is a grid of pixels. The resolution of the image (the number of columns and rows of the grid) is given by the minimum bounding box containing all the points.

Since we need to draw on this grid, this object should be a tf.Variable. We can thus define the __init__ of our TensorFlow program in this way.

class Grid(tf.Module):
    def __init__(self, dataset):
        super().__init__()
        bbox_w = tf.reduce_max(list(dataset.map(lambda p1, p2: (p1[0], p2[0])))) + 1
        bbox_h = tf.reduce_max(list(dataset.map(lambda p1, p2: (p1[1], p2[1])))) + 1
        self._grid = tf.Variable(
            tf.zeros((bbox_w, bbox_h), dtype=tf.int64), trainable=False
        )
        self._dataset = dataset

In short, we just searched the maximum coordinates along the x and y axes and added 1 because the given coordinates include zero as a valid location.

Before doing the actual drawing, we can implement an interpolation function that uses the Chebyshev norm.

Chessboard interpolation

As presented in the design section, the 2D grid constraints the pixel coordinates to be integers. Using the Chebyshev distance is the correct way of finding the number of pixels to generate from the interpolation process.

TensorFlow allows us to interpolate along the x and y axes in parallel with a single function call.

@staticmethod
@tf.function
def interpolate(p1: tf.Tensor, p2: tf.Tensor):
    """Linear interpolation from p1 to p2 in the discrete 2D grid.
    Args:
        p1: Tensor with values (x, y)
        p2: Tensor with values (x, y)
    Returns:
        The linear interpolation in the discrete 2D grid.
    """
    # +1 handles the case of p1 - p2 == 1
    norm = tf.norm(tf.cast(p1 - p2, tf.float32), ord=tf.experimental.numpy.inf) + 1
    return tf.cast(
        tf.math.ceil(tf.linspace(p1, p2, tf.cast(norm, tf.int64))), tf.int64
    )

Since the challenge in the challenge is to only use TensorFlow, I used tf.experimental.numpy.inf instead of numpy.inf for specifying the inf value required by the tf.norm function to use the Chessboard distance ( \(L_\infty\)).

There’s now only to implement the drawing logic.

Drawing lines on the grid

Having decided to solve with the same code both parts of the puzzle, I can just define the call method accepting a boolean flag that will change the behavior of the method.

Being stateful, we need to remember to reset to 0 the grid state when the method is called, before starting looping over the self._dataset.

@tf.function
def __call__(self, part_one: tf.Tensor) -> tf.Tensor:
    """Given the required puzzle part, changes the line drawing on the grid
    and the intersection couunt.
    Args:
        part_one: boolean tensor. When true, only consider straight lines and
                  a threshold of 1. When false, consider straight lines and diagonal
                  lines.
    Returns
        the number of intersections
    """
    self._grid.assign(tf.zeros_like(self._grid))

Now, we can change the loop behavior depending on the part_one value. We can define the logic as follows

  • Use tf.math.atan2 over the direction vector to get the angle in radians. Convert it to degrees and make it always positive by summing 360 if it’s negative.
  • If part_one is True: consider only horizontal and vertical lines.
  • If part_one is False: check if the angle is exactly at 45 degrees or the lines are horizontal and vertical.

In both cases, if the condition holds, we can interpolate between the start and end points to get the pixels and draw them on the image using the assign method of tf.Variable and the tf.tensor_scatter_nd_add function for incrementing by 1 only the values in the pixels coordinate.

    for start, end in self._dataset:
        # Discrete interpolation between start and end
        # part 1 requires to consider only straight lines
        # (x1 = x2 or y1 = y2)
        # but I guess (hope) doing the generic discrete interpolation
        # will simplify part 2 (no idea, just a guess)
        float_start = tf.cast(start, tf.float32)
        float_end = tf.cast(end, tf.float32)
        direction = float_start - float_end
        angle = (
            tf.math.atan2(direction[1], direction[0])
            * 180
            / tf.experimental.numpy.pi
        )
        if tf.less(angle, 0):
            angle = 360 + angle
        if tf.logical_or(
            tf.logical_and(
                tf.logical_not(part_one),
                tf.logical_and(
                    tf.logical_not(tf.equal(tf.math.mod(angle, 90), 0)),
                    tf.equal(tf.math.mod(angle, 45), 0),
                ),
            ),
            tf.logical_or(
                tf.equal(start[0], end[0]),
                tf.equal(start[1], end[1]),
            ),
        ):
            pixels = self.interpolate(start, end)
            self._grid.assign(
                tf.tensor_scatter_nd_add(
                    self._grid, pixels, tf.ones(tf.shape(pixels)[0], dtype=tf.int64)
                )
            )

At the end of the loop, we can just check how many values are greater than 1 and count them, to get the expected result.

    # tf.print(tf.transpose(grid, perm=(1, 0)), summarize=-1)
    threshold = tf.constant(1, tf.int64)
    mask = tf.greater(self._grid, threshold)
    return tf.reduce_sum(tf.cast(mask, tf.int64))

Execution

Here we go!

grid = Grid(dataset)

tf.print("# overlaps (part one): ", grid(tf.constant(True)))
tf.print("# overlaps (part two): ", grid(tf.constant(False)))

Day 5 puzzle solved in both parts :)

Conclusion

You can see the complete solution in folder 5 on the dedicated Github repository: https://github.com/galeone/tf-aoc.

Solving this problem having experience in the computer vision domain has been easy since there are lots of concepts from that domain that can be applied in this problem resolution.

The challenge in the challenge of using only TensorFlow for solving the problem is slowly progressing, so far I solved all the puzzles up to Day 8 (inclusive). So get ready for at least 3 more articles :) Let’s see when (and if!) TensorFlow alone won’t be enough.

If you missed the articles about the previous days’ solutions, here’s a handy list:

The next article will be about my solution of Day 6 problem. I’ll present 2 different solutions, one computationally intensive but easy to understand, and another developed for solving the part 2 - that’s the very same problem of part 1 with a different input parameter. This small change will show how the previously developed solution is inefficient and requires us to approach the problem in a different way. Stay tuned!

For any feedback or comment, please use the Disqus form below - thanks!

Don't you want to miss the next article? Do you want to be kept updated?
Subscribe to the newsletter!

Related Posts

Using Gemini in a Go application: limits and details

This article explores using Gemini within Go applications via Vertex AI. We'll delve into the limitations encountered, including the model's context window size and regional restrictions. We'll also explore various methods for feeding data to Gemini, highlighting the challenges faced due to these limitations. Finally, we'll briefly introduce RAG (Retrieval-Augmented Generation) as a potential solution, but leave its implementation details for future exploration.

Custom model training & deployment on Google Cloud using Vertex AI in Go

This article shows a different approach to solving the same problem presented in the article AutoML pipeline for tabular data on VertexAI in Go. This time, instead of relying on AutoML we will define the model and the training job ourselves. This is a more advanced usage that allows the experienced machine learning practitioner to have full control on the pipeline from the model definition to the hardware to use for training and deploying. At the end of the article, we will also see how to use the deployed model. All of this, in Go and with the help of Python and Docker for the custom training job definition.

Integrating third-party libraries as Unreal Engine plugins: solving the ABI compatibility issues on Linux when the source code is available

In this article, we will discuss the challenges and potential issues that may arise during the integration process of a third-party library when the source code is available. It will provide guidance on how to handle the compilation and linking of the third-party library, manage dependencies, and resolve compatibility issues. We'll realize a plugin for redis plus plus as a real use case scenario, and we'll see how tough can it be to correctly compile the library for Unreal Engine - we'll solve every problem step by step.

AutoML pipeline for tabular data on VertexAI in Go

In this article, we delve into the development and deployment of tabular models using VertexAI and AutoML with Go, showcasing the actual Go code and sharing insights gained through trial & error and extensive Google research to overcome documentation limitations.

Advent of Code 2022 in pure TensorFlow - Day 12

Solving problem 12 of the AoC 2022 in pure TensorFlow is a great exercise in graph theory and more specifically in using the Breadth-First Search (BFS) algorithm. This problem requires working with a grid of characters representing a graph, and the BFS algorithm allows us to traverse the graph in the most efficient way to solve the problem.

Advent of Code 2022 in pure TensorFlow - Day 11

In this article, we'll show how to solve problem 11 from the Advent of Code 2022 (AoC 2022) using TensorFlow. We'll first introduce the problem and then provide a detailed explanation of our TensorFlow solution. The problem at hand revolves around the interactions of multiple monkeys inspecting items, making decisions based on their worry levels, and following a set of rules.

Advent of Code 2022 in pure TensorFlow - Day 10

Solving problem 10 of the AoC 2022 in pure TensorFlow is an interesting challenge. This problem involves simulating a clock signal with varying frequencies and tracking the state of a signal-strength variable. TensorFlow's ability to handle complex data manipulations, control structures, and its @tf.function decorator for efficient execution makes it a fitting choice for tackling this problem. By utilizing TensorFlow's features such as Dataset transformations, efficient filtering, and tensor operations, we can create a clean and efficient solution to this intriguing puzzle.

Advent of Code 2022 in pure TensorFlow - Day 9

In this article, we'll show two different solutions to the Advent of Code 2022 day 9 problem. Both of them are purely TensorFlow solutions. The first one, more traditional, just implement a solution algorithm using only TensorFlow's primitive operations - of course, due to some TensorFlow limitations this solution will contain some details worth reading (e.g. using a pairing function for being able to use n-dimensional tf.Tensor as keys for a mutable hashmap). The second one, instead, demonstrates how a different interpretation of the problem paves the way to completely different solutions. In particular, this solution is Keras based and uses a multi-layer convolutional model for modeling the rope movements.

Advent of Code 2022 in pure TensorFlow - Day 8

Solving problem 8 of the AoC 2022 in pure TensorFlow is straightforward. After all, this problem requires working on a bi-dimensional grid and evaluating conditions by rows or columns. TensorFlow is perfectly suited for this kind of task thanks to its native support for reduction operators (tf.reduce) which are the natural choice for solving problems of this type.

Advent of Code 2022 in pure TensorFlow - Day 7

Solving problem 7 of the AoC 2022 in pure TensorFlow allows us to understand certain limitations of the framework. This problem requires a lot of string manipulation, and TensorFlow (especially in graph mode) is not only not easy to use when working with this data type, but also it has a set of limitations I'll present in the article. Additionally, the strings to work with in problem 7 are (Unix) paths. TensorFlow has zero support for working with paths, and thus for simplifying a part of the solution, I resorted to the pathlib Python module, thus not designing a completely pure TensorFlow solution.