Advent of Code 2021 in pure TensorFlow - day 5


The day 5 challenge is easily solvable in pure TensorFlow thanks to its support for various distance functions and the power of the tf.math package. The problem only requires some basic math knowledge to be completely solved - and a little bit of computer vision experience doesn’t hurt.

Day 5: Hydrothermal Venture

You can click on the title above to read the full text of the puzzle. The TLDR version is:

The puzzle input contains segments coordinates, in the format x1,y1 -> x2,y2, like

0,9 -> 5,9
8,0 -> 0,8
9,4 -> 3,4
2,2 -> 2,1
7,0 -> 7,4

A segment is not only start and end points, but it is made of all the points that connect start and end.

  • An entry like 1,1 -> 1,3 covers points 1,1, 1,2, and 1,3.
  • An entry like 9,7 -> 7,7 covers points 9,7, 8,7, and 7,7.

The puzzle asks us to “draw” all the lines and count the number of points where at least two lines overlap.

Part 1 asks us to focus only on the horizontal and vertical lines (x1 = x2 or y1 = y2), while (spoiler) part 2 asks to consider also diagonal segments at exactly 45 degrees.

We’ll design a solution for both parts, that will compute the correct output depending on a boolean flag.

Design phase

The problem is pretty trivial if one is used to working with pixels. In fact, the problem can be seen as the implementation of a segment drawing function over 2D gray-scale image.

  • The 2D image resolution is given by the minimum bounding box containing all the points.
  • The segment orientation can be easily found by using the atan2 on the direction vector.
  • The pixel coordinates to draw are the result of the linear interpolation between the start and end points.
  • The pixel coordinates must be integers since it makes no sense to draw in positions like (0.8, 1.5). This constraint is automatically solved by the problem constraints: in 2D grid segments at 45 degrees, horizontal, and vertical all of them have integer coordinates.
  • The number of pixels produced by the interpolation should be precisely the Chebyshev distance (also called \(L_\infty\) distance or chessboard distance) between the start and end points.

The last point is the most important one since it constrains the number of pixel coordinates generated by the linear interpolation and makes the coordinates always integers.

Input pipeline

We create a tf.data.Dataset object for reading the text file line-by-line as usual. All we need to do is to choose a handy representation for the elements produced at every iteration. I choose the traditional representation of a line using pairs of coordinates.

def _get_segment(
    line: tf.Tensor,
) -> Tuple[tf.Tensor, tf.Tensor]:
    points = tf.strings.split(line, " -> ")
    p1 = tf.strings.split(points[0], ",")
    p2 = tf.strings.split(points[1], ",")

    x1 = tf.strings.to_number(p1[0], tf.int64)
    y1 = tf.strings.to_number(p1[1], tf.int64)
    x2 = tf.strings.to_number(p2[0], tf.int64)
    y2 = tf.strings.to_number(p2[1], tf.int64)
    return tf.convert_to_tensor((x1, y1)), tf.convert_to_tensor((x2, y2))

dataset = tf.data.TextLineDataset("input").map(_get_segment)

The _get_segment function (that it’s executed in graph mode even if not decorated explicitly with @tf.function) strips the -> and creates the pair of points: ((x1, y1), (x2, y2)).

Creating the Grid

A 2D image is a grid of pixels. The resolution of the image (the number of columns and rows of the grid) is given by the minimum bounding box containing all the points.

Since we need to draw on this grid, this object should be a tf.Variable. We can thus define the __init__ of our TensorFlow program in this way.

class Grid(tf.Module):
    def __init__(self, dataset):
        super().__init__()
        bbox_w = tf.reduce_max(list(dataset.map(lambda p1, p2: (p1[0], p2[0])))) + 1
        bbox_h = tf.reduce_max(list(dataset.map(lambda p1, p2: (p1[1], p2[1])))) + 1
        self._grid = tf.Variable(
            tf.zeros((bbox_w, bbox_h), dtype=tf.int64), trainable=False
        )
        self._dataset = dataset

In short, we just searched the maximum coordinates along the x and y axes and added 1 because the given coordinates include zero as a valid location.

Before doing the actual drawing, we can implement an interpolation function that uses the Chebyshev norm.

Chessboard interpolation

As presented in the design section, the 2D grid constraints the pixel coordinates to be integers. Using the Chebyshev distance is the correct way of finding the number of pixels to generate from the interpolation process.

TensorFlow allows us to interpolate along the x and y axes in parallel with a single function call.

@staticmethod
@tf.function
def interpolate(p1: tf.Tensor, p2: tf.Tensor):
    """Linear interpolation from p1 to p2 in the discrete 2D grid.
    Args:
        p1: Tensor with values (x, y)
        p2: Tensor with values (x, y)
    Returns:
        The linear interpolation in the discrete 2D grid.
    """
    # +1 handles the case of p1 - p2 == 1
    norm = tf.norm(tf.cast(p1 - p2, tf.float32), ord=tf.experimental.numpy.inf) + 1
    return tf.cast(
        tf.math.ceil(tf.linspace(p1, p2, tf.cast(norm, tf.int64))), tf.int64
    )

Since the challenge in the challenge is to only use TensorFlow, I used tf.experimental.numpy.inf instead of numpy.inf for specifying the inf value required by the tf.norm function to use the Chessboard distance ( \(L_\infty\)).

There’s now only to implement the drawing logic.

Drawing lines on the grid

Having decided to solve with the same code both parts of the puzzle, I can just define the call method accepting a boolean flag that will change the behavior of the method.

Being stateful, we need to remember to reset to 0 the grid state when the method is called, before starting looping over the self._dataset.

@tf.function
def __call__(self, part_one: tf.Tensor) -> tf.Tensor:
    """Given the required puzzle part, changes the line drawing on the grid
    and the intersection couunt.
    Args:
        part_one: boolean tensor. When true, only consider straight lines and
                  a threshold of 1. When false, consider straight lines and diagonal
                  lines.
    Returns
        the number of intersections
    """
    self._grid.assign(tf.zeros_like(self._grid))

Now, we can change the loop behavior depending on the part_one value. We can define the logic as follows

  • Use tf.math.atan2 over the direction vector to get the angle in radians. Convert it to degrees and make it always positive by summing 360 if it’s negative.
  • If part_one is True: consider only horizontal and vertical lines.
  • If part_one is False: check if the angle is exactly at 45 degrees or the lines are horizontal and vertical.

In both cases, if the condition holds, we can interpolate between the start and end points to get the pixels and draw them on the image using the assign method of tf.Variable and the tf.tensor_scatter_nd_add function for incrementing by 1 only the values in the pixels coordinate.

    for start, end in self._dataset:
        # Discrete interpolation between start and end
        # part 1 requires to consider only straight lines
        # (x1 = x2 or y1 = y2)
        # but I guess (hope) doing the generic discrete interpolation
        # will simplify part 2 (no idea, just a guess)
        float_start = tf.cast(start, tf.float32)
        float_end = tf.cast(end, tf.float32)
        direction = float_start - float_end
        angle = (
            tf.math.atan2(direction[1], direction[0])
            * 180
            / tf.experimental.numpy.pi
        )
        if tf.less(angle, 0):
            angle = 360 + angle
        if tf.logical_or(
            tf.logical_and(
                tf.logical_not(part_one),
                tf.logical_and(
                    tf.logical_not(tf.equal(tf.math.mod(angle, 90), 0)),
                    tf.equal(tf.math.mod(angle, 45), 0),
                ),
            ),
            tf.logical_or(
                tf.equal(start[0], end[0]),
                tf.equal(start[1], end[1]),
            ),
        ):
            pixels = self.interpolate(start, end)
            self._grid.assign(
                tf.tensor_scatter_nd_add(
                    self._grid, pixels, tf.ones(tf.shape(pixels)[0], dtype=tf.int64)
                )
            )

At the end of the loop, we can just check how many values are greater than 1 and count them, to get the expected result.

    # tf.print(tf.transpose(grid, perm=(1, 0)), summarize=-1)
    threshold = tf.constant(1, tf.int64)
    mask = tf.greater(self._grid, threshold)
    return tf.reduce_sum(tf.cast(mask, tf.int64))

Execution

Here we go!

grid = Grid(dataset)

tf.print("# overlaps (part one): ", grid(tf.constant(True)))
tf.print("# overlaps (part two): ", grid(tf.constant(False)))

Day 5 puzzle solved in both parts :)

Conclusion

You can see the complete solution in folder 5 on the dedicated Github repository: https://github.com/galeone/tf-aoc.

Solving this problem having experience in the computer vision domain has been easy since there are lots of concepts from that domain that can be applied in this problem resolution.

The challenge in the challenge of using only TensorFlow for solving the problem is slowly progressing, so far I solved all the puzzles up to Day 8 (inclusive). So get ready for at least 3 more articles :) Let’s see when (and if!) TensorFlow alone won’t be enough.

If you missed the articles about the previous days’ solutions, here’s a handy list:

The next article will be about my solution of Day 6 problem. I’ll present 2 different solutions, one computationally intensive but easy to understand, and another developed for solving the part 2 - that’s the very same problem of part 1 with a different input parameter. This small change will show how the previously developed solution is inefficient and requires us to approach the problem in a different way. Stay tuned!

For any feedback or comment, please use the Disqus form below - thanks!

Don't you want to miss the next article? Do you want to be kept updated?
Subscribe to the newsletter!

Related Posts

Fixing the code signing and notarization issues of Unreal Engine (5.3+) projects

Starting from Unreal Engine 5.3, Epic Games added support for the so-called modern Xcode workflow. This workflow allows the Unreal Build Tool (UBT) to be more consistent with the standard Xcode app projects, and to be compliant with the Apple requirements for distributing applications... In theory! 😅 In practice this workflow is flawed: both the code signing and the framework supports are not correctly implemented, making the creation of working apps and their distribution impossible. In this article, we'll go through the problems faced during the packaging, code signing, and notarization of an Unreal Engine application on macOS and end up with the step-by-step process to solve them all.

The (Hidden?) Costs of Vertex AI Resource Pools: A Cautionary Tale

In the article "Custom model training & deployment on Google Cloud using Vertex AI in Go" we explored how to leverage Go to create a resource pool and train a machine learning model using Vertex AI's allocated resources. While this approach offers flexibility, there's a crucial aspect to consider: the cost implications of resource pools. This article details my experience with a sudden price increase in Vertex AI and the hidden culprit – a seemingly innocuous resource pool.

Building a RAG for tabular data in Go with PostgreSQL & Gemini

In this article we explore how to combine a large language model (LLM) with a relational database to allow users to ask questions about their data in a natural way. It demonstrates a Retrieval-Augmented Generation (RAG) system built with Go that utilizes PostgreSQL and pgvector for data storage and retrieval. The provided code showcases the core functionalities. This is an overview of how the "chat with your data" feature of fitsleepinsights.app is being developed.

Using Gemini in a Go application: limits and details

This article explores using Gemini within Go applications via Vertex AI. We'll delve into the limitations encountered, including the model's context window size and regional restrictions. We'll also explore various methods for feeding data to Gemini, highlighting the challenges faced due to these limitations. Finally, we'll briefly introduce RAG (Retrieval-Augmented Generation) as a potential solution, but leave its implementation details for future exploration.

Custom model training & deployment on Google Cloud using Vertex AI in Go

This article shows a different approach to solving the same problem presented in the article AutoML pipeline for tabular data on VertexAI in Go. This time, instead of relying on AutoML we will define the model and the training job ourselves. This is a more advanced usage that allows the experienced machine learning practitioner to have full control on the pipeline from the model definition to the hardware to use for training and deploying. At the end of the article, we will also see how to use the deployed model. All of this, in Go and with the help of Python and Docker for the custom training job definition.

Integrating third-party libraries as Unreal Engine plugins: solving the ABI compatibility issues on Linux when the source code is available

In this article, we will discuss the challenges and potential issues that may arise during the integration process of a third-party library when the source code is available. It will provide guidance on how to handle the compilation and linking of the third-party library, manage dependencies, and resolve compatibility issues. We'll realize a plugin for redis plus plus as a real use case scenario, and we'll see how tough can it be to correctly compile the library for Unreal Engine - we'll solve every problem step by step.

AutoML pipeline for tabular data on VertexAI in Go

In this article, we delve into the development and deployment of tabular models using VertexAI and AutoML with Go, showcasing the actual Go code and sharing insights gained through trial & error and extensive Google research to overcome documentation limitations.

Advent of Code 2022 in pure TensorFlow - Day 12

Solving problem 12 of the AoC 2022 in pure TensorFlow is a great exercise in graph theory and more specifically in using the Breadth-First Search (BFS) algorithm. This problem requires working with a grid of characters representing a graph, and the BFS algorithm allows us to traverse the graph in the most efficient way to solve the problem.

Advent of Code 2022 in pure TensorFlow - Day 11

In this article, we'll show how to solve problem 11 from the Advent of Code 2022 (AoC 2022) using TensorFlow. We'll first introduce the problem and then provide a detailed explanation of our TensorFlow solution. The problem at hand revolves around the interactions of multiple monkeys inspecting items, making decisions based on their worry levels, and following a set of rules.

Advent of Code 2022 in pure TensorFlow - Day 10

Solving problem 10 of the AoC 2022 in pure TensorFlow is an interesting challenge. This problem involves simulating a clock signal with varying frequencies and tracking the state of a signal-strength variable. TensorFlow's ability to handle complex data manipulations, control structures, and its @tf.function decorator for efficient execution makes it a fitting choice for tackling this problem. By utilizing TensorFlow's features such as Dataset transformations, efficient filtering, and tensor operations, we can create a clean and efficient solution to this intriguing puzzle.