Advent of Code 2021 in pure TensorFlow - day 12


Day 12 problem projects us the world of graphs. TensorFlow can be used to work on graphs pretty easily since a graph can be represented as an adjacency matrix, and thus, we can have a tf.Tensor containing our graph. However, the “natural” way of exploring a graph is using recursion, and as we’ll see in this article, this prevents us to solve the problem using a pure TensorFlow program, but we have to work only in eager mode.

Day 12: Passage Pathing

You can click on the title above to read the full text of the puzzle. The TLDR version is:

Our dataset is a list of connections with this format:

start-A
start-b
A-c
A-b
b-d
A-end
b-end

where start and end are your start and end points, the upper case letters (e.g A) are the “big caves”, and the lower case letters (e.g. c) are the “small caves”.

The above cave system looks roughly like this:

    start
    /   \
c--A-----b--d
    \   /
     end

The puzzle asks to find the number of distinct paths that start at start, and end at end, knowing that you can visit the “big caves” multiple times, and the small caves only once (per path).

Given these rules, there are 10 paths through this example cave system (graph)

start,A,b,A,c,A,end
start,A,b,A,end
start,A,b,end
start,A,c,A,b,A,end
start,A,c,A,b,end
start,A,c,A,end
start,A,end
start,b,A,c,A,end
start,b,A,end
start,b,end

Design phase: part one

The problem asks us to:

  1. Model the relationships between the nodes
  2. Finding the paths from start to end that follow the two rules about the big/small caves

Modeling the relationships between nodes is something we can achieve by mapping the neighboring relationship among nodes using an adjacency matrix. In practice, we can assign an unique index to a node (e.g. “start” = 0, “end” = 1, “A” = 2, …), and each index identifies a row/column in the adjacency matrix. Whenever a node is connected to another, we put a “1” in the correct location.

For example, a possible adjacency matrix for the example graph is

\[\begin{pmatrix} 0&1&1&0&0&0\\ 1&0&1&1&0&1\\ 1&1&0&0&1&1\\ 0&1&0&0&0&0\\ 0&0&1&0&0&0\\ 0&1&1&0&0&0\\ \end{pmatrix}\]

As it can be easily seen the adjacency matrix for a graph without self connections is a hollow matrix that means that all the elements along the diagonal are equal to zero. Moreover, the adjacency matrix is also symmetric (rows and columns can be transposed and the matrix doesn’t change - and it makes sense since if A is connected with B also B is connected with A).

Thus, we first need to create the mapping between the “human-readable” nodes and the IDs, then characterize these IDs. In fact, we have some rules to follow. All the upper case nodes (and thus the corresponding numeric IDs) can be visited only once, while the other can be visited multiple times.

Moreover, the start and end nodes are special, since if we reach the end the path is complete and we can exit from our search algorithm, while the start node shouldn’t be visited more than once (at the beginning and then ignored).

Input pipeline

We create a tf.data.Dataset object for reading the text file line-by-line as usual. We split every line looking at the - separator, and then create the relationships between the human-readable nodes and the indexes of the adjacency matrix

connections = tf.data.TextLineDataset("input").map(
    lambda string: tf.strings.split(string, "-")
)

# Create a map between human-readable node names and numeric indices
human_to_id = tf.lookup.experimental.MutableHashTable(tf.string, tf.int64, -1)
id_to_human = tf.lookup.experimental.MutableHashTable(tf.int64, tf.string, "")

human_to_id is the hashtable we use to create the mapping between the human-readable node name and a numeric ID. tf.lookup.experimental.MutableHashTable returns the third parameter (-1) when the lookup fails, and thus we can use this feature to check if we already created the mapping for a node.

Adjacency matrix

Since we want to create an adjacency matrix, we need to keep track of the adjacency relations between nodes (the indices we use for setting the 1 in the correct positions of the adjacency matrix), we use a tf.TensorArray for doing it.

idx = tf.Variable(0, dtype=tf.int64)
indices = tf.TensorArray(tf.int64, size=0, dynamic_size=True)
for edge in connections:
    node_i = human_to_id.lookup(edge[0])
    node_j = human_to_id.lookup(edge[1])

    if tf.equal(node_i, -1):
        human_to_id.insert([edge[0]], [idx])
        id_to_human.insert([idx], [edge[0]])
        node_i = tf.identity(idx)
        idx.assign_add(1)
    if tf.equal(node_j, -1):
        human_to_id.insert([edge[1]], [idx])
        id_to_human.insert([idx], [edge[1]])
        node_j = tf.identity(idx)
        idx.assign_add(1)

    ij = tf.convert_to_tensor([node_i, node_j])
    indices = indices.write(indices.size(), ij)

We loop over all the connections, check if the node has already been mapped and, if not, increase the counter (idx) and then create the mapping. Then, we create the relationship between the i and j nodes and save it inside the tf.TensorArray.

At the end of this loop, the indices variable contains all the pairs of coordinates where a connection is, while idx contains the ID of the latest mapped item, which also is the width (height) of the matrix.

We should also note that we created the relationship from i to j, hence we created an upper triangular matrix (a matrix whose elements below the main diagonals are zero). For modeling the full relationship i to j and j to i we can just transpose the matrix and sum it with itself.

indices = indices.stack()
indices = tf.reshape(indices, (-1, 2))
A = tf.tensor_scatter_nd_update(
    tf.zeros((idx, idx), dtype=tf.int64),
    indices,
    tf.repeat(tf.cast(1, tf.int64), tf.shape(indices)[0]),
)
A = A + tf.transpose(A)

Here we go! A is the adjacency matrix that completely models the graph.

The only thing missing is the modeling of the constraints. We need a list-like object (a tf.Tensor) containing all the IDs of the nodes that we can visit once, and the difference between this set of IDs and the full list of IDs for getting the list of the IDs we can visit multiple times.

keys = human_to_id.export()[0]
visit_only_once_human = tf.gather(
    keys,
    tf.where(
        tf.equal(
            tf.range(tf.shape(keys)[0]),
            tf.cast(tf.strings.regex_full_match(keys, "[a-z]+?"), tf.int32)
            * tf.range(tf.shape(keys)[0]),
        )
    ),
)
visit_only_once_human = tf.squeeze(visit_only_once_human)
visit_only_once_id = human_to_id.lookup(visit_only_once_human)

# Visit multiple times = {keys} - {only once}
visit_multiple_times_human = tf.sparse.to_dense(
    tf.sets.difference(
        tf.reshape(keys, (1, -1)), tf.reshape(visit_only_once_human, (1, -1))
    )
)
visit_multiple_times_human = tf.squeeze(visit_multiple_times_human)
visit_multiple_times_id = human_to_id.lookup(visit_multiple_times_human)

It’s interesting how we used regular expressions in TensorFlow with the tf.strings.regex_full_match function for searching for nodes all in lower case ([a-z]+? - visit only once).

Moreover, we used the set difference together with the conversion from sparse to dense to get a tf.Tensor containing all the upper-case nodes (visit multiple times).

Visiting the graph

The natural way of facing this problem is by using recursion. In fact, the problem can be modeled as follows

Concatenate the input node to the current_path.

Is the current node the end node? If yes, we found the path. Increment by 1 the counter of the paths. Return.

Otherwise, find the neighborhood of the current node.

For every neighbor verify if the neighbor is in the “visit only once” list and it’s in the current path. If yes, skip this neighbor.

Otherwise, visit the neighbor (recursive step).

Being the adjacency matrix symmetric, the step of “find the neighborhood of the current node” can be simply modeled as a search for the 1 in the row whose id is node_id.

@tf.function
def _neigh_ids(A, node_id):
    return tf.squeeze(tf.where(tf.equal(A[node_id, :], 1)))

Since the proposed solution uses the recursion we cannot decorate the visit function with @tf.function because tf.function does not support recursion. Thus, we are forced to solve this problem in eager mode.

Going back to the problem, the puzzle asks us to count the number of paths hence we need a tf.Variable for counting the number of paths found. Moreover, instead of repeating on every iteration a lookup in the hastable, we can extract the IDs of the start and end nodes.

start_id, end_id = human_to_id.lookup(["start", "end"])
count = tf.Variable(0, dtype=tf.int64)

The visit function can be defined by implementing precisely what has been proposed in the algorithm described above.

def _visit(A: tf.Tensor, node_id: tf.Tensor, path: tf.Tensor):
    current_path = tf.concat([path, [node_id]], axis=0)
    if tf.equal(node_id, end_id):
        count.assign_add(1)
        return current_path

    neighs = _neigh_ids(A, node_id)
    neigh_shape = tf.shape(neighs)
    if tf.equal(tf.size(neighs), 0):
        return current_path

    if tf.equal(tf.size(neigh_shape), 0):
        neighs = tf.expand_dims(neighs, 0)
        neigh_shape = tf.shape(neighs)

    for idx in tf.range(neigh_shape[0]):
        neigh_id = neighs[idx]
        if tf.logical_and(
            tf.reduce_any(tf.equal(neigh_id, visit_only_once_id)),
            tf.reduce_any(tf.equal(neigh_id, current_path)),
        ):
            continue
        # Recursion step
        _visit(A, neigh_id, current_path)
    return current_path

Execution

To call the _visit function we need:

  • A the adjacency matrix
  • node_id a node for starting the visit
  • path the initial path (a tf.Tensor containing only the ID of the start node).
# All the paths starts from start
neighs = _neigh_ids(A, start_id)
for idx in tf.range(tf.shape(neighs)[0]):
    neigh_id = neighs[idx]
    _visit(A, neigh_id, [start_id])

tf.print("Part one: ", count)

Here we go! Part one is solved! Let’s see what part two is about.

Design phase: part 2

The puzzle relaxes a constraint: in a path, we can pass twice for a single small cave. For example start,A,b,A,b,A,c,A,end is a valid path because we visited b twice and c once.

Given this relaxed constraint, how many paths through the graph are there?

Part two implementation

It is possible to simply add the new constraint to the previous algorithm. We can define an inner_count tf.Variable for counting the number of times small cave in the current path has been visited. When this counter is greater or equal to 2 then the path we are following is invalid and we can return it.

count.assign(0)
inner_count = tf.Variable(0)

def _visit2(A: tf.Tensor, node_id: tf.Tensor, path: tf.Tensor):
    current_path = tf.concat([path, [node_id]], axis=0)

    # Skip start
    if tf.equal(node_id, start_id):
        return current_path

    # Success on end node
    if tf.equal(node_id, end_id):
        # paths.append(current_path)
        count.assign_add(1)
        return current_path

    # More than 2 lowercase visited twice
    visited, visited_idx, visited_count = tf.unique_with_counts(current_path)
    visited = tf.gather_nd(visited, tf.where(tf.greater(visited_count, 1)))
    inner_count.assign(0)
    for idx in tf.range(tf.shape(visited)[0]):
        if tf.reduce_any(tf.equal(visited[idx], visit_only_once_id)):
            inner_count.assign_add(1)

        if tf.greater_equal(inner_count, 2):
            return current_path

    neighs = _neigh_ids(A, node_id)
    neigh_shape = tf.shape(neighs)
    if tf.equal(tf.size(neighs), 0):
        return current_path

    if tf.equal(tf.size(neigh_shape), 0):
        neighs = tf.expand_dims(neighs, 0)
        neigh_shape = tf.shape(neighs)

    for idx in tf.range(neigh_shape[0]):
        neigh_id = neighs[idx]

        # already visited twice and is lowcase
        if tf.logical_and(
            tf.reduce_any(tf.equal(neigh_id, visit_only_once_id)),
            tf.greater(
                tf.reduce_sum(tf.cast(tf.equal(neigh_id, current_path), tf.int32)),
                1,
            ),
        ):
            continue

        _visit2(A, neigh_id, current_path)

    return current_path

The execution is performed in the very same way:

neighs = _neigh_ids(A, start_id)
for idx in tf.range(tf.shape(neighs)[0]):
    neigh_id = neighs[idx]
    _visit2(A, neigh_id, [start_id])

tf.print("Part two: ", count)

It takes a lot of time since we are effectively doing the graph visit, even if the puzzle asked us to just count the number of paths (without modeling them!). Hence for sure, there exists a better solution, very similar to the solution I found to the Day 6 puzzle part two, but for this time the slow solution will be good enough. In fact, after about 20 minutes of computations the output produced is corrected :)

Conclusion

You can see the complete solution in folder 12 on the dedicated Github repository: https://github.com/galeone/tf-aoc.

Solving this problem in TensorFlow demonstrated how the adjacency matrix representation is really handy while working with graphs, but also demonstrated that is impossible to write recursive pure-TensorFlow programs (with `tf.function).

If you missed the articles about the previous days’ solutions, here’s a handy list:

I already solved also part 1 of day 13, but I don’t know if I have the time for solving the puzzles and writing the articles (holidays have reached the end some week ago :D).

So maybe, for the 2021 Advent of Code in pure TensorFlow is all.

For any feedback or comment, please use the Disqus form below - thanks!

Don't you want to miss the next article? Do you want to be kept updated?
Subscribe to the newsletter!

Related Posts

Fixing the code signing and notarization issues of Unreal Engine (5.3+) projects

Starting from Unreal Engine 5.3, Epic Games added support for the so-called modern Xcode workflow. This workflow allows the Unreal Build Tool (UBT) to be more consistent with the standard Xcode app projects, and to be compliant with the Apple requirements for distributing applications... In theory! 😅 In practice this workflow is flawed: both the code signing and the framework supports are not correctly implemented, making the creation of working apps and their distribution impossible. In this article, we'll go through the problems faced during the packaging, code signing, and notarization of an Unreal Engine application on macOS and end up with the step-by-step process to solve them all.

The (Hidden?) Costs of Vertex AI Resource Pools: A Cautionary Tale

In the article "Custom model training & deployment on Google Cloud using Vertex AI in Go" we explored how to leverage Go to create a resource pool and train a machine learning model using Vertex AI's allocated resources. While this approach offers flexibility, there's a crucial aspect to consider: the cost implications of resource pools. This article details my experience with a sudden price increase in Vertex AI and the hidden culprit – a seemingly innocuous resource pool.

Building a RAG for tabular data in Go with PostgreSQL & Gemini

In this article we explore how to combine a large language model (LLM) with a relational database to allow users to ask questions about their data in a natural way. It demonstrates a Retrieval-Augmented Generation (RAG) system built with Go that utilizes PostgreSQL and pgvector for data storage and retrieval. The provided code showcases the core functionalities. This is an overview of how the "chat with your data" feature of fitsleepinsights.app is being developed.

Using Gemini in a Go application: limits and details

This article explores using Gemini within Go applications via Vertex AI. We'll delve into the limitations encountered, including the model's context window size and regional restrictions. We'll also explore various methods for feeding data to Gemini, highlighting the challenges faced due to these limitations. Finally, we'll briefly introduce RAG (Retrieval-Augmented Generation) as a potential solution, but leave its implementation details for future exploration.

Custom model training & deployment on Google Cloud using Vertex AI in Go

This article shows a different approach to solving the same problem presented in the article AutoML pipeline for tabular data on VertexAI in Go. This time, instead of relying on AutoML we will define the model and the training job ourselves. This is a more advanced usage that allows the experienced machine learning practitioner to have full control on the pipeline from the model definition to the hardware to use for training and deploying. At the end of the article, we will also see how to use the deployed model. All of this, in Go and with the help of Python and Docker for the custom training job definition.

Integrating third-party libraries as Unreal Engine plugins: solving the ABI compatibility issues on Linux when the source code is available

In this article, we will discuss the challenges and potential issues that may arise during the integration process of a third-party library when the source code is available. It will provide guidance on how to handle the compilation and linking of the third-party library, manage dependencies, and resolve compatibility issues. We'll realize a plugin for redis plus plus as a real use case scenario, and we'll see how tough can it be to correctly compile the library for Unreal Engine - we'll solve every problem step by step.

AutoML pipeline for tabular data on VertexAI in Go

In this article, we delve into the development and deployment of tabular models using VertexAI and AutoML with Go, showcasing the actual Go code and sharing insights gained through trial & error and extensive Google research to overcome documentation limitations.

Advent of Code 2022 in pure TensorFlow - Day 12

Solving problem 12 of the AoC 2022 in pure TensorFlow is a great exercise in graph theory and more specifically in using the Breadth-First Search (BFS) algorithm. This problem requires working with a grid of characters representing a graph, and the BFS algorithm allows us to traverse the graph in the most efficient way to solve the problem.

Advent of Code 2022 in pure TensorFlow - Day 11

In this article, we'll show how to solve problem 11 from the Advent of Code 2022 (AoC 2022) using TensorFlow. We'll first introduce the problem and then provide a detailed explanation of our TensorFlow solution. The problem at hand revolves around the interactions of multiple monkeys inspecting items, making decisions based on their worry levels, and following a set of rules.

Advent of Code 2022 in pure TensorFlow - Day 10

Solving problem 10 of the AoC 2022 in pure TensorFlow is an interesting challenge. This problem involves simulating a clock signal with varying frequencies and tracking the state of a signal-strength variable. TensorFlow's ability to handle complex data manipulations, control structures, and its @tf.function decorator for efficient execution makes it a fitting choice for tackling this problem. By utilizing TensorFlow's features such as Dataset transformations, efficient filtering, and tensor operations, we can create a clean and efficient solution to this intriguing puzzle.