Advent of Code 2021 in pure TensorFlow - day 12

Day 12 problem projects us the world of graphs. TensorFlow can be used to work on graphs pretty easily since a graph can be represented as an adjacency matrix, and thus, we can have a tf.Tensor containing our graph. However, the “natural” way of exploring a graph is using recursion, and as we’ll see in this article, this prevents us to solve the problem using a pure TensorFlow program, but we have to work only in eager mode.

Day 12: Passage Pathing

You can click on the title above to read the full text of the puzzle. The TLDR version is:

Our dataset is a list of connections with this format:


where start and end are your start and end points, the upper case letters (e.g A) are the “big caves”, and the lower case letters (e.g. c) are the “small caves”.

The above cave system looks roughly like this:

    /   \
    \   /

The puzzle asks to find the number of distinct paths that start at start, and end at end, knowing that you can visit the “big caves” multiple times, and the small caves only once (per path).

Given these rules, there are 10 paths through this example cave system (graph)


Design phase: part one

The problem asks us to:

  1. Model the relationships between the nodes
  2. Finding the paths from start to end that follow the two rules about the big/small caves

Modeling the relationships between nodes is something we can achieve by mapping the neighboring relationship among nodes using an adjacency matrix. In practice, we can assign an unique index to a node (e.g. “start” = 0, “end” = 1, “A” = 2, …), and each index identifies a row/column in the adjacency matrix. Whenever a node is connected to another, we put a “1” in the correct location.

For example, a possible adjacency matrix for the example graph is

\[\begin{pmatrix} 0&1&1&0&0&0\\ 1&0&1&1&0&1\\ 1&1&0&0&1&1\\ 0&1&0&0&0&0\\ 0&0&1&0&0&0\\ 0&1&1&0&0&0\\ \end{pmatrix}\]

As it can be easily seen the adjacency matrix for a graph without self connections is a hollow matrix that means that all the elements along the diagonal are equal to zero. Moreover, the adjacency matrix is also symmetric (rows and columns can be transposed and the matrix doesn’t change - and it makes sense since if A is connected with B also B is connected with A).

Thus, we first need to create the mapping between the “human-readable” nodes and the IDs, then characterize these IDs. In fact, we have some rules to follow. All the upper case nodes (and thus the corresponding numeric IDs) can be visited only once, while the other can be visited multiple times.

Moreover, the start and end nodes are special, since if we reach the end the path is complete and we can exit from our search algorithm, while the start node shouldn’t be visited more than once (at the beginning and then ignored).

Input pipeline

We create a object for reading the text file line-by-line as usual. We split every line looking at the - separator, and then create the relationships between the human-readable nodes and the indexes of the adjacency matrix

connections ="input").map(
    lambda string: tf.strings.split(string, "-")

# Create a map between human-readable node names and numeric indices
human_to_id = tf.lookup.experimental.MutableHashTable(tf.string, tf.int64, -1)
id_to_human = tf.lookup.experimental.MutableHashTable(tf.int64, tf.string, "")

human_to_id is the hashtable we use to create the mapping between the human-readable node name and a numeric ID. tf.lookup.experimental.MutableHashTable returns the third parameter (-1) when the lookup fails, and thus we can use this feature to check if we already created the mapping for a node.

Adjacency matrix

Since we want to create an adjacency matrix, we need to keep track of the adjacency relations between nodes (the indices we use for setting the 1 in the correct positions of the adjacency matrix), we use a tf.TensorArray for doing it.

idx = tf.Variable(0, dtype=tf.int64)
indices = tf.TensorArray(tf.int64, size=0, dynamic_size=True)
for edge in connections:
    node_i = human_to_id.lookup(edge[0])
    node_j = human_to_id.lookup(edge[1])

    if tf.equal(node_i, -1):
        human_to_id.insert([edge[0]], [idx])
        id_to_human.insert([idx], [edge[0]])
        node_i = tf.identity(idx)
    if tf.equal(node_j, -1):
        human_to_id.insert([edge[1]], [idx])
        id_to_human.insert([idx], [edge[1]])
        node_j = tf.identity(idx)

    ij = tf.convert_to_tensor([node_i, node_j])
    indices = indices.write(indices.size(), ij)

We loop over all the connections, check if the node has already been mapped and, if not, increase the counter (idx) and then create the mapping. Then, we create the relationship between the i and j nodes and save it inside the tf.TensorArray.

At the end of this loop, the indices variable contains all the pairs of coordinates where a connection is, while idx contains the ID of the latest mapped item, which also is the width (height) of the matrix.

We should also note that we created the relationship from i to j, hence we created an upper triangular matrix (a matrix whose elements below the main diagonals are zero). For modeling the full relationship i to j and j to i we can just transpose the matrix and sum it with itself.

indices = indices.stack()
indices = tf.reshape(indices, (-1, 2))
A = tf.tensor_scatter_nd_update(
    tf.zeros((idx, idx), dtype=tf.int64),
    tf.repeat(tf.cast(1, tf.int64), tf.shape(indices)[0]),
A = A + tf.transpose(A)

Here we go! A is the adjacency matrix that completely models the graph.

The only thing missing is the modeling of the constraints. We need a list-like object (a tf.Tensor) containing all the IDs of the nodes that we can visit once, and the difference between this set of IDs and the full list of IDs for getting the list of the IDs we can visit multiple times.

keys = human_to_id.export()[0]
visit_only_once_human = tf.gather(
            tf.cast(tf.strings.regex_full_match(keys, "[a-z]+?"), tf.int32)
            * tf.range(tf.shape(keys)[0]),
visit_only_once_human = tf.squeeze(visit_only_once_human)
visit_only_once_id = human_to_id.lookup(visit_only_once_human)

# Visit multiple times = {keys} - {only once}
visit_multiple_times_human = tf.sparse.to_dense(
        tf.reshape(keys, (1, -1)), tf.reshape(visit_only_once_human, (1, -1))
visit_multiple_times_human = tf.squeeze(visit_multiple_times_human)
visit_multiple_times_id = human_to_id.lookup(visit_multiple_times_human)

It’s interesting how we used regular expressions in TensorFlow with the tf.strings.regex_full_match function for searching for nodes all in lower case ([a-z]+? - visit only once).

Moreover, we used the set difference together with the conversion from sparse to dense to get a tf.Tensor containing all the upper-case nodes (visit multiple times).

Visiting the graph

The natural way of facing this problem is by using recursion. In fact, the problem can be modeled as follows

Concatenate the input node to the current_path.

Is the current node the end node? If yes, we found the path. Increment by 1 the counter of the paths. Return.

Otherwise, find the neighborhood of the current node.

For every neighbor verify if the neighbor is in the “visit only once” list and it’s in the current path. If yes, skip this neighbor.

Otherwise, visit the neighbor (recursive step).

Being the adjacency matrix symmetric, the step of “find the neighborhood of the current node” can be simply modeled as a search for the 1 in the row whose id is node_id.

def _neigh_ids(A, node_id):
    return tf.squeeze(tf.where(tf.equal(A[node_id, :], 1)))

Since the proposed solution uses the recursion we cannot decorate the visit function with @tf.function because tf.function does not support recursion. Thus, we are forced to solve this problem in eager mode.

Going back to the problem, the puzzle asks us to count the number of paths hence we need a tf.Variable for counting the number of paths found. Moreover, instead of repeating on every iteration a lookup in the hastable, we can extract the IDs of the start and end nodes.

start_id, end_id = human_to_id.lookup(["start", "end"])
count = tf.Variable(0, dtype=tf.int64)

The visit function can be defined by implementing precisely what has been proposed in the algorithm described above.

def _visit(A: tf.Tensor, node_id: tf.Tensor, path: tf.Tensor):
    current_path = tf.concat([path, [node_id]], axis=0)
    if tf.equal(node_id, end_id):
        return current_path

    neighs = _neigh_ids(A, node_id)
    neigh_shape = tf.shape(neighs)
    if tf.equal(tf.size(neighs), 0):
        return current_path

    if tf.equal(tf.size(neigh_shape), 0):
        neighs = tf.expand_dims(neighs, 0)
        neigh_shape = tf.shape(neighs)

    for idx in tf.range(neigh_shape[0]):
        neigh_id = neighs[idx]
        if tf.logical_and(
            tf.reduce_any(tf.equal(neigh_id, visit_only_once_id)),
            tf.reduce_any(tf.equal(neigh_id, current_path)),
        # Recursion step
        _visit(A, neigh_id, current_path)
    return current_path


To call the _visit function we need:

  • A the adjacency matrix
  • node_id a node for starting the visit
  • path the initial path (a tf.Tensor containing only the ID of the start node).
# All the paths starts from start
neighs = _neigh_ids(A, start_id)
for idx in tf.range(tf.shape(neighs)[0]):
    neigh_id = neighs[idx]
    _visit(A, neigh_id, [start_id])

tf.print("Part one: ", count)

Here we go! Part one is solved! Let’s see what part two is about.

Design phase: part 2

The puzzle relaxes a constraint: in a path, we can pass twice for a single small cave. For example start,A,b,A,b,A,c,A,end is a valid path because we visited b twice and c once.

Given this relaxed constraint, how many paths through the graph are there?

Part two implementation

It is possible to simply add the new constraint to the previous algorithm. We can define an inner_count tf.Variable for counting the number of times small cave in the current path has been visited. When this counter is greater or equal to 2 then the path we are following is invalid and we can return it.

inner_count = tf.Variable(0)

def _visit2(A: tf.Tensor, node_id: tf.Tensor, path: tf.Tensor):
    current_path = tf.concat([path, [node_id]], axis=0)

    # Skip start
    if tf.equal(node_id, start_id):
        return current_path

    # Success on end node
    if tf.equal(node_id, end_id):
        # paths.append(current_path)
        return current_path

    # More than 2 lowercase visited twice
    visited, visited_idx, visited_count = tf.unique_with_counts(current_path)
    visited = tf.gather_nd(visited, tf.where(tf.greater(visited_count, 1)))
    for idx in tf.range(tf.shape(visited)[0]):
        if tf.reduce_any(tf.equal(visited[idx], visit_only_once_id)):

        if tf.greater_equal(inner_count, 2):
            return current_path

    neighs = _neigh_ids(A, node_id)
    neigh_shape = tf.shape(neighs)
    if tf.equal(tf.size(neighs), 0):
        return current_path

    if tf.equal(tf.size(neigh_shape), 0):
        neighs = tf.expand_dims(neighs, 0)
        neigh_shape = tf.shape(neighs)

    for idx in tf.range(neigh_shape[0]):
        neigh_id = neighs[idx]

        # already visited twice and is lowcase
        if tf.logical_and(
            tf.reduce_any(tf.equal(neigh_id, visit_only_once_id)),
                tf.reduce_sum(tf.cast(tf.equal(neigh_id, current_path), tf.int32)),

        _visit2(A, neigh_id, current_path)

    return current_path

The execution is performed in the very same way:

neighs = _neigh_ids(A, start_id)
for idx in tf.range(tf.shape(neighs)[0]):
    neigh_id = neighs[idx]
    _visit2(A, neigh_id, [start_id])

tf.print("Part two: ", count)

It takes a lot of time since we are effectively doing the graph visit, even if the puzzle asked us to just count the number of paths (without modeling them!). Hence for sure, there exists a better solution, very similar to the solution I found to the Day 6 puzzle part two, but for this time the slow solution will be good enough. In fact, after about 20 minutes of computations the output produced is corrected :)


You can see the complete solution in folder 12 on the dedicated Github repository:

Solving this problem in TensorFlow demonstrated how the adjacency matrix representation is really handy while working with graphs, but also demonstrated that is impossible to write recursive pure-TensorFlow programs (with `tf.function).

If you missed the articles about the previous days’ solutions, here’s a handy list:

I already solved also part 1 of day 13, but I don’t know if I have the time for solving the puzzles and writing the articles (holidays have reached the end some week ago :D).

So maybe, for the 2021 Advent of Code in pure TensorFlow is all.

For any feedback or comment, please use the Disqus form below - thanks!

Don't you want to miss the next article? Do you want to be kept updated?
Subscribe to the newsletter!

Related Posts

Advent of Code 2021 in pure TensorFlow - day 11

The Day 11 problem has lots in common with Day 9. In fact, will re-use some computer vision concepts like the pixel neighborhood, and we'll be able to solve both parts in pure TensorFlow by using only a tf.queue as a support data structure.

Advent of Code 2021 in pure TensorFlow - day 10

The day 10 challenge projects us in the world of syntax checkers and autocomplete tools. In this article, we'll see how TensorFlow can be used as a generic programming language for implementing a toy syntax checker and autocomplete.

Advent of Code 2021 in pure TensorFlow - day 9

The day 9 challenge can be seen as a computer vision problem. TensorFlow contains some computer vision utilities that we'll use - like the image gradient - but it's not a complete framework for computer vision (like OpenCV). Anyway, the framework offers primitive data types like tf.TensorArray and tf.queue that we can use for implementing a flood-fill algorithm in pure TensorFlow and solve the problem.

Advent of Code 2021 in pure TensorFlow - day 8

The day 8 challenge is, so far, the most boring challenge faced 😅. Designing a TensorFlow program - hence reasoning in graph mode - would have been too complicated since the solution requires lots of conditional branches. A known AutoGraph limitation forbids variables to be defined in only one branch of a TensorFlow conditional if the variable is used afterward. That's why the solution is in pure TensorFlow eager.

Advent of Code 2021 in pure TensorFlow - day 7

The day 7 challenge is easily solvable with the help of the TensorFlow ragged tensors. In this article, we'll solve the puzzle while learning what ragged tensors are and how to use them.

Advent of Code 2021 in pure TensorFlow - day 6

The day 6 challenge has been the first one that obliged me to completely redesign for part 2 the solution I developed for part 1. For this reason, in this article, we'll see two different approaches to the problem. The former will be computationally inefficient but will completely model the problem, hence it will be easy to understand. The latter, instead, will be completely different and it will focus on the puzzle goal instead of the complete modeling.

Advent of Code 2021 in pure TensorFlow - day 5

The day 5 challenge is easily solvable in pure TensorFlow thanks to its support for various distance functions and the power of the tf.math package. The problem only requires some basic math knowledge to be completely solved - and a little bit of computer vision experience doesn't hurt.

Advent of Code 2021 in pure TensorFlow - day 4

Using tensors for representing and manipulating data is very convenient. This representation allows changing shape, organizing, and applying generic transformations to the data. TensorFlow - by design - executes all the data manipulation in parallel whenever possible. The day 4 challenge is a nice showcase of how choosing the correct data representation can easily simplify a problem.

Advent of Code 2021 in pure TensorFlow - day 3

A Solution to the AoC day 3 puzzle in pure TensorFlow. This challenge allows us to explore the TensorArray data type and find their limitations when used inside a static-graph context. We'll also use a tf.function experimental (but very useful) feature for avoiding useless retraces and reusing the same graph with tensors of different shapes.

Advent of Code 2021 in pure TensorFlow - day 2

A Solution to the AoC day 2 puzzle in pure TensorFlow. How to use Enums in TensorFlow programs and the limitations of tf.Tensor used for type annotation