Advent of Code 2021 in pure TensorFlow - day 12

Day 12 problem projects us the world of graphs. TensorFlow can be used to work on graphs pretty easily since a graph can be represented as an adjacency matrix, and thus, we can have a tf.Tensor containing our graph. However, the “natural” way of exploring a graph is using recursion, and as we’ll see in this article, this prevents us to solve the problem using a pure TensorFlow program, but we have to work only in eager mode.

Day 12: Passage Pathing

You can click on the title above to read the full text of the puzzle. The TLDR version is:

Our dataset is a list of connections with this format:

start-A
start-b
A-c
A-b
b-d
A-end
b-end

where start and end are your start and end points, the upper case letters (e.g A) are the “big caves”, and the lower case letters (e.g. c) are the “small caves”.

The above cave system looks roughly like this:

    start
    /   \
c--A-----b--d
    \   /
     end

The puzzle asks to find the number of distinct paths that start at start, and end at end, knowing that you can visit the “big caves” multiple times, and the small caves only once (per path).

Given these rules, there are 10 paths through this example cave system (graph)

start,A,b,A,c,A,end
start,A,b,A,end
start,A,b,end
start,A,c,A,b,A,end
start,A,c,A,b,end
start,A,c,A,end
start,A,end
start,b,A,c,A,end
start,b,A,end
start,b,end

Design phase: part one

The problem asks us to:

Model the relationships between the nodes
Finding the paths from start to end that follow the two rules about the big/small caves

Modeling the relationships between nodes is something we can achieve by mapping the neighboring relationship among nodes using an adjacency matrix. In practice, we can assign an unique index to a node (e.g. “start” = 0, “end” = 1, “A” = 2, …), and each index identifies a row/column in the adjacency matrix. Whenever a node is connected to another, we put a “1” in the correct location.

For example, a possible adjacency matrix for the example graph is

\[\begin{pmatrix} 0&1&1&0&0&0\\ 1&0&1&1&0&1\\ 1&1&0&0&1&1\\ 0&1&0&0&0&0\\ 0&0&1&0&0&0\\ 0&1&1&0&0&0\\ \end{pmatrix}\]

As it can be easily seen the adjacency matrix for a graph without self connections is a hollow matrix that means that all the elements along the diagonal are equal to zero. Moreover, the adjacency matrix is also symmetric (rows and columns can be transposed and the matrix doesn’t change - and it makes sense since if A is connected with B also B is connected with A).

Thus, we first need to create the mapping between the “human-readable” nodes and the IDs, then characterize these IDs. In fact, we have some rules to follow. All the upper case nodes (and thus the corresponding numeric IDs) can be visited only once, while the other can be visited multiple times.

Moreover, the start and end nodes are special, since if we reach the end the path is complete and we can exit from our search algorithm, while the start node shouldn’t be visited more than once (at the beginning and then ignored).

Input pipeline

We create a tf.data.Dataset object for reading the text file line-by-line as usual. We split every line looking at the - separator, and then create the relationships between the human-readable nodes and the indexes of the adjacency matrix

connections = tf.data.TextLineDataset("input").map(
    lambda string: tf.strings.split(string, "-")
)

# Create a map between human-readable node names and numeric indices
human_to_id = tf.lookup.experimental.MutableHashTable(tf.string, tf.int64, -1)
id_to_human = tf.lookup.experimental.MutableHashTable(tf.int64, tf.string, "")

human_to_id is the hashtable we use to create the mapping between the human-readable node name and a numeric ID. tf.lookup.experimental.MutableHashTable returns the third parameter (-1) when the lookup fails, and thus we can use this feature to check if we already created the mapping for a node.

Adjacency matrix

Since we want to create an adjacency matrix, we need to keep track of the adjacency relations between nodes (the indices we use for setting the 1 in the correct positions of the adjacency matrix), we use a tf.TensorArray for doing it.

idx = tf.Variable(0, dtype=tf.int64)
indices = tf.TensorArray(tf.int64, size=0, dynamic_size=True)
for edge in connections:
    node_i = human_to_id.lookup(edge[0])
    node_j = human_to_id.lookup(edge[1])

    if tf.equal(node_i, -1):
        human_to_id.insert([edge[0]], [idx])
        id_to_human.insert([idx], [edge[0]])
        node_i = tf.identity(idx)
        idx.assign_add(1)
    if tf.equal(node_j, -1):
        human_to_id.insert([edge[1]], [idx])
        id_to_human.insert([idx], [edge[1]])
        node_j = tf.identity(idx)
        idx.assign_add(1)

    ij = tf.convert_to_tensor([node_i, node_j])
    indices = indices.write(indices.size(), ij)

We loop over all the connections, check if the node has already been mapped and, if not, increase the counter (idx) and then create the mapping. Then, we create the relationship between the i and j nodes and save it inside the tf.TensorArray.

At the end of this loop, the indices variable contains all the pairs of coordinates where a connection is, while idx contains the ID of the latest mapped item, which also is the width (height) of the matrix.

We should also note that we created the relationship from i to j, hence we created an upper triangular matrix (a matrix whose elements below the main diagonals are zero). For modeling the full relationship i to j and j to i we can just transpose the matrix and sum it with itself.

indices = indices.stack()
indices = tf.reshape(indices, (-1, 2))
A = tf.tensor_scatter_nd_update(
    tf.zeros((idx, idx), dtype=tf.int64),
    indices,
    tf.repeat(tf.cast(1, tf.int64), tf.shape(indices)[0]),
)
A = A + tf.transpose(A)

Here we go! A is the adjacency matrix that completely models the graph.

The only thing missing is the modeling of the constraints. We need a list-like object (a tf.Tensor) containing all the IDs of the nodes that we can visit once, and the difference between this set of IDs and the full list of IDs for getting the list of the IDs we can visit multiple times.

keys = human_to_id.export()[0]
visit_only_once_human = tf.gather(
    keys,
    tf.where(
        tf.equal(
            tf.range(tf.shape(keys)[0]),
            tf.cast(tf.strings.regex_full_match(keys, "[a-z]+?"), tf.int32)
            * tf.range(tf.shape(keys)[0]),
        )
    ),
)
visit_only_once_human = tf.squeeze(visit_only_once_human)
visit_only_once_id = human_to_id.lookup(visit_only_once_human)

# Visit multiple times = {keys} - {only once}
visit_multiple_times_human = tf.sparse.to_dense(
    tf.sets.difference(
        tf.reshape(keys, (1, -1)), tf.reshape(visit_only_once_human, (1, -1))
    )
)
visit_multiple_times_human = tf.squeeze(visit_multiple_times_human)
visit_multiple_times_id = human_to_id.lookup(visit_multiple_times_human)

It’s interesting how we used regular expressions in TensorFlow with the tf.strings.regex_full_match function for searching for nodes all in lower case ([a-z]+? - visit only once).

Moreover, we used the set difference together with the conversion from sparse to dense to get a tf.Tensor containing all the upper-case nodes (visit multiple times).

Visiting the graph

The natural way of facing this problem is by using recursion. In fact, the problem can be modeled as follows

Concatenate the input node to the current_path.

Is the current node the end node? If yes, we found the path. Increment by 1 the counter of the paths. Return.

Otherwise, find the neighborhood of the current node.

For every neighbor verify if the neighbor is in the “visit only once” list and it’s in the current path. If yes, skip this neighbor.

Otherwise, visit the neighbor (recursive step).

Being the adjacency matrix symmetric, the step of “find the neighborhood of the current node” can be simply modeled as a search for the 1 in the row whose id is node_id.

@tf.function
def _neigh_ids(A, node_id):
    return tf.squeeze(tf.where(tf.equal(A[node_id, :], 1)))

Since the proposed solution uses the recursion we cannot decorate the visit function with @tf.function because tf.function does not support recursion. Thus, we are forced to solve this problem in eager mode.

Going back to the problem, the puzzle asks us to count the number of paths hence we need a tf.Variable for counting the number of paths found. Moreover, instead of repeating on every iteration a lookup in the hastable, we can extract the IDs of the start and end nodes.

start_id, end_id = human_to_id.lookup(["start", "end"])
count = tf.Variable(0, dtype=tf.int64)

The visit function can be defined by implementing precisely what has been proposed in the algorithm described above.

def _visit(A: tf.Tensor, node_id: tf.Tensor, path: tf.Tensor):
    current_path = tf.concat([path, [node_id]], axis=0)
    if tf.equal(node_id, end_id):
        count.assign_add(1)
        return current_path

    neighs = _neigh_ids(A, node_id)
    neigh_shape = tf.shape(neighs)
    if tf.equal(tf.size(neighs), 0):
        return current_path

    if tf.equal(tf.size(neigh_shape), 0):
        neighs = tf.expand_dims(neighs, 0)
        neigh_shape = tf.shape(neighs)

    for idx in tf.range(neigh_shape[0]):
        neigh_id = neighs[idx]
        if tf.logical_and(
            tf.reduce_any(tf.equal(neigh_id, visit_only_once_id)),
            tf.reduce_any(tf.equal(neigh_id, current_path)),
        ):
            continue
        # Recursion step
        _visit(A, neigh_id, current_path)
    return current_path

Execution

To call the _visit function we need:

A the adjacency matrix
node_id a node for starting the visit
path the initial path (a tf.Tensor containing only the ID of the start node).

# All the paths starts from start
neighs = _neigh_ids(A, start_id)
for idx in tf.range(tf.shape(neighs)[0]):
    neigh_id = neighs[idx]
    _visit(A, neigh_id, [start_id])

tf.print("Part one: ", count)

Here we go! Part one is solved! Let’s see what part two is about.

Design phase: part 2

The puzzle relaxes a constraint: in a path, we can pass twice for a single small cave. For example start,A,b,A,b,A,c,A,end is a valid path because we visited b twice and c once.

Given this relaxed constraint, how many paths through the graph are there?

Part two implementation

It is possible to simply add the new constraint to the previous algorithm. We can define an inner_count tf.Variable for counting the number of times small cave in the current path has been visited. When this counter is greater or equal to 2 then the path we are following is invalid and we can return it.

count.assign(0)
inner_count = tf.Variable(0)

def _visit2(A: tf.Tensor, node_id: tf.Tensor, path: tf.Tensor):
    current_path = tf.concat([path, [node_id]], axis=0)

    # Skip start
    if tf.equal(node_id, start_id):
        return current_path

    # Success on end node
    if tf.equal(node_id, end_id):
        # paths.append(current_path)
        count.assign_add(1)
        return current_path

    # More than 2 lowercase visited twice
    visited, visited_idx, visited_count = tf.unique_with_counts(current_path)
    visited = tf.gather_nd(visited, tf.where(tf.greater(visited_count, 1)))
    inner_count.assign(0)
    for idx in tf.range(tf.shape(visited)[0]):
        if tf.reduce_any(tf.equal(visited[idx], visit_only_once_id)):
            inner_count.assign_add(1)

        if tf.greater_equal(inner_count, 2):
            return current_path

    neighs = _neigh_ids(A, node_id)
    neigh_shape = tf.shape(neighs)
    if tf.equal(tf.size(neighs), 0):
        return current_path

    if tf.equal(tf.size(neigh_shape), 0):
        neighs = tf.expand_dims(neighs, 0)
        neigh_shape = tf.shape(neighs)

    for idx in tf.range(neigh_shape[0]):
        neigh_id = neighs[idx]

        # already visited twice and is lowcase
        if tf.logical_and(
            tf.reduce_any(tf.equal(neigh_id, visit_only_once_id)),
            tf.greater(
                tf.reduce_sum(tf.cast(tf.equal(neigh_id, current_path), tf.int32)),
                1,
            ),
        ):
            continue

        _visit2(A, neigh_id, current_path)

    return current_path

The execution is performed in the very same way:

neighs = _neigh_ids(A, start_id)
for idx in tf.range(tf.shape(neighs)[0]):
    neigh_id = neighs[idx]
    _visit2(A, neigh_id, [start_id])

tf.print("Part two: ", count)

It takes a lot of time since we are effectively doing the graph visit, even if the puzzle asked us to just count the number of paths (without modeling them!). Hence for sure, there exists a better solution, very similar to the solution I found to the Day 6 puzzle part two, but for this time the slow solution will be good enough. In fact, after about 20 minutes of computations the output produced is corrected :)

Conclusion

You can see the complete solution in folder 12 on the dedicated Github repository: https://github.com/galeone/tf-aoc.

Solving this problem in TensorFlow demonstrated how the adjacency matrix representation is really handy while working with graphs, but also demonstrated that is impossible to write recursive pure-TensorFlow programs (with `tf.function).

If you missed the articles about the previous days’ solutions, here’s a handy list:

I already solved also part 1 of day 13, but I don’t know if I have the time for solving the puzzles and writing the articles (holidays have reached the end some week ago :D).

So maybe, for the 2021 Advent of Code in pure TensorFlow is all.

For any feedback or comment, please use the Disqus form below - thanks!

Advent of Code 2021 in pure TensorFlow - day 12

Day 12: Passage Pathing

Design phase: part one

Input pipeline

Adjacency matrix

Visiting the graph

Execution

Design phase: part 2

Part two implementation

Conclusion

Don't you want to miss the next article? Do you want to be kept updated?
Subscribe to the newsletter!

From Vertex AI SDK to Google Gen AI SDK: Service Account Authentication for Python and Go

Getting back to the EU: from Google Cloud to Self-Hosted EU Infrastructure

Using AI for Coding: My Journey with Cline and Large Language Models

Fixing the code signing and notarization issues of Unreal Engine (5.3+) projects

The (Hidden?) Costs of Vertex AI Resource Pools: A Cautionary Tale

Building a RAG for tabular data in Go with PostgreSQL & Gemini

Using Gemini in a Go application: limits and details

Custom model training & deployment on Google Cloud using Vertex AI in Go

Integrating third-party libraries as Unreal Engine plugins: solving the ABI compatibility issues on Linux when the source code is available

AutoML pipeline for tabular data on VertexAI in Go

Advent of Code 2021 in pure TensorFlow - day 12

Day 12: Passage Pathing

Design phase: part one

Input pipeline

Adjacency matrix

Visiting the graph

Execution

Design phase: part 2

Part two implementation

Conclusion

Don't you want to miss the next article? Do you want to be kept updated?Subscribe to the newsletter!

If you liked the article: share it!

Related Posts

From Vertex AI SDK to Google Gen AI SDK: Service Account Authentication for Python and Go

Getting back to the EU: from Google Cloud to Self-Hosted EU Infrastructure

Using AI for Coding: My Journey with Cline and Large Language Models

Fixing the code signing and notarization issues of Unreal Engine (5.3+) projects

The (Hidden?) Costs of Vertex AI Resource Pools: A Cautionary Tale

Building a RAG for tabular data in Go with PostgreSQL & Gemini

Using Gemini in a Go application: limits and details

Custom model training & deployment on Google Cloud using Vertex AI in Go

Integrating third-party libraries as Unreal Engine plugins: solving the ABI compatibility issues on Linux when the source code is available

AutoML pipeline for tabular data on VertexAI in Go

Don't you want to miss the next article? Do you want to be kept updated?
Subscribe to the newsletter!