Advent of Code 2021 in pure TensorFlow - day 1


Even if it's a bit late*, I decided to start solving all the puzzles of the Advent of Code (AoC) using TensorFlow.

One may wonder why solve the AoC puzzles using TensorFlow, and what it even means to solve them with TensorFlow. First of all, it's a nice challenge for me, to see if I'm able to design pure TensorFlow solutions for these puzzles. But perhaps more important than the personal challenge is the demonstration of the power of the framework. Solving a coding puzzle with TensorFlow doesn't mean throwing fancy machine learning stuff (without any reason) at the problem. On the contrary, I want to demonstrate the flexibility - and the limitations - of the framework, showing that TensorFlow can be used to solve any kind of problem and that the resulting solutions have tons of advantages with respect to solutions developed in any other programming language. In fact, I see TensorFlow more as a programming language than as a mere framework. This is a strong statement, but it's justified (IMHO) by the different way of reasoning one must follow when designing pure-TensorFlow programs.

TensorFlow programs are self-contained descriptions of computation. The inference of a trained machine learning model is a TensorFlow program, but the framework is so flexible that it allows describing - and exporting! - generic programs.

I’ll try to write a short article for every problem I solve using TensorFlow (so this is the beginning of a series, I hope!), highlighting the peculiarities of the provided solutions and explaining how to reason when creating TensorFlow programs.

*Today (12 Dec 2021) is the 11th day of the Advent of Code in my timezone (UTC+1).

Advent of Code

To give a bit of context, let’s recap what AoC is. From the about section:

Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like.

This is pretty much all we need to know. Every year, from the 1st to the 25th of December, a new puzzle is released every day. As I said, I'm a bit late to the party since I decided to start this AoC journey only today, but after all, every cloud has a silver lining: I can safely stream all the solutions I write without spoiling the challenge for anyone, since the various daily leaderboards are already full.

Day 1: Sonar Sweep: part one

You can click on the title above to read the full text of the puzzle. The TLDR version is:

You are given a set of numbers like

199
200
208
210
200
207
240
269
260
263

The goal of the puzzle is to count the number of times there's an increase. In the example above, the changes are

199 (N/A - no previous measurement)
200 (increased)
208 (increased)
210 (increased)
200 (decreased)
207 (increased)
240 (increased)
269 (increased)
260 (decreased)
263 (increased)

so the answer to the puzzle is 7.

We have all the information required to solve the task. On the AoC page we can download the real input: a text file with the same format presented above, which we can put next to the TensorFlow program we are implementing.

Design phase

The nature of this task is sequential: we read the file line by line, convert each line into a number, and compare the i-th number with the (i-1)-th. If there's an increase, we add one to a counter.
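In plain Python the logic would take a few lines (a reference sketch only, to fix the idea before translating it into a TensorFlow program):

count, prev = 0, None
with open("input") as f:
    for line in f:
        number = int(line)
        if prev is not None and number > prev:
            count += 1  # an increase with respect to the previous value
        prev = number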

Hence we need a counter, and another variable to keep track of the (i-1)-th number while we process the i-th. Having variables means that our TensorFlow program needs a state. Every time the word state occurs during the design phase of a TensorFlow program, this article should come to mind.

A state is nothing but a tf.Variable, but a tf.Variable in a TensorFlow program (which, I want to stress to the reader, is a description of the computation) is a node in a graph that should be declared only once. A tf.Variable behaves like a Python variable in eager mode, but TensorFlow programs are (and must be) executed in graph mode, where tf.Variables are nodes that are declared once and used repeatedly.
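To make the pitfall concrete, here's a minimal sketch (a toy example, not part of the solution): creating the tf.Variable inside the @tf.function-decorated function makes TensorFlow complain, because the variable would be re-created at every trace.

@tf.function
def broken_counter():
    # Creating a variable here means re-creating it at every trace:
    # TensorFlow raises "tf.function-decorated function tried to create
    # variables on non-first call" when the function is traced again.
    count = tf.Variable(0)
    count.assign_add(1)
    return count

Declaring the variables once - in the __init__ of a tf.Module - and only assigning to them inside the tf.function avoids the problem, and that's exactly the structure used below.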

Input pipeline

Since the input is sequential, we can use the input pipelines offered by tf.data to efficiently read the data, process it, and loop over it.

import tensorflow as tf

# Read the puzzle input line by line, converting every line to a tf.int64
dataset = tf.data.TextLineDataset("input").map(
    lambda string: tf.strings.to_number(string, out_type=tf.int64)
)

Straightforward. We read the "input" file using the TextLineDataset object. This tf.data.Dataset specialization automatically creates a new element for every line in the file. Through the map method, we apply the conversion from tf.string to tf.int64.

TensorFlow, differently from Python, is strictly statically typed. Every operation must be performed over operands of the same type: implicit conversions are not allowed, hence casts and type conversions must be used extensively.
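A tiny illustration of this strictness (toy values, unrelated to the puzzle):

a = tf.constant(1, dtype=tf.int64)
b = tf.constant(2, dtype=tf.int32)
# a + b would fail: TensorFlow refuses to mix int64 and int32 operands
c = a + tf.cast(b, tf.int64)  # the explicit cast makes the types match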

Counting increments

class IncreasesCounter(tf.Module):
    """Stateful counter. Counts the number of "increases"."""

    def __init__(self):
        # State: the number of increases counted so far
        self._count = tf.Variable(0, trainable=False, dtype=tf.int64)
        # State: the previously processed value
        self._prev = tf.Variable(0, trainable=False, dtype=tf.int64)

    @tf.function
    def __call__(self, dataset: tf.data.Dataset) -> tf.Tensor:
        """
        Args:
            dataset: the dataset containing the ordered sequence of numbers
                     to process.
        Returns:
            The number of increases. tf.Tensor, dtype=tf.int64
        """
        self._prev.assign(next(iter(dataset.take(1))))
        for number in dataset.skip(1):
            if tf.greater(number, self._prev):
                self._count.assign_add(1)
            self._prev.assign(number)
        return self._count

The IncreasesCounter class is a complete TensorFlow program. In __init__ we declare and initialize the state variables, and in the __call__ method we implement the logic required by the puzzle. Note that to be a TensorFlow program, the method must be decorated with @tf.function.

Peculiarities:

  1. The assignment must be performed using the assign method. Using the = operator would overwrite the Python variable, without performing the assignment operation in the graph!
  2. Comparisons like > are better written using their TensorFlow equivalents (e.g. tf.greater). AutoGraph can convert them (you could write >), but it's less idiomatic and I recommend not relying upon the automatic conversion, to keep full control.
  3. Extracting the first element from the dataset is a bit "strange", since it requires us to:
    • Create a dataset with a single element: .take(1)
    • Create an iterator from the dataset object (iter)
    • Call next over the iterator to extract the element.
  4. To skip the element already assigned to the self._prev variable before the loop execution, we need to create another dataset that starts from the second element, by calling skip(1) on the dataset object (see the sketch after this list).
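In isolation, the extraction and skip described in points 3 and 4 look like this (a toy sketch on a range dataset):

ds = tf.data.Dataset.range(5)   # toy dataset: 0, 1, 2, 3, 4
first = next(iter(ds.take(1)))  # tf.Tensor(0, dtype=tf.int64)
rest = ds.skip(1)               # new dataset containing 1, 2, 3, 4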

Execution

counter = IncreasesCounter()
increases = counter(dataset)
tf.print("[part one] increases: ", increases)

We just create an instance of IncreasesCounter and call it over the dataset previously created. Note that we use tf.print and not print: tf.print is the operation to use because it also works in graph mode, while print is executed only during the tracing phase (and in eager mode, which we don't want to use).
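The difference is easy to observe with a toy function (a sketch unrelated to the puzzle):

@tf.function
def f(x):
    print("tracing")       # Python side effect: runs only while tracing
    tf.print("executing")  # graph operation: runs at every call
    return x + 1

f(tf.constant(1))  # prints "tracing" and then "executing"
f(tf.constant(2))  # same input signature, no retracing: only "executing"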

The execution gives the correct result :) and this brings us to part 2.

Day 1: Sonar Sweep: part two

TLDR: instead of considering the single values, consider a three-number sliding window. The example above now becomes:

199  A
200  A B
208  A B C
210    B C D
200  E   C D
207  E F   D
240  E F G
269    F G H
260      G H
263        H

Note: the input doesn't change; the A, B, C, and so on, are here only to visualize the sliding windows. The goal is to sum all the numbers in a window (e.g. 199+200+208 for window A) and compare the sum with the sum of the previous sliding window.

A: 607 (N/A - no previous sum)
B: 618 (increased)
C: 618 (no change)
D: 617 (decreased)
E: 647 (increased)
F: 716 (increased)
G: 769 (increased)
H: 792 (increased)

In this case the answer is 5.

Design phase - part two

We already have a TensorFlow program that can detect increases between adjacent numbers in a dataset. So, we can just feed the program a different input to get the correct result.

TensorFlow offers us several functions for working with datasets. As we've seen, it is possible to skip elements and create a new dataset, apply a transformation function with map, and so on. Thus, we can solve this challenge by creating a dataset made of the sums of the various sliding windows.
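As a side note, tf.data can also express a sliding window directly through the window method; here's a sketch of this alternative (not the approach followed in the next section, which builds the windows by interleaving shifted datasets):

sums = dataset.window(3, shift=1, drop_remainder=True).flat_map(
    lambda window: window.batch(3)  # each window becomes a single tensor
).map(tf.reduce_sum)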

Input pipeline - part two

The idea is to create 3 different datasets by shifting the original one by 1 element every time, create batches of 3 elements (the sliding windows), and then sum all the values in each batch (window).

The 3 datasets can be merged by interleaving the values, using the order 1,2,3, 1,2,3, and so on. This order means: pick the i-th element from the first dataset, then the i-th element from the second dataset, then the i-th element from the third dataset, then increment i. Repeat until all the datasets have consumed all their elements.

datasets = [dataset, dataset.skip(1), dataset.skip(2)]
for idx, shifted in enumerate(datasets):
    # Batch 3 consecutive elements (a sliding window) and sum them
    datasets[idx] = shifted.batch(3, drop_remainder=True).map(tf.reduce_sum)

interleaved_dataset = tf.data.Dataset.choose_from_datasets(
    datasets, tf.data.Dataset.range(3).repeat()
)
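As a quick sanity check (optional, not part of the final program), iterating over interleaved_dataset with the sample input should print the window sums in order:

for window_sum in interleaved_dataset:
    tf.print(window_sum, end=" ")  # 607 618 618 617 647 716 769 792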

Execution - part two

The counter instance of IncreasesCounter has a state, hence we can't re-use it, since we haven't added a method to reset the state. Thus, we need to create a new instance and pass it the interleaved_dataset to get the correct result.

counter = IncreasesCounter()
increases = counter(interleaved_dataset)
tf.print("[part two] increases: ", increases)

The first puzzle is completely solved :)

Conclusion

Solving the AoC puzzles with TensorFlow is not just fun (come on, it's a nice challenge to think about all these nuances :)), but it can also be a good way to design very efficient solutions. In fact, I haven't yet spent a word on the advantages of these implementations, but there are many.

  • The solution can run on any hardware. If you have a supported GPU, it runs on it.
  • Any operation that can be executed in parallel is automatically parallelized by the framework.
  • These solutions are language agnostic. Yes, we designed and executed them in Python, but we could export them as SavedModels and re-use the same logic in any other programming language, since all we need is the TensorFlow C runtime. For example, a SavedModel of this program can be executed in Go using tfgo.

I’m doing this for fun (and I’m having fun, really), so expect another article for day 2 coming soon!

For any feedback or comment, please use the Disqus form below - thanks!

PS: I’m posting all the solutions also on GitHub, you can find them here: https://github.com/galeone/tf-aoc.

