Advent of Code 2021 in pure TensorFlow - day 1

Even if it’s a bit late* I decided to start solving all the puzzles of the Advent of Code (AoC) using TensorFlow.

One may wonder why solve the AoC puzzles using TensorFlow, and what it even means to solve them with TensorFlow. First of all, it’s a nice challenge for myself, to see if I’m able to design pure TensorFlow solutions to these problems. But perhaps more important than the personal challenge is the demonstration of the power of the framework. In fact, solving a coding puzzle with TensorFlow doesn’t mean throwing fancy machine learning stuff (without any reason) at the problem. On the contrary, I want to demonstrate the flexibility - and the limitations - of the framework, showing that TensorFlow can be used to solve any kind of problem and that the resulting solutions have tons of advantages with respect to solutions developed in any other programming language. In fact, I see TensorFlow more as a programming language than as a mere framework. This is a strong statement, but it’s justified (IMHO) by the different way of reasoning one must follow when designing pure-TensorFlow programs.

TensorFlow programs are self-contained descriptions of computation. The inference of a trained machine learning model is a TensorFlow program, but the framework is so flexible that it allows describing - and exporting! - generic programs.

I’ll try to write a short article for every problem I solve using TensorFlow (so this is the beginning of a series, I hope!), highlighting the peculiarities of the provided solutions and explaining how to reason when creating TensorFlow programs.

*Today (12 Dec 2021) is the 11th day of the Advent of Code in my timezone (UTC+1).

Advent of Code

To give a bit of context, let’s recap what AoC is. From the about section:

Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like.

This is pretty much all we need to know. Every year, from the 1st to the 25th of December, we have a puzzle to solve. As I said, I’m a bit late to the party since I decided to start this AoC journey only today, but after all, every cloud has a silver lining. In fact, I can safely stream all the solutions I write without ruining the challenge for anyone, since the various daily leaderboards are already full.

Day 1: Sonar Sweep: part one

You can click on the title above to read the full text of the puzzle. The TLDR version is:

You are given a list of numbers like

199
200
208
210
200
207
240
269
260
263

The puzzle goal is to count the number of times there’s an increase. In the example above, the changes are

199 (N/A - no previous measurement)
200 (increased)
208 (increased)
210 (increased)
200 (decreased)
207 (increased)
240 (increased)
269 (increased)
260 (decreased)
263 (increased)

so the answer to the puzzle is 7.
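Before moving to the TensorFlow implementation, the expected result can be sanity-checked with a few lines of plain Python:

```python
# Example measurements from the puzzle text
measurements = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]

# Count how many times a measurement is larger than the previous one
increases = sum(
    1 for prev, curr in zip(measurements, measurements[1:]) if curr > prev
)
print(increases)  # 7
```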

We have all the information required to solve the task. On the AoC page, we can download the real input: a text file with the same format presented above, which we can put next to the TensorFlow program we are implementing.

Design phase

The nature of this task is sequential: we read the file line by line, convert each line into a number, and compare the i-th number with the (i-1)-th. If there’s an increase, we add one to the counter.

Hence we need a counter, and another variable to keep track of the (i-1)-th number while we process the i-th. Having variables means that our TensorFlow program needs a state. Every time the word state occurs during the design phase of a TensorFlow program, this article should come to mind.

A state is nothing but a tf.Variable, but a tf.Variable in a TensorFlow program (which, I want to stress to the reader, is a description of the computation) is a node in a graph that should be declared once. A tf.Variable behaves like a Python variable in eager mode, but TensorFlow programs are (and must be) executed in graph mode, where tf.Variables are nodes that are declared once and used repeatedly.
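As a minimal sketch of this declare-once pattern (a toy counter, not yet the puzzle solution), the variable is created a single time in the module constructor, and the graph mutates it through assign operations:

```python
import tensorflow as tf


class Counter(tf.Module):
    """Declares its state once, then mutates it via assign operations."""

    def __init__(self):
        # The variable is a node in the graph, created a single time
        self._count = tf.Variable(0, trainable=False, dtype=tf.int64)

    @tf.function
    def __call__(self):
        # assign_add updates the variable node; using `=` would only
        # rebind the Python name, leaving the graph untouched
        self._count.assign_add(1)
        return self._count


counter = Counter()
for _ in range(3):
    result = counter()
print(int(result))  # 3
```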

Input pipeline

The task being sequential, we can use the input pipelines offered by to efficiently read the data, process it, and loop over it.

dataset ="input").map(
    lambda string: tf.strings.to_number(string, out_type=tf.int64)
)

Straightforward. We read the “input” file using the TextLineDataset object. This specialization automatically creates a new element for every new line in the file. Through the map method, we apply the conversion from tf.string to tf.int64.

TensorFlow, differently from Python, is strictly statically typed. Every operation must be performed on operands of the same type: implicit conversions are not allowed, hence explicit casts and type conversions are used extensively.
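A quick sketch of this strictness, with two toy tensors of different dtypes:

```python
import tensorflow as tf

a = tf.constant(1, dtype=tf.int32)
b = tf.constant(2, dtype=tf.int64)

try:
    _ = a + b  # no implicit conversion: this raises an error
except (TypeError, tf.errors.InvalidArgumentError):
    print("implicit conversion not allowed")

# An explicit cast makes the dtypes match
c = tf.cast(a, tf.int64) + b
print(int(c))  # 3
```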

Counting increments

class IncreasesCounter(tf.Module):
    """Stateful counter. Counts the number of "increases"."""

    def __init__(self):
        self._count = tf.Variable(0, trainable=False, dtype=tf.int64)
        self._prev = tf.Variable(0, trainable=False, dtype=tf.int64)

    @tf.function
    def __call__(self, dataset: -> tf.Tensor:
        """Counts the number of increases in the dataset.

        Args:
            dataset: the dataset containing the ordered sequence of numbers
                     to process.
        Returns:
            The number of increases. tf.Tensor, dtype=tf.int64
        """
        # Initialize the state with the first element of the dataset
        self._prev.assign(next(iter(dataset.take(1))))
        for number in dataset.skip(1):
            if tf.greater(number, self._prev):
                self._count.assign_add(1)
            self._prev.assign(number)
        return self._count

The IncreasesCounter class is a complete TensorFlow program. In the __init__ we declare and initialize the state variables, and in the __call__ method we implement the logic required by the puzzle. Note that for the method to be a TensorFlow program, it must be decorated with @tf.function.


  1. The assignment must be performed using the assign method. Using the = operator will overwrite the Python variable, and not perform the assignment operation in the graph!
  2. All the comparisons like > are better written using their TensorFlow equivalents (e.g. tf.greater). AutoGraph can convert them (you could write >), but it’s less idiomatic and I recommend not relying upon the automatic conversion, to keep full control.
  3. Extracting the first element from the dataset is a bit “strange” since it requires us to:
    • Create a dataset with a single element .take(1)
    • Create an iterator from the dataset object (iter)
    • Call next over the iterator to extract the element.
  4. To skip the element already assigned to the self._prev variable before the loop execution, we need to create another dataset that starts from the second element, by calling skip(1) on the dataset object.
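In isolation, the extraction and skip look like this (a sketch using a small in-memory dataset instead of the input file):

```python
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices([199, 200, 208, 210])

# take(1) -> a dataset with a single element; iter + next extract it
first = next(iter(dataset.take(1)))
print(int(first))  # 199

# skip(1) -> a new dataset that starts from the second element
remaining = [int(n) for n in dataset.skip(1)]
print(remaining)  # [200, 208, 210]
```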


counter = IncreasesCounter()
increases = counter(dataset)
tf.print("[part one] increases: ", increases)

Just create an instance of the IncreasesCounter and call it over the dataset previously created. Note that we use tf.print and not print: tf.print is the operation to use because it also works in graph mode, while print is executed only during the tracing phase (and in eager mode, which we don’t want to use).

The execution gives the correct result :) and this brings us to part 2.

Day 1: Sonar Sweep: part two

TLDR: instead of considering the single values, consider a three-number sliding window. The example above now becomes:

199  A
200  A B
208  A B C
210    B C D
200  E   C D
207  E F   D
240  E F G
269    F G H
260      G H
263        H

Note: the input doesn’t change; the A B C labels, and so on, are here only to visualize the sliding windows. The goal is to sum all the numbers in a window (e.g. 199+200+208 for window A) and compare the sum with the sum of the previous sliding window.

A: 607 (N/A - no previous sum)
B: 618 (increased)
C: 618 (no change)
D: 617 (decreased)
E: 647 (increased)
F: 716 (increased)
G: 769 (increased)
H: 792 (increased)

In this case the answer is 5.
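As for part one, the expected result can be checked with plain Python before touching the pipeline:

```python
measurements = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]

# Sum of every three-number sliding window
sums = [sum(measurements[i : i + 3]) for i in range(len(measurements) - 2)]
print(sums)  # [607, 618, 618, 617, 647, 716, 769, 792]

increases = sum(1 for prev, curr in zip(sums, sums[1:]) if curr > prev)
print(increases)  # 5
```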

Design phase - part two

We already have a TensorFlow program that can detect increases between adjacent numbers in a dataset. So, we can just feed the program a different input to get the correct result.

TensorFlow offers us several functions for working with datasets. As we’ve seen, it is possible to skip elements and create a new dataset, apply a transformation function with map, and so on. Thus, we can solve this challenge by creating a dataset containing the sums of the various sliding windows.

Input pipeline - part two

The idea is to create 3 different datasets, each shifted by 1 element from the previous one, create batches of 3 elements (the sliding windows), and then sum all the values in each batch (window).

The 3 datasets can be merged by interleaving the values, using the order 1,2,3, 1,2,3 and so on. This order means: pick the i-th element from the first dataset, then the i-th element from the second dataset, then the i-th element from the third dataset, then increment i. Repeat until the datasets run out of elements.

datasets = [dataset, dataset.skip(1), dataset.skip(2)]
for idx, dataset in enumerate(datasets):
    datasets[idx] = dataset.batch(3, drop_remainder=True).map(tf.reduce_sum)

interleaved_dataset = tf.data.Dataset.choose_from_datasets(
    datasets, tf.data.Dataset.range(3).repeat()
)
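On the example measurements, the batching-and-interleaving idea can be verified end to end; here is a self-contained sketch that uses tf.data.Dataset.choose_from_datasets (one way to realize the 1,2,3 picking order described above):

```python
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices(
    [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]
)

# Three shifted views, batched into 3-element windows and summed
datasets = [dataset, dataset.skip(1), dataset.skip(2)]
for idx, shifted in enumerate(datasets):
    datasets[idx] = shifted.batch(3, drop_remainder=True).map(tf.reduce_sum)

# Pick one element from each dataset in turn: 1,2,3, 1,2,3, ...
choice = tf.data.Dataset.range(3).repeat()
interleaved = tf.data.Dataset.choose_from_datasets(datasets, choice)

window_sums = [int(n) for n in interleaved]
print(window_sums)  # [607, 618, 618, 617, 647, 716, 769, 792]
```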

Execution - part two

The instance counter of IncreasesCounter has a state, hence we can’t re-use it because we haven’t added a method to reset the state. Thus, we need to create a new instance and pass it the interleaved_dataset to get the correct result.

counter = IncreasesCounter()
increases = counter(interleaved_dataset)
tf.print("[part two] increases: ", increases)

The first puzzle is completely solved :)


Solving the AoC puzzles with TensorFlow is not just fun (come on, it’s a nice challenge thinking about all these nuances :)), it can also be a good way to design very efficient solutions. In fact, I haven’t yet spent a word about the advantages of these implementations, but there are many.

  • The solution can run on any hardware. If you have a supported GPU, it runs on it.
  • Any operation that can be executed in parallel, is automatically parallelized by the framework.
  • These solutions are language agnostic. Yes, we designed and executed them in Python, but we could export them as SavedModel, and re-use the same logic in any other programming language since all we need is the TensorFlow C runtime. For example, a SavedModel of this program can be executed in Go using tfgo.

I’m doing this for fun (and I’m having fun, really), so expect another article for day 2 coming soon!

For any feedback or comment, please use the Disqus form below - thanks!

PS: I’m posting all the solutions also on GitHub, you can find them here:

