Even if it’s a bit late*, I decided to start solving all the Advent of Code (AoC) puzzles using TensorFlow.
One may wonder why anyone would solve the AoC puzzles with TensorFlow, and what that even means. First of all, it is a nice personal challenge: I want to see whether I can design pure TensorFlow solutions for these puzzles. But perhaps more important than the personal challenge is the demonstration of the power of the framework. Solving a coding puzzle with TensorFlow doesn’t mean throwing fancy machine learning stuff at the problem for no reason. On the contrary, I want to demonstrate the flexibility - and the limitations - of the framework, showing that TensorFlow can be used to solve any kind of problem and that the resulting solutions have tons of advantages over solutions written in other programming languages. In fact, I see TensorFlow more as a programming language than as a mere framework. This is a strong statement, but it’s justified (IMHO) by the different way of reasoning one must follow when designing pure-TensorFlow programs.
TensorFlow programs are self-contained descriptions of computation. The inference of a trained machine learning model is a TensorFlow program, but the framework is so flexible that it allows describing - and exporting! - generic programs.
I’ll try to write a short article for every problem I solve using TensorFlow (so this is the beginning of a series, I hope!), highlighting the peculiarities of the provided solutions and explaining how to reason when creating TensorFlow programs.
*Today (12 Dec 2021) is the 11th day of the Advent of Code in my timezone (UTC+1).
Advent of Code
To give a bit of context, let’s recap what AoC is. From the about section:
Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like.
This is pretty much what we need to know. Every year, from the 1st to the 25th of December, there is a new puzzle to solve each day. As I said, I’m a bit late to the party since I decided to start this AoC journey only today, but after all, every cloud has a silver lining. In fact, I can safely stream all the solutions I write without ruining the challenge for anyone, since the various daily leaderboards are all full.
Day 1: Sonar Sweep: part one
You can click on the title above to read the full text of the puzzle. The TLDR version is:
You are given a list of numbers like
199
200
208
210
200
207
240
269
260
263
The goal of the puzzle is to count the number of times a number is greater than the previous one. In the example above, the changes are
199 (N/A - no previous measurement)
200 (increased)
208 (increased)
210 (increased)
200 (decreased)
207 (increased)
240 (increased)
269 (increased)
260 (decreased)
263 (increased)
so the answer to the puzzle is 7.
We have all the information required to solve the task. On the AoC page, we can download the real input, which is a text file with the same format presented above (one number per line), and put it next to the TensorFlow program we are implementing.
The nature of this task is sequential. We should read the file line by line, convert the line into a number, and compare the i-th number with the (i-1)th. If there’s an increase, add one to the counter.
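Before moving to TensorFlow, the sequential logic described above can be sketched in plain Python as a reference. This is only an illustration on the sample data from the puzzle text, not the solution developed in this article; the function name `count_increases` is made up for the example.

```python
from typing import Iterable


def count_increases(numbers: Iterable[int]) -> int:
    """Count how many numbers are greater than the one before them."""
    count = 0
    prev = None  # the first number has no previous measurement
    for number in numbers:
        if prev is not None and number > prev:
            count += 1
        prev = number
    return count


# Sample data from the puzzle text.
sample = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]
sample_increases = count_increases(sample)
```

On the sample data this gives 7, matching the puzzle text. Note the two values carried across iterations: `count` and `prev`.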
Hence we need a counter, and another variable to keep track of the (i-1)th number, while we process the i-th. Having variables means that our TensorFlow program needs to create a state. Every time the word state occurs during the design phase of a TensorFlow program, this article should come to mind.
A state is nothing but a tf.Variable, but a tf.Variable in a TensorFlow program (which, I want to stress, is a description of the computation) is a node in a graph that must be declared once. A tf.Variable behaves like a Python variable in eager mode, but TensorFlow programs are (and must be) executed in graph mode, where tf.Variable objects are graph nodes that are declared once and used repeatedly.
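As a minimal sketch of the "declare once, use repeatedly" idea, the snippet below creates the variable in the constructor and only updates it inside the graph function. The class name `Accumulator` is hypothetical and exists only for this illustration.

```python
import tensorflow as tf


class Accumulator(tf.Module):
    """Hypothetical module that keeps a running total as its state."""

    def __init__(self):
        super().__init__()
        # The variable is declared exactly once, outside the graph function.
        self._total = tf.Variable(0, trainable=False, dtype=tf.int64)

    @tf.function
    def __call__(self, value):
        # Inside the graph the variable is only updated, never re-created.
        self._total.assign_add(value)
        return self._total.read_value()


accumulator = Accumulator()
accumulator(tf.constant(2, dtype=tf.int64))
total = accumulator(tf.constant(3, dtype=tf.int64))
```

Both calls reuse the same variable node, so the second call sees the value left by the first one.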
Since the task is sequential, we can use the input pipelines offered by TensorFlow to efficiently read the data, process it, and loop over it.
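import tensorflow as tf

# Read the "input" file line by line and convert every line to an int64 number.
dataset = tf.data.TextLineDataset("input").map(
    lambda string: tf.strings.to_number(string, out_type=tf.int64)
)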
Straightforward. We read the “input” file using the TextLineDataset object. This tf.data.Dataset specialization automatically creates a new element for every line in the file. Through the map method, we convert every element from a string to a tf.int64 number.
TensorFlow, unlike Python, is strictly statically typed. Every operation must be performed over operands of the same type; implicit conversions are not allowed, hence explicit casts and type conversions are used widely.
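A small sketch of what this means in practice, assuming TensorFlow's default (non NumPy-like) type promotion: adding an int64 tensor and an int32 tensor is rejected, and an explicit tf.cast is needed. The variable names are made up for the example.

```python
import tensorflow as tf

# A line of the input file, as the dataset would produce it.
line = tf.constant("199")
number = tf.strings.to_number(line, out_type=tf.int64)

offset = tf.constant(1, dtype=tf.int32)

try:
    # int64 + int32: no implicit conversion is performed.
    number + offset
    mismatch_rejected = False
except (TypeError, tf.errors.InvalidArgumentError):
    mismatch_rejected = True

# The explicit cast makes both operands int64.
total = number + tf.cast(offset, tf.int64)
```

The exact exception type depends on whether the code runs eagerly or inside a graph, which is why the sketch catches both.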
class IncreasesCounter(tf.Module):
    """Stateful counter. Counts the number of "increases"."""

    def __init__(self):
        super().__init__()
        self._count = tf.Variable(0, trainable=False, dtype=tf.int64)
        self._prev = tf.Variable(0, trainable=False, dtype=tf.int64)

    @tf.function
    def __call__(self, dataset: tf.data.Dataset) -> tf.Tensor:
        """
        Args:
            dataset: the dataset containing the ordered sequence of numbers
                to process.
        Returns:
            The number of increases. tf.Tensor, dtype=tf.int64
        """
        self._prev.assign(next(iter(dataset.take(1))))
        for number in dataset.skip(1):
            if tf.greater(number, self._prev):
                self._count.assign_add(1)
            self._prev.assign(number)
        return self._count
The IncreasesCounter class is a complete TensorFlow program. In the __init__ method we declare and initialize the state variables, and in the __call__ method we implement the logic required by the puzzle. Note that to be a TensorFlow program, the method must be decorated with @tf.function. A few peculiarities are worth noting:
- The assignment must be performed using the assign method. Using the = operator overwrites the Python variable and does not perform the assignment operation in the graph!
- Comparisons like > are better written using their TensorFlow equivalent (e.g. tf.greater). Writing > also works (the operator is overloaded for tensors), but I find it less idiomatic, and I recommend not relying on automatic conversions so that you keep full control.
- Extracting the first element from the dataset is a bit “strange”, since it requires us to:
  - Create a dataset with a single element (take(1)).
  - Create an iterator from that dataset object (iter).
  - Call next over the iterator to extract the element.
- To skip the element assigned to the self._prev variable before the loop, we need to create another dataset that starts from the second element, by calling skip(1) on the dataset object.
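The take/iter/next and skip steps listed above can be tried in isolation. The sketch below runs eagerly on a tiny in-memory dataset built with from_tensor_slices, which is an assumption made to keep the example self-contained (the article reads the numbers from a file).

```python
import tensorflow as tf

numbers = tf.data.Dataset.from_tensor_slices(
    tf.constant([199, 200, 208], dtype=tf.int64)
)

# take(1) builds a one-element dataset, iter creates an iterator over it,
# next extracts the element.
first = next(iter(numbers.take(1)))

# skip(1) builds another dataset that starts from the second element.
rest = [int(value) for value in numbers.skip(1)]
```

Neither take nor skip modifies the original dataset; each returns a new dataset object.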
counter = IncreasesCounter()
increases = counter(dataset)
tf.print("[part one] increases: ", increases)
Just create an instance of IncreasesCounter and call it over the dataset previously created. Note that we use tf.print and not print: tf.print is the operation to use because it also works in graph mode, while Python’s print is not a graph operation and runs only while a tf.function is being traced.
The execution gives the correct result :) and this brings us to part 2.
Day 1: Sonar Sweep: part two
TLDR: instead of considering single values, consider a three-number sliding window. The example above now becomes:
199  A
200  A B
208  A B C
210    B C D
200  E   C D
207  E F   D
240  E F G
269    F G H
260      G H
263        H
Note: the input doesn’t change; the letters A, B, C, and so on are here only to visualize the sliding windows. The goal is to sum all the numbers in a window (e.g. 199+200+208 for window A) and compare the sum with the sum of the previous sliding window.
A: 607 (N/A - no previous sum)
B: 618 (increased)
C: 618 (no change)
D: 617 (decreased)
E: 647 (increased)
F: 716 (increased)
G: 769 (increased)
H: 792 (increased)
In this case the answer is 5.
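As a plain-Python reference for the numbers above (an illustration only, not the TensorFlow solution), the window sums and the resulting count can be reproduced as follows. The helper name `window_sums` is made up for the example.

```python
from typing import List, Sequence


def window_sums(numbers: Sequence[int], size: int = 3) -> List[int]:
    """Sum every complete sliding window of `size` consecutive numbers."""
    return [sum(numbers[i : i + size]) for i in range(len(numbers) - size + 1)]


# Sample data from the puzzle text.
sample = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]
sums = window_sums(sample)

# Compare every window sum with the previous one.
window_increases = sum(1 for prev, cur in zip(sums, sums[1:]) if cur > prev)
```

The list `sums` holds the eight values labelled A to H, and `window_increases` is 5.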
Design phase - part two
We already have a TensorFlow program that can detect increases of adjacent numbers in a dataset. So, we can just feed the program a different input to get the correct result.
TensorFlow offers us several functions for working with datasets. As we’ve seen, it is possible to skip elements and create a new dataset, apply a transformation function with map, and so on. Thus, we can solve this challenge by creating a dataset of the resulting sums of the various sliding windows.
Input pipeline - part two
The idea is to create 3 different datasets, each shifted by 1 element with respect to the previous one, create batches of 3 elements (the sliding windows), and then sum all the values in each batch (window).
The 3 datasets can be merged by interleaving their values, using the order 1,2,3 and so on. This order means: pick the i-th element from the first dataset, then pick the i-th element from the second dataset, then pick the i-th element from the third dataset, then increment i. Repeat until the datasets run out of elements.
datasets = [dataset, dataset.skip(1), dataset.skip(2)]
for idx, dataset in enumerate(datasets):
    datasets[idx] = dataset.batch(3, drop_remainder=True).map(tf.reduce_sum)

interleaved_dataset = tf.data.Dataset.choose_from_datasets(
    datasets, tf.data.Dataset.range(3).repeat()
)
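To check that the interleaving really produces the window sums in order, the same pipeline can be run on the sample data. This sketch assumes TensorFlow 2.7 or newer (where tf.data.Dataset.choose_from_datasets is available) and uses an in-memory dataset instead of the input file; stop_on_empty_dataset is passed explicitly to make the stopping condition visible.

```python
import tensorflow as tf

sample = tf.constant(
    [199, 200, 208, 210, 200, 207, 240, 269, 260, 263], dtype=tf.int64
)
numbers = tf.data.Dataset.from_tensor_slices(sample)

# Three views of the same sequence, each shifted by one more element.
shifted = [numbers, numbers.skip(1), numbers.skip(2)]

# Non-overlapping batches of 3 in each view, reduced to their sum.
summed = [ds.batch(3, drop_remainder=True).map(tf.reduce_sum) for ds in shifted]

# Pick from the views in the order 0, 1, 2, 0, 1, 2, ... and stop as soon
# as the selected view has no more elements.
interleaved = tf.data.Dataset.choose_from_datasets(
    summed,
    tf.data.Dataset.range(3).repeat(),
    stop_on_empty_dataset=True,
)

sums = [int(value) for value in interleaved]
```

Each view contributes every third window (A, D, G from the first; B, E, H from the second; C, F from the third), and the round-robin selection puts them back in the original order.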
Execution - part two
IncreasesCounter has a state, hence we can’t reuse the previous instance, because we haven’t added a method to reset the state. Thus, we need to create a new instance and pass it the interleaved_dataset to get the correct result.
counter = IncreasesCounter()
increases = counter(interleaved_dataset)
tf.print("[part two] increases: ", increases)
The first puzzle is completely solved :)
Solving the AoC puzzles with TensorFlow is not just fun (come on, it’s a nice challenge thinking about all these nuances :)); it can also be a good way to design very efficient solutions. I haven’t yet said a word about the advantages of these implementations, but there are many:
- The solution can run on any hardware supported by TensorFlow. If you have a supported GPU, TensorFlow can place the supported operations on it.
- Any operation that can be executed in parallel is automatically parallelized by the framework.
- These solutions are language agnostic. Yes, we designed and executed them in Python, but we could export them as SavedModel, and re-use the same logic in any other programming language since all we need is the TensorFlow C runtime. For example, a SavedModel of this program can be executed in Go using tfgo.
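As a rough sketch of the export step, the snippet below saves a small module as a SavedModel and loads it back. To keep the example short it does not export the stateful IncreasesCounter from this article: it uses a hypothetical, stateless `VectorizedCounter` that compares a whole tensor with a shifted copy of itself, which is a different technique from the dataset loop shown earlier.

```python
import tempfile

import tensorflow as tf


class VectorizedCounter(tf.Module):
    """Hypothetical stateless counter that works on a whole 1-D tensor."""

    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.int64)])
    def count(self, numbers):
        # Compare every element (from the second on) with its predecessor.
        increased = tf.greater(numbers[1:], numbers[:-1])
        return tf.reduce_sum(tf.cast(increased, tf.int64))


sample = tf.constant(
    [199, 200, 208, 210, 200, 207, 240, 269, 260, 263], dtype=tf.int64
)

with tempfile.TemporaryDirectory() as export_dir:
    # Export the program, then load it back as a generic SavedModel object.
    tf.saved_model.save(VectorizedCounter(), export_dir)
    restored = tf.saved_model.load(export_dir)
    restored_increases = int(restored.count(sample))
```

The input_signature fixes the type and rank of the argument, so the exported function has a well-defined interface for any runtime that loads it.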
I’m doing this for fun (and I’m having fun, really), so expect another article for day 2 coming soon!
For any feedback or comment, please use the Disqus form below - thanks!
PS: I’m posting all the solutions also on GitHub, you can find them here: https://github.com/galeone/tf-aoc.