Analyzing tf.function to discover AutoGraph strengths and subtleties - part 3


In part 1 we learned how to convert TensorFlow 1.x code to its eager version, the eager version to its graph representation, and faced the problems that arise when working with functions that create a state.

In part 2 we learned that tf.function creates a new graph for every different input value if the input is not a tf.Tensor object but a Python native type, and how this can slow down (or speed up, if used correctly) the execution. Moreover, we highlighted the differences between the source code generated by tf.autograph and what actually happens when AutoGraph is used through tf.function.

In this third and last part, we analyze what happens when tf.function is used to convert a function that contains “complex” Python constructs in its body. Should we design functions thinking about how they are going to be converted?

AutoGraph capabilities and limitations

In the TensorFlow repository, in the python/autograph folder, we can find a document that explains the capabilities and limitations of the AutoGraph module, together with a list of the Python constructs it is able to convert.

The table in the section “Python Language Support Status” contains all the Python constructs that AutoGraph explicitly supports, plans to support, or won’t support. Among them, we can find the widely used while, for, and if statements, the Python built-ins print, len, and range, and the iterator construct.
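As a quick illustration of one construct from that table, here is a minimal sketch of a while loop whose condition depends on a tf.Tensor. The function name count_down is made up for this example, and the condition is written with tf.math.greater rather than the > operator.

```python
import tensorflow as tf


@tf.function
def count_down(n):
    # `steps` is a tensor, so it can be carried through the graph loop.
    steps = tf.constant(0)
    # The condition is a tensor, so AutoGraph stages this `while` as a graph loop.
    while tf.math.greater(n, 0):
        n = n - 1
        steps = steps + 1
    return steps


result = count_down(tf.constant(3))
```

Calling count_down with a tf.Tensor keeps the whole loop inside the graph; result holds the number of iterations performed.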

In the next sections, various Python functions that use these constructs are analyzed, to understand whether the function body gets converted as we expect or whether we have to design the functions with the graph conversion in mind.

if … else

Here’s the function we are going to analyze:

@tf.function
def if_else(a, b):
  if a > b:
    tf.print("a > b", a, b)
  else:
    tf.print("a <= b", a, b)

It’s trivial: when a is greater than b it prints a > b followed by the values of a and b; otherwise it prints a <= b and their values.

Step 1: graph conversion

As seen in the previous articles, the tf.autograph package can be used to inspect the result of the graph conversion.

print(tf.autograph.to_code(if_else.python_function))

The generated code is:

def tf__if_else(a, b):
    cond = a > b

    def get_state():
        return ()

    def set_state(_):
        pass

    def if_true():
        ag__.converted_call(
            "print",
            tf,
            ag__.ConversionOptions(
                recursive=True,
                force_conversion=False,
                optional_features=(),
                internal_convert_user_code=True,
            ),
            ("a > b", a, b),
            None,
        )
        return ag__.match_staging_level(1, cond)

    def if_false():
        ag__.converted_call(
            "print",
            tf,
            ag__.ConversionOptions(
                recursive=True,
                force_conversion=False,
                optional_features=(),
                internal_convert_user_code=True,
            ),
            ("a <= b", a, b),
            None,
        )
        return ag__.match_staging_level(1, cond)

    ag__.if_stmt(cond, if_true, if_false, get_state, set_state)

The conversion is trivial too: if_stmt maps, more or less, to the tf.cond function; the first parameter is the condition to check, the second is the branch to take when the condition is True, the third is the branch to take otherwise. The get_state and set_state functions basically do nothing and we can safely ignore them.
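To make the mapping to tf.cond concrete, here is a hand-written sketch of the same function built directly on tf.cond. It is not what AutoGraph emits: the name if_else_cond and the boolean returned by each branch are additions made for this example, since tf.cond needs both branches to return the same structure.

```python
import tensorflow as tf


@tf.function
def if_else_cond(a, b):
    def if_true():
        tf.print("a > b", a, b)
        return tf.constant(True)

    def if_false():
        tf.print("a <= b", a, b)
        return tf.constant(False)

    # tf.cond(pred, true_fn, false_fn): both branches return a boolean scalar.
    return tf.cond(tf.math.greater(a, b), if_true, if_false)


x = tf.constant(1)
took_true_branch = if_else_cond(x, x)
```

The returned tensor only records which branch ran, which makes the behavior easy to check without reading the printed output.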

Step 2: execution

As seen in part 2, tf.function by design does not box Python native types; therefore we use a tf.Tensor produced by a tf.constant operation as input.

x = tf.constant(1)
if_else(x, x)

As expected, the output is: a <= b 1 1.
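The remark about boxing can be checked with a Python-side counter. The sketch below assumes nothing beyond tf.function itself; the names identity and trace_count are made up for the example. The counter is a plain Python side effect, so it is incremented only while the function body is being traced.

```python
import tensorflow as tf

trace_count = 0  # incremented only while tf.function traces the Python body


@tf.function
def identity(value):
    global trace_count
    trace_count += 1  # Python side effect: runs at trace time only
    return value


identity(1)
identity(2)  # a different Python int: traced again
traces_after_python_ints = trace_count

identity(tf.constant(1))
identity(tf.constant(2))  # same dtype and shape: the graph is reused
traces_after_tensors = trace_count
```

Two different Python integers cost two traces, while the two tf.Tensor inputs share a single one.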

if … elif … else

Let’s change the function a little bit, adding an elif statement. The function now is:

@tf.function
def if_elif(a, b):
  if a > b:
    tf.print("a > b", a, b)
  elif a == b:
    tf.print("a == b", a, b)
  else:
    tf.print("a < b", a, b)

Step 1: graph conversion

The generated function, with the tf.print conversion and the (get|set)_state function definitions removed, is:

def tf__if_elif(a, b):
    cond_1 = a > b

    def if_true_1():
        # tf.print("a > b", a, b)
        return ag__.match_staging_level(1, cond_1)

    def if_false_1():
        cond = a == b

        def if_true():
            # tf.print("a == b", a, b)
            return ag__.match_staging_level(1, cond)

        def if_false():
            # tf.print("a < b", a, b)
            return ag__.match_staging_level(1, cond)

        ag__.if_stmt(cond, if_true, if_false, get_state, set_state)
        return ag__.match_staging_level(1, cond_1)

    ag__.if_stmt(cond_1, if_true_1, if_false_1, get_state_1, set_state_1)

The conversion seems correct: two nested tf.cond calls. The inner tf.cond is defined inside the false branch of the outer one. The outer tf.cond checks if a > b, and if it is True then it prints a > b, otherwise it executes the if_false_1 branch that contains the inner tf.cond.

The inner tf.cond has the equality condition cond = a == b to verify; if it holds, it prints a == b, otherwise it prints a < b.

Step 2: execution

x = tf.constant(1)
if_elif(x, x)

Executing it, we expect to see a == b 1 1 since this is the truth. However, the output is a < b 1 1. WHAT?

OK then, debug time.


Update (14 Sept 2019): as Raphael Meudec pointed out on Twitter, this behavior has been changed in TensorFlow 2.0-rc0 and it works as expected. However, the lessons presented later in the article are still valid: following them helps you write idiomatic TensorFlow 2.0 code.


Step 3: debugging

The AutoGraph representation looks correct. Moreover, we can call the non-converted function to see if everything goes as expected in eager mode.

x = tf.constant(1)
if_elif.python_function(x, x)

In eager mode the output is correct: a == b 1 1. So we expect to see the same output when we feed the function with two distinct tf.Tensor objects that hold the same value:

x, y = tf.constant(1), tf.constant(1)
if_elif.python_function(x, y)

Surprise! The output is a < b 1 1. What’s going on?

Lesson 1: not all operators are created equal

This lesson is not about AutoGraph or tf.function but is about tf.Tensor.

This “weird” behavior, which also happens when eager mode is enabled, is due to the different way the __eq__ operator of the tf.Tensor objects has been overridden.

There is a question on StackOverflow and a related GitHub issue about this. In short: the __eq__ operator has been overridden, but it does not use tf.equal to check for Tensor equality; it just checks for the Python variable identity (if you are familiar with the Java programming language, this is precisely like the == operator used on string objects). The reason is that the tf.Tensor object needs to be hashable, since it is used everywhere in the TensorFlow codebase as a key in dict objects.
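The semantics described above can be mimicked in plain Python. The class below is a hypothetical stand-in written for this illustration, not TensorFlow code: it compares by identity and hashes by id, so two objects holding the same value are never equal but both remain usable as dict keys.

```python
class IdentityEq:
    """Hypothetical tensor-like object; it only mimics the described semantics."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # Identity comparison: the wrapped values are never inspected.
        return self is other

    def __hash__(self):
        # Hash consistent with the identity-based equality above.
        return id(self)


x = IdentityEq(1)
y = IdentityEq(1)

same_object_equal = x == x
different_objects_equal = x == y
lookup = {x: "first", y: "second"}  # two distinct keys despite the equal values
```

This is why if_elif(x, x) and if_elif(x, y) can behave differently in eager mode even though x and y hold the same number.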

OK then, to solve this we must not rely upon the __eq__ operator but use tf.equal to check if the equality holds.
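A minimal sketch of the difference between identity and value equality follows. Since the behavior of == on tf.Tensor depends on the TensorFlow version, the sketch deliberately avoids evaluating x == y and compares only the Python identity check with tf.math.equal.

```python
import tensorflow as tf

x = tf.constant(1)
y = tf.constant(1)

# Identity: x and y are two distinct Python objects.
same_object = x is y

# Value equality, computed by a TensorFlow operation; the result is a boolean tf.Tensor.
same_value = tf.math.equal(x, y)
```

Here same_object is a Python bool, while same_value is a tensor that can be used as the condition of an if statement inside a tf.function-decorated function.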

However, something should still sound strange: why, when invoking the graph-converted function with the same tf.Tensor x for both arguments, does the execution produce a < b 1 1 instead of a == b 1 1, as it happens in eager execution?

Lesson 2: how AutoGraph (doesn’t) convert the operators

So far we supposed that AutoGraph is able to translate not only the if, elif, and else statements to the graph equivalent, but also the Python built-in operators like __eq__, __gt__, and __lt__. In practice, this conversion (still?) does not happen at all.

In the previously converted graph code, the two conditions are expressed as a > b and a == b and not as function calls to AutoGraph-converted functions (ag__.converted_call(...)).

In practice, what happens is that the cond is always False. We can verify this assertion by adding an additional elif to the previous function and calling it again.

@tf.function
def if_elif(a, b):
  if a > b:
    tf.print("a > b", a, b)
  elif a == b:
    tf.print("a == b", a, b)
  elif a < b:
    tf.print("a < b", a, b)
  else:
    tf.print("wat")

x = tf.constant(1)
if_elif(x, x)

Output: wat.

Hurray?

Lesson 3: how to write a function

To have the very same behavior in both eager and graph execution we have to know that:

  1. The semantics of the operations matter.
  2. There are operators that have been overridden with semantics that differ from the most natural ones, common in Python.
  3. AutoGraph converts Python statements naturally (if, elif, …) but it requires some extra care when designing a function that is going to be tf.function decorated.

In practice, and this is the most important lesson, use the TensorFlow operators explicitly everywhere (in the end, the Graph is still present, and we are building it!).

Thus, we can write a function that behaves correctly both in eager mode and when graph-converted by using the appropriate tf. methods.

@tf.function
def if_elif(a, b):
  if tf.math.greater(a, b):
    tf.print("a > b", a, b)
  elif tf.math.equal(a, b):
    tf.print("a == b", a, b)
  elif tf.math.less(a, b):
    tf.print("a < b", a, b)
  else:
    tf.print("wat")

The generated graph code now looks like this (long parts removed for clarity):

def tf__if_elif(a, b):
    cond_2 = ag__.converted_call("greater", ...)  # a > b

    def if_true_2():
        ag__.converted_call("print", ...)  # tf.print a > b
        return ag__.match_staging_level(1, cond_2)

    def if_false_2():
        cond_1 = ag__.converted_call("equal", ...)  # tf.math.equal

        def if_true_1():
            ag__.converted_call("print", ...)  # tf.print a == b
            return ag__.match_staging_level(1, cond_1)

        def if_false_1():
            cond = ag__.converted_call("less", ...)  # a < b

            def if_true():
                ag__.converted_call("print", ...)  # tf.print a < b
                return ag__.match_staging_level(1, cond)

            def if_false():
                ag__.converted_call("print", ...)  # tf.print wat
                return ag__.match_staging_level(1, cond)

            ag__.if_stmt(cond, if_true, if_false, get_state, set_state)
            return ag__.match_staging_level(1, cond_1)

        ag__.if_stmt(cond_1, if_true_1, if_false_1, get_state_1, set_state_1)
        return ag__.match_staging_level(1, cond_2)

    ag__.if_stmt(cond_2, if_true_2, if_false_2, get_state_2, set_state_2)

Now that every single part of the function has been converted (note the ag__.converted_call everywhere), the function works as we want, also when it is converted to its graph representation.
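One way to check this without reading the printed output is to return a value from each branch. The sketch below is a variation made for this purpose: the function name compare and the returned strings are choices of this example, and the last elif is folded into the else branch for brevity.

```python
import tensorflow as tf


@tf.function
def compare(a, b):
    if tf.math.greater(a, b):
        result = tf.constant("a > b")
    elif tf.math.equal(a, b):
        result = tf.constant("a == b")
    else:
        result = tf.constant("a < b")
    return result


# Two distinct tf.Tensor objects holding the same value.
x, y = tf.constant(1), tf.constant(1)
graph_result = compare(x, y).numpy()
eager_result = compare.python_function(x, y).numpy()
```

Both results are the byte string b"a == b", so the graph-converted function and the plain Python function agree.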

for … in range

Following the previous 3 lessons, writing a function that uses a for loop is trivial. To be entirely sure that the code is correctly graph-converted, we can design the function by using the TensorFlow tf. methods to help the conversion. So, for a simple function that sums the numbers from 0 to X-1, the correct way of designing it is to use:

  1. An external tf.Variable, since the function creates a state and from part 1 we know how to deal with it.
  2. tf.range instead of range, since tf.range exists and therefore it is just better to use it.

x = tf.Variable(1)

@tf.function
def test_for(upto):
  for i in tf.range(upto):
    x.assign_add(i)

x.assign(tf.constant(0))
test_for(tf.constant(5))
print("x value: ", x.numpy())

The value of the x variable is 10, as expected.

The reader is invited to convert the function to its graph representation and check if every statement has been correctly converted.
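A self-contained sketch of that exercise follows. The names total and sum_upto are made up for this example. The text returned by tf.autograph.to_code differs between TensorFlow versions, so no expected listing is given.

```python
import tensorflow as tf

total = tf.Variable(0)  # state lives outside the function, as discussed in part 1


@tf.function
def sum_upto(upto):
    for i in tf.range(upto):
        total.assign_add(i)


sum_upto(tf.constant(5))
final_value = int(total.numpy())  # 0 + 1 + 2 + 3 + 4

# Inspect the converted source; its exact content depends on the TensorFlow version.
generated_code = tf.autograph.to_code(sum_upto.python_function)
print(generated_code)
```

In the printed source, look at how the for statement and the tf.range call have been rewritten.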

Question (please feel free to answer in the comment section!): what happens if the line x.assign_add(i) is replaced by x = x + i?

Conclusions

Writing functions that work correctly both in eager mode and in their graph-converted representation requires knowing some subtleties that have been highlighted in this three-part article. To summarize them:

  • Functions that create a state need a dedicated design, since in eager mode they just work, while when converted the stateful objects can create problems. (part 1)
  • tf.function does not perform the boxing of the Python native types, and this can slow down the execution a lot (part 2); use tf.Tensor whenever possible!
  • tf.print and print are different objects; there is a clear distinction between the first call (AutoGraph + function execution + tracing) and any other call of the graph-converted function (part 2).
  • The operator overloading of tf.Tensor has its own peculiarities. In order to be 100% confident of your function design, and to make it work also when it is graph-converted, I highly recommend using the TensorFlow operators explicitly (call tf.equal(a, b) instead of a == b and so on).

Announcement

The article is finished, but I hope to say something pleasing by announcing that I’m authoring my first book about TensorFlow 2.0 and Neural Networks!

Hands-On Neural Networks with TensorFlow 2.0

Understand TensorFlow, from static graph to eager execution, and design neural networks

The book is divided into two parts: the first part is more theoretical and is about machine learning and neural networks, with a focus on the intuitive idea behind the presented concepts. The second part, which is the main topic of the book, is about the TensorFlow architecture (from 1.x to 2.0) followed by the implementation of several neural-network-based solutions to challenging machine learning problems, all using TensorFlow 2.0.

