Understanding Tensorflow using Go

Tensorflow is not a Machine Learning specific library; instead, it is a general purpose computation library that represents computations with graphs. Its core is implemented in C++ and there are also bindings for different languages. The bindings for the Go programming language, unlike the Python ones, are a useful tool not only for using Tensorflow in Go but also for understanding how Tensorflow is implemented under the hood.

The bindings

Officially, the Tensorflow developers released:

  • The C++ source code: the real Tensorflow core, where the high & low level operations are concretely implemented.
  • The Python bindings & the Python library: the bindings are automatically generated from the C++ implementation, so that we can use Python to invoke C++ functions (a similar approach is used by the core of numpy, which is implemented in C). The library, moreover, combines calls to the bindings in order to define the higher level API that everyone who uses Tensorflow knows well.
  • The Java bindings
  • The Go bindings

Being a Gopher and not a Java lover, I started looking at the Go bindings in order to understand what kind of tasks they were created for.

The Go bindings

Figure: the Gopher (created by Takuya Ueda (@tenntenn), licensed under the Creative Commons 3.0 Attributions license) overlapping the Tensorflow logo.

The first thing to note is that the Go API, by the maintainers’ own admission, lacks Variable support: this API is designed for using trained models, not for training models from scratch. This is clearly stated in the Installing Tensorflow for Go guide:

TensorFlow provides APIs for use in Go programs. These APIs are particularly well-suited to loading models created in Python and executing them within a Go application.

If you’re not interested in training ML models: hooray! If, instead, you are interested in training models, here’s a piece of advice:

Be a real gopher, keep it simple! Use Python to define & train models; you can always load the trained models and use them with Go later!

In short: the Go bindings can be used to import and define constant graphs, where constant, in this context, means that there’s no training process involved and thus no trainable variables.

Let’s now start diving into Tensorflow using Go: let’s create our first application.

In the following, I assume that the reader has a Go environment ready and the Tensorflow bindings compiled and installed as explained in the README.

Understand Tensorflow structure

Let’s repeat what Tensorflow is (taken from the Tensorflow website; the emphasis is mine):

TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them.

We can think of Tensorflow as a descriptive language, a bit like SQL, in which you describe what you want and let the underlying engine (the database) parse your query, check for syntactic and semantic errors, convert it to its private representation, optimize it and compute the results: all this to give you the correct results.

Therefore, what we really do when we use any of the available APIs is to describe a graph: the evaluation of the graph starts when we place it into a Session and explicitly decide to Run the graph within the Session.
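The "describe first, run later" idea can be sketched in plain Go without Tensorflow at all. The snippet below is only a toy model: the Node type and the Constant, Add and Run helpers are hypothetical names invented for this illustration and are not part of the Tensorflow bindings.

```go
package main

import "fmt"

// Node is a hypothetical graph node: it only *describes* a computation.
type Node struct {
	Name   string
	Inputs []*Node
	// Eval computes the node's value from the values of its inputs.
	Eval func(in []int) int
}

// Constant returns a node that always yields v.
func Constant(name string, v int) *Node {
	return &Node{Name: name, Eval: func([]int) int { return v }}
}

// Add returns a node describing a + b; nothing is computed here.
func Add(name string, a, b *Node) *Node {
	return &Node{
		Name:   name,
		Inputs: []*Node{a, b},
		Eval:   func(in []int) int { return in[0] + in[1] },
	}
}

// Run plays the role of a session run: only now the requested node,
// and everything it depends on, is actually evaluated.
func Run(n *Node) int {
	in := make([]int, len(n.Inputs))
	for i, dep := range n.Inputs {
		in[i] = Run(dep)
	}
	return n.Eval(in)
}

func main() {
	// Describe the graph: no addition is performed on this line.
	sum := Add("sum", Constant("a", 1), Constant("b", 2))
	// Evaluate it: the addition happens here.
	fmt.Println(Run(sum))
}
```

Building the graph and evaluating it are two separate phases, which is the same split that exists between defining operations and calling Run within a Session.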

Knowing this, let’s try to define a computational graph and evaluate it within a Session. The API documentation gives us a pretty clear list of the available methods within the packages tensorflow (shorthanded tf) & op.

As we can see, these two packages contain everything we need to define and evaluate a graph.

The former contains the functions to construct the basic “empty” structures like the Graph itself; the latter is the most important package, since it contains the bindings automatically generated from the C++ implementation.

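Now, suppose that we want to compute the matrix multiplication between $A$ and $x$, where $A = \begin{pmatrix} 1 & 2 \\ -1 & -2 \end{pmatrix}$ and $x = \begin{pmatrix} 10 \\ 100 \end{pmatrix}$.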

I suppose that the reader is already familiar with the tensorflow graph definition idea and knows what placeholders are and how they work. The code below is the first attempt that a Tensorflow Python bindings user would make. Let’s call this file attempt1.go

package main

import (
	"fmt"

	tf "github.com/tensorflow/tensorflow/tensorflow/go"
	"github.com/tensorflow/tensorflow/tensorflow/go/op"
)

func main() {
	// Let's describe what we want: create the graph

	// We want to define two placeholders to fill at runtime
	// the first placeholder A will be a [2, 2] tensor of integers
	// the second placeholder x will be a [2, 1] tensor of integers

	// Then we want to compute Y = Ax

	// Create the first node of the graph: an empty node, the root of our graph
	root := op.NewScope()

	// Define the 2 placeholders
	A := op.Placeholder(root, tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 2)))
	x := op.Placeholder(root, tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 1)))

	// Define the operation node that accepts A & x as inputs
	product := op.MatMul(root, A, x)

	// Every time we passed a `Scope` to an operation, we placed that
	// operation **under** that scope.
	// As you can see, we have an empty scope (created with NewScope): the empty scope
	// is the root of our graph and thus we denote it with "/".

	// Now we ask tensorflow to build the graph from our definition.
	// The concrete graph is created from the "abstract" graph we defined
	// using the combination of scope and op.

	graph, err := root.Finalize()
	if err != nil {
		// It's useless trying to handle this error in any way:
		// if we defined the graph wrongly we have to manually fix the definition.

		// It's like a SQL query: if the query is not syntactically valid
		// we have to rewrite it
		panic(err.Error())
	}

	// If here: our graph is syntactically valid.
	// We can now place it within a Session and execute it.

	var sess *tf.Session
	sess, err = tf.NewSession(graph, &tf.SessionOptions{})
	if err != nil {
		panic(err.Error())
	}

	// In order to use placeholders, we have to create the Tensors
	// containing the values to feed into the network
	var matrix, column *tf.Tensor

	// A = [ [1, 2], [-1, -2] ]
	if matrix, err = tf.NewTensor([2][2]int64{{1, 2}, {-1, -2}}); err != nil {
		panic(err.Error())
	}
	// x = [ [10], [100] ]
	if column, err = tf.NewTensor([2][1]int64{{10}, {100}}); err != nil {
		panic(err.Error())
	}

	// Feed the placeholders and fetch the value of the product node
	var results []*tf.Tensor
	if results, err = sess.Run(map[tf.Output]*tf.Tensor{
		A: matrix,
		x: column,
	}, []tf.Output{product}, nil); err != nil {
		panic(err.Error())
	}
	for _, result := range results {
		// Value() returns an interface{}: convert it to the expected type
		fmt.Println(result.Value().([][]int64))
	}
}

The code is completely commented and I invite the reader to read every single comment.

Now, the Tensorflow-Python user expects that this code compiles and works fine. Let’s see if he’s right:

go run attempt1.go

Here’s what he got:

panic: failed to add operation "Placeholder": Duplicate node name in graph: 'Placeholder'

Wait: what’s going on here? Apparently, there are 2 “Placeholder” operations with the same name, “Placeholder”.

Lesson 1: node IDs

The Python API generates a different node every time we invoke a method that defines an operation, no matter whether that method has already been called before. In fact, the following code prints 3 without problems.

import tensorflow as tf
a = tf.placeholder(tf.int32, shape=())
b = tf.placeholder(tf.int32, shape=())
add = tf.add(a,b)
sess = tf.InteractiveSession()
print(sess.run(add, feed_dict={a: 1,b: 2}))

We can verify that this program creates two different nodes by printing the placeholder names: print(a.name, b.name) produces Placeholder:0 Placeholder_1:0. Thus, the b placeholder is Placeholder_1:0 whilst the a placeholder is Placeholder:0.

In Go, instead, the previous program fails because A and x are both called Placeholder. We can conclude that:

The Go API does not automatically generate new names every time we invoke a function that defines an operation: the operation name is thus fixed and we can’t change it.

Question time:

  • What have we learned about the Tensorflow architecture? Every node in a graph must have a unique name. Every node is identified by its name.
  • Is the name of the node the same as the name of the operation that defined it? Yes, or better, not completely: the operation name is just the last part of the node name.
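The uniqueness rule can be illustrated with a small, Tensorflow-free sketch. The Graph type and its AddOperation method below are hypothetical stand-ins written for this example; they only imitate the behavior observed above, where a node is never renamed and a repeated name is rejected.

```go
package main

import "fmt"

// Graph is a hypothetical stand-in for a graph that indexes nodes by name.
type Graph struct {
	nodes map[string]string // node name -> operation type
}

// NewGraph returns an empty graph.
func NewGraph() *Graph {
	return &Graph{nodes: map[string]string{}}
}

// AddOperation registers a node under the given name. It never renames
// the node, so reusing a name is reported as an error.
func (g *Graph) AddOperation(opType, name string) error {
	if _, exists := g.nodes[name]; exists {
		return fmt.Errorf("failed to add operation %q: Duplicate node name in graph: '%s'", opType, name)
	}
	g.nodes[name] = opType
	return nil
}

func main() {
	g := NewGraph()
	// The first placeholder is accepted.
	fmt.Println(g.AddOperation("Placeholder", "Placeholder"))
	// The second one reuses the same name and is rejected.
	fmt.Println(g.AddOperation("Placeholder", "Placeholder"))
}
```

The first call prints a nil error, while the second reports the duplicate, mirroring the panic message seen when running attempt1.go.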

To clarify the second answer, let’s fix the duplicate node name problem.

Lesson 2: Scoping

As we just saw, the Python API automatically creates a new name every time an operation is defined. The C++ API exposes an analogous mechanism through the method WithOpName of the class Scope (the Python API ships its own name-uniquifying logic rather than calling it). Here are the method documentation and its signature, taken from scope.h:

/// Return a new scope. All ops created within the returned scope will have
/// names of the form <name>/<op_name>[_<suffix].
Scope WithOpName(const string& op_name) const;

We can note that this method, used to name the nodes, returns a Scope; thus a node name is, in reality, a Scope. A Scope is a complete path from the root / (empty graph) to the op_name.

The method WithOpName adds a suffix _<suffix> (where <suffix> is a counter) when we try to add a node that has the same path from / to op_name and thus it would be a duplicate node under the same scope.

Knowing this, to solve the issue of duplicated node names we expect to find the method WithOpName in the type Scope. Sadly, this method is not present.

Instead, looking at the documentation of the type Scope we can see that the only method that returns a new Scope is SubScope(namespace string).

Quoting the documentation:

SubScope returns a new Scope which will cause all operations added to the graph to be namespaced with ‘namespace’. If namespace collides with an existing namespace within the scope, then a suffix will be added.

The collision management using a suffix is different from the C++ WithOpName: WithOpName adds the suffix after the operation name, within the same scope (thus Placeholder becomes Placeholder_1) while the Go SubScope adds the suffix after the scope name.

This difference generates completely different graphs: the nodes are placed under different scopes, but the graphs are computationally equivalent.
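The two collision strategies can be compared with a toy name generator. The Namer type below is a hypothetical helper written for this illustration; it is not how Tensorflow implements naming, it just reproduces the visible effect of putting the suffix on the operation name versus on the scope name.

```go
package main

import "fmt"

// Namer hands out unique names by appending _<counter> to repeated bases.
type Namer struct {
	seen map[string]int
}

// NewNamer returns a Namer that has not seen any name yet.
func NewNamer() *Namer {
	return &Namer{seen: map[string]int{}}
}

// Unique returns base the first time, then base_1, base_2, and so on.
func (n *Namer) Unique(base string) string {
	count := n.seen[base]
	n.seen[base] = count + 1
	if count == 0 {
		return base
	}
	return fmt.Sprintf("%s_%d", base, count)
}

func main() {
	// WithOpName style: the suffix lands on the operation name.
	ops := NewNamer()
	fmt.Println(ops.Unique("Placeholder"), ops.Unique("Placeholder"))

	// SubScope style: the suffix lands on the scope name, while the
	// operation name is left untouched.
	scopes := NewNamer()
	fmt.Println(scopes.Unique("input")+"/Placeholder", scopes.Unique("input")+"/Placeholder")
}
```

In both cases the two nodes end up with distinct full paths, which is all the graph requires.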

Let’s change the placeholder definitions in order to define two different nodes; moreover, let’s print the names of the resulting nodes.

Let’s create the file attempt2.go by replacing the lines

	A := op.Placeholder(root, tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 2)))
	x := op.Placeholder(root, tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 1)))

with

	// define 2 subscopes of the root scope, both called "input". In this
	// way we expect to have an input/ and an input_1/ scope under the root scope
	A := op.Placeholder(root.SubScope("input"), tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 2)))
	x := op.Placeholder(root.SubScope("input"), tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 1)))
	fmt.Println(A.Op.Name(), x.Op.Name())

Usual compile & run: go run attempt2.go. Results:

input/Placeholder input_1/Placeholder

Question time:

What have we learned about the Tensorflow architecture? A node is completely identified by the Scope in which it is defined. The scope is the path that we have to follow from the root of the graph to reach the node. There are 2 ways of defining nodes that execute the same operation: define the operation in a different Scope (Go style) or change the operation name (what Python does automatically, and what we can do in C++).

We solved the problem of the duplicate node names but another problem landed on our terminal.

panic: failed to add operation "MatMul": Value for attr 'T' of int64 is not in the list of allowed values: half, float, double, int32, complex64, complex128

Why does the MatMul node definition panic? We just want to multiply two tf.int64 matrices! It looks like int64 is not among the types that MatMul accepts.

Value for attr ‘T’ of int64 is not in the list of allowed values: half, float, double, int32, complex64, complex128

What’s this list? Why can we multiply 2 matrices of int32 but not of int64?

Let’s solve this by understanding what’s going on.

Lesson 3: Tensorflow typing system

Let’s dig into the source code looking for the C++ declaration of the MatMul operation:

REGISTER_OP("MatMul")
    .Input("a: T")
    .Input("b: T")
    .Output("product: T")
    .Attr("transpose_a: bool = false")
    .Attr("transpose_b: bool = false")
    .Attr("T: {half, float, double, int32, complex64, complex128}")
    .SetShapeFn(shape_inference::MatMulShape)
    .Doc(R"doc(
Multiply the matrix "a" by the matrix "b".

The inputs must be two-dimensional matrices and the inner dimension of
"a" (after being transposed if transpose_a is true) must match the
outer dimension of "b" (after being transposed if transposed_b is
true).

*Note*: The default kernel implementation for MatMul on GPUs uses
cublas.

transpose_a: If true, "a" is transposed before multiplication.
transpose_b: If true, "b" is transposed before multiplication.
)doc");

This declaration defines an interface for the MatMul operation: in particular, we can see that using the REGISTER_OP macro we’re declaring the op’s:

  • Name: MatMul
  • Parameters: a, b
  • Attributes (optional parameters): transpose_a, transpose_b
  • Supported types for the template T: half, float, double, int32, complex64, complex128
  • Output Shape: automatically inferred
  • Documentation

This macro call doesn’t contain any C++ code, but it tells us that when defining an operation, even though it uses a template, we have to specify the list of the supported types for the specified type T (or attribute). In practice, the attribute .Attr("T: {half, float, double, int32, complex64, complex128}") constrains the type T to be one of the values in that list.
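The effect of such a type constraint can be mimicked in a few lines of Go. This is a simplified sketch: the OpDef type and its CheckType method are hypothetical names made up for this example, and the real validation happens inside the Tensorflow core, not in code like this.

```go
package main

import (
	"fmt"
	"strings"
)

// OpDef is a hypothetical, simplified op declaration: a name plus the
// list of types allowed for its type attribute T.
type OpDef struct {
	Name    string
	Allowed []string
}

// CheckType imitates the validation performed when a node is added to
// the graph: the requested type must appear in the allowed list.
func (d OpDef) CheckType(t string) error {
	for _, a := range d.Allowed {
		if a == t {
			return nil
		}
	}
	return fmt.Errorf("failed to add operation %q: Value for attr 'T' of %s is not in the list of allowed values: %s",
		d.Name, t, strings.Join(d.Allowed, ", "))
}

func main() {
	matMul := OpDef{
		Name:    "MatMul",
		Allowed: []string{"half", "float", "double", "int32", "complex64", "complex128"},
	}
	fmt.Println(matMul.CheckType("int32")) // accepted: nil error
	fmt.Println(matMul.CheckType("int64")) // rejected: not in the list
}
```

The list attached to the declaration acts as a whitelist: any type outside it is refused before a single multiplication is attempted.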

As we can read from the tutorial, even when using a template T we have to explicitly register the kernel for every supported overload. Kernel is the CUDA term for the C/C++ functions that will be executed in parallel.

The MatMul author, thus, decided to support only the previously listed types and not to support int64. There are 2 possible reasons:

  1. Oversight: it’s possible, the Tensorflow’s authors are humans!
  2. Supporting devices where int64 operations are not fully supported, thus this specific implementation of the kernel wouldn’t be general enough to run on every supported hardware.

Coming back to our panic: the fix is obvious. We have to pass parameters of a supported type to MatMul.

Let’s create attempt3.go by replacing every int64 with int32, in both the Tensorflow types (tf.Int64 becomes tf.Int32) and the Go types.

There’s just one thing to note: the Go bindings have their own set of types, with an (almost complete) 1:1 mapping to the Go types. When we feed values into the graph we have to respect the mapping (for instance, feed int32 values when defining tf.Int32 placeholders). The same thing has to be done when fetching values from the graph. The *tf.Tensor type returned from a Tensor evaluation has the Value() method, which returns an interface{} that must be converted to the correct type (which we know from the graph construction).
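The conversion from interface{} to a concrete type is a plain Go type assertion. The sketch below does not use Tensorflow: the fetch function is a hypothetical stand-in for a call that hides a [][]int32 behind an empty interface, the way Value() does for a rank-2 tf.Int32 tensor.

```go
package main

import "fmt"

// fetch is a hypothetical stand-in for a call returning interface{}:
// the concrete type behind the interface is [][]int32.
func fetch() interface{} {
	return [][]int32{{210}, {-210}}
}

func main() {
	v := fetch()

	// Checked assertion to the type we know from the graph definition.
	if m, ok := v.([][]int32); ok {
		fmt.Println(m)
	}

	// Asserting the wrong type fails; the two-value form reports it
	// through ok instead of panicking.
	if _, ok := v.([][]int64); !ok {
		fmt.Println("value is not a [][]int64")
	}
}
```

Using the two-value form of the assertion is a design choice: the single-value form v.([][]int64) would panic at runtime when the types don't match.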

As usual, go run attempt3.go. Results:

input/Placeholder input_1/Placeholder
[[210] [-210]]
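As a sanity check, the same product can be computed with plain Go loops, using the values of A and x from the code comments. The matMul function is a hypothetical helper written for this check and has nothing to do with the Tensorflow kernel.

```go
package main

import "fmt"

// matMul multiplies an n×k matrix by a k×m matrix with plain loops.
// It assumes the inner dimensions match and performs no validation.
func matMul(a, b [][]int32) [][]int32 {
	out := make([][]int32, len(a))
	for i := range a {
		out[i] = make([]int32, len(b[0]))
		for j := range b[0] {
			for k := range b {
				out[i][j] += a[i][k] * b[k][j]
			}
		}
	}
	return out
}

func main() {
	A := [][]int32{{1, 2}, {-1, -2}}
	x := [][]int32{{10}, {100}}
	// 1*10 + 2*100 = 210 and -1*10 + -2*100 = -210
	fmt.Println(matMul(A, x)) // prints [[210] [-210]]
}
```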


Here’s a Gist with the complete code of attempt3 in case you’d like to build and run it (also remember that it’s a Gist: you can contribute if you see something that can be improved!)

Question time

What have we learned about the Tensorflow architecture? Every operation has its own set of associated kernels. Tensorflow, seen as a descriptive language, is a strongly typed language. It not only has to respect the C++ typing rules, but it also supports only the types that are specified during the op registration phase.


Using Go for defining and executing a graph gave us the opportunity for a better understanding of the underlying Tensorflow structure. Using a trial-and-error approach we solved this simple problem and step by step we learned something new about the graph, its nodes and the typing system.
