Tensors
Every foundational operation in TensorFlow interacts with a single type of data structure called a tensor. This page explores the concept of the tensor data structure, and demonstrates how to create and work with tensors in TensorFlow.
Even if you have never heard of tensors before, the concept will probably be somewhat familiar. This is because tensors are a generalized form of some other more common data structures.
A tensor is an \( N \)-dimensional array of values. For example, you might think of the data structure below as a list or a vector, but it can also be considered a 1-dimensional tensor.
Similarly, a matrix can be considered a 2-dimensional tensor.
A tensor can have any number of dimensions, which means that you will often work with tensors that have 3, 4, or even more dimensions.
A 3-dimensional tensor can be thought of as a list of matrices, or as values arranged in a 3-dimensional cube; a 4-dimensional tensor can be thought of as a list of such cubes, and so on for higher dimensions.
Tensors in TensorFlow
In TensorFlow, the tf.Tensor class is used to represent the tensor data structure. You can create a new instance of tf.Tensor with constant values by calling tf.constant():
import tensorflow as tf
# creates a 1-dimensional tensor
tensor_1d = tf.constant([1, 2, 3, 4])
Each tf.Tensor object has 4 properties that describe its state:
Value: The data stored in the tensor object
Rank: The number of dimensions of the tensor
Shape: The number of values within each of the dimensions
Data Type: The data type used to store the values
Value
Arguably the most important property of any tensor object is the actual data stored within it. The values stored in an instance of tf.Tensor can be viewed by printing the object:
tensor_1d = tf.constant([1, 2, 3, 4])
print(tensor_1d)
tf.Tensor([1 2 3 4], shape=(4,), dtype=int32)
It is also possible to convert the value of a tf.Tensor object into a NumPy array by calling the numpy() method:
numpy_array = tensor_1d.numpy()
print(numpy_array)
[1 2 3 4]
Tensor elements can be accessed using the standard Python syntax for accessing elements or slices of a list:
tensor_1d = tf.constant([1, 2, 3, 4])
print(f"Element at tensor_1d[0]: {tensor_1d[0]}")
tensor_2d = tf.constant([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
# accessing index 0 in a 2-dimensional tensor returns the entire first row
print(f"Element at tensor_2d[0]: {tensor_2d[0]}")
# accessing tensor_2d[0, 1] gives us the second element in the first row
print(f"Element at tensor_2d[0, 1]: {tensor_2d[0, 1]}")
# accessing tensor_2d[:, 1] gives us the entire second column
print(f"Element at tensor_2d[:, 1]: {tensor_2d[:, 1]}")
Element at tensor_1d[0]: 1
Element at tensor_2d[0]: [1 2 3]
Element at tensor_2d[0, 1]: 2
Element at tensor_2d[:, 1]: [2 5 8]
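For comparison, the same three access patterns can be sketched on a plain nested Python list, without TensorFlow. This is only an illustration: nested lists do not support the [0, 1] or [:, 1] syntax directly, so the equivalent expressions are spelled out. The variable names are assumptions made for this sketch.

```python
# A plain nested list standing in for the 2-dimensional tensor above.
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

first_row = matrix[0]                       # comparable to tensor_2d[0]
single_value = matrix[0][1]                 # comparable to tensor_2d[0, 1]
second_column = [row[1] for row in matrix]  # comparable to tensor_2d[:, 1]

print(first_row)      # [1, 2, 3]
print(single_value)   # 2
print(second_column)  # [2, 5, 8]
```

The column case is where the tensor syntax pays off: a single slice expression replaces the explicit loop over rows.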
Rank
The rank of a tensor describes the number of different dimensions that it has. For example, a vector (or 1-dimensional tensor) has rank 1, a matrix (or 2-dimensional tensor) has rank 2, and so on.
The ndim property of the tf.Tensor object is used to get the rank of a tensor:
tensor_1d = tf.constant([1, 2, 3, 4])
print(f"Rank of tensor_1d: {tensor_1d.ndim}")
tensor_2d = tf.constant([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
print(f"Rank of tensor_2d: {tensor_2d.ndim}")
Rank of tensor_1d: 1
Rank of tensor_2d: 2
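One way to build intuition for rank is to count how deeply a nested Python list is nested. The sketch below does this in plain Python; nested_rank is a hypothetical helper written for this page, not a TensorFlow function, and it assumes every level of the list is non-empty and regular.

```python
def nested_rank(values):
    """Count the nesting depth of a regular, non-empty nested list."""
    rank = 0
    while isinstance(values, list):
        rank += 1
        values = values[0]  # assumption: each level has at least one item
    return rank


print(nested_rank(5))                                   # 0
print(nested_rank([1, 2, 3, 4]))                        # 1
print(nested_rank([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))   # 2
```

A single number has no nesting at all, which corresponds to a rank of 0.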
Shape
The shape of a tensor describes the number of elements in each of its dimensions.
For example, consider the matrix, or 2-dimensional tensor, from earlier. This tensor has two dimensions which can be thought of visually as rows and columns.
tensor_2d = tf.constant([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
print(f"Shape of tensor_2d: {tensor_2d.shape}")
Shape of tensor_2d: (3, 3)
The shape of this tensor is (3, 3) because there are 3 values in the first dimension (i.e. 3 rows), and 3 values in the second dimension (i.e. 3 columns).
When considering a 3-dimensional tensor like the one below, you might think of the three dimensions as the number of matrices, the number of rows in each matrix, and the number of columns in each matrix.
tensor_3d = tf.constant([
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]
],
[
[13, 14, 15, 16],
[17, 18, 19, 20],
[21, 22, 23, 24]
]
])
print(f"Shape of tensor_3d: {tensor_3d.shape}")
Shape of tensor_3d: (2, 3, 4)
The shape of this particular 3-dimensional tensor is (2, 3, 4) because there are 2 matrices, where each matrix has 3 rows and 4 columns.
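The idea of shape can also be sketched without TensorFlow by recording the length of a nested list at each level of nesting. The helper name nested_shape is an assumption made for this illustration, and the sketch only handles regular, non-empty nested lists.

```python
def nested_shape(values):
    """Return the length found at each nesting level of a regular nested list."""
    shape = []
    while isinstance(values, list):
        shape.append(len(values))
        values = values[0]  # assumption: all sibling lists have the same length
    return tuple(shape)


nested_3d = [
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]],
]

print(nested_shape(nested_3d))  # (2, 3, 4)
```

Reading the result from left to right gives the outermost dimension first: 2 matrices, 3 rows per matrix, 4 columns per row.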
Data Type
The last attribute of every tensor object is its data type. In TensorFlow, every element in a tensor must have the same data type.
By default, when creating a new tensor, the data type is inferred based on the values provided. For example, Python integer values are converted to the tf.int32 data type, and Python floating point numbers are converted to the tf.float32 data type.
int_tensor = tf.constant([1, 2, 3])
print(f"int_tensor: {int_tensor}")
float_tensor = tf.constant([1.0, 2.0, 3.0])
print(f"float_tensor: {float_tensor}")
int_tensor: [1 2 3]
float_tensor: [1. 2. 3.]
This default behavior can be overridden by passing the dtype parameter when creating the tensor, as long as the provided data and data type are compatible. The second call below fails because 3.5 is not an integer.
int8_tensor = tf.constant([1, 2, 3], dtype=tf.int8)
print(f"int8_tensor: {int8_tensor}")
invalid_tensor = tf.constant([1, 2, 3.5], dtype=tf.int8)
int8_tensor: [1 2 3]
---------------------------------------------------------------------------
...
TypeError: Cannot convert [1, 2, 3.5] to EagerTensor of dtype int8
The full list of supported data types can be found in the official TensorFlow documentation for tf.dtypes.
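The two default cases described in this section can be summarized in a small plain-Python sketch. The function default_dtype_name is hypothetical and deliberately narrow: it covers only flat lists made up entirely of Python integers or entirely of Python floats, and it returns a name as a string rather than a real tf.dtypes object.

```python
def default_dtype_name(values):
    """Name the default dtype for the two cases described above (sketch only)."""
    if all(isinstance(v, int) for v in values):
        return "int32"
    if all(isinstance(v, float) for v in values):
        return "float32"
    # Anything else is outside the scope of this illustration.
    raise ValueError("this sketch only covers all-int or all-float lists")


print(default_dtype_name([1, 2, 3]))        # int32
print(default_dtype_name([1.0, 2.0, 3.0]))  # float32
```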
Creating Tensors
All of the examples shown so far begin by creating a tensor from a Python list using tf.constant(). There are several other ways to create tensors that can be especially useful when working with multidimensional data.
Zeros and Ones
The tf.zeros() and tf.ones() functions are used to create tensors filled with 0s and 1s, respectively. Each function accepts the shape and data type of the tensor as arguments.
# Create a 3-dimensional tensor filled with zeros
print("Zeros:")
tensor_3d_zeros = tf.zeros(shape=(2,3,4))
print(tensor_3d_zeros)
# Create a 4-dimensional tensor filled with ones
# Set dtype=tf.int32 to use integers instead of floats
print("\nOnes:")
tensor_4d_ones = tf.ones(shape=(2,2,2,2), dtype=tf.int32)
print(tensor_4d_ones)
Zeros:
tf.Tensor(
[[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]], shape=(2, 3, 4), dtype=float32)
Ones:
tf.Tensor(
[[[[1 1]
[1 1]]
[[1 1]
[1 1]]]
[[[1 1]
[1 1]]
[[1 1]
[1 1]]]], shape=(2, 2, 2, 2), dtype=int32)
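To see what "filling a shape" means structurally, here is a plain-Python sketch that builds a nested list of a given shape with every entry set to the same value. The helper filled is an assumption made for this page and is not how TensorFlow allocates tensors internally.

```python
def filled(shape, value):
    """Build a nested list with the given shape, every entry set to value."""
    if not shape:
        return value  # no dimensions left: this is a single entry
    # Build shape[0] independent copies of the remaining dimensions.
    return [filled(shape[1:], value) for _ in range(shape[0])]


zeros = filled((2, 3, 4), 0.0)
ones = filled((2, 2, 2, 2), 1)

print(len(zeros), len(zeros[0]), len(zeros[0][0]))  # 2 3 4
print(ones[1][1][1][1])                             # 1
```

Each recursive call peels off the first dimension, so a shape with four entries produces four levels of nesting.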
Random Values
When developing and testing, it can often be useful to create tensors filled with random data of a particular shape and type.
The tf.random.uniform() function is used to initialize a new tensor filled with random values.
random_ints = tf.random.uniform(
shape=(2,2),
minval=0,
maxval=100,
dtype=tf.int32
)
print("Random integers:")
print(random_ints)
random_floats = tf.random.uniform(
shape=(2,2),
minval=0,
maxval=100,
dtype=tf.float32
)
print("\nRandom floats:")
print(random_floats)
Random integers:
tf.Tensor(
[[54 3]
[91 46]], shape=(2, 2), dtype=int32)
Random floats:
tf.Tensor(
[[34.57055 52.068935]
[54.988945 43.07728 ]], shape=(2, 2), dtype=float32)
Alternatively, the tf.random.normal() function can be used to create tensors filled with random values drawn from a normal distribution (i.e. a bell curve). The mean and stddev parameters are used to control the center and spread of the bell curve.
normal_random_floats = tf.random.normal(
shape=(2,3),
mean=10.0,
stddev=1.0,
dtype=tf.float32
)
print("Normally distributed floats:")
print(normal_random_floats)
Normally distributed floats:
tf.Tensor(
[[ 8.761594 8.470037 10.2857895]
[12.014274 11.470676 9.524365 ]], shape=(2, 3), dtype=float32)
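The difference between the two distributions can be sketched with Python's built-in random module as a stand-in for TensorFlow's generators. The helper random_matrix and the seed value are assumptions for this illustration. For the integer case, randrange excludes the upper bound, matching the half-open range [minval, maxval) that tf.random.uniform() samples from.

```python
import random

rng = random.Random(42)  # fixed seed so repeated runs produce the same values


def random_matrix(rows, cols, draw):
    """Build a rows x cols nested list, calling draw() once per entry."""
    return [[draw() for _ in range(cols)] for _ in range(rows)]


# Uniform: every value in the range is equally likely.
uniform_ints = random_matrix(2, 2, lambda: rng.randrange(0, 100))
uniform_floats = random_matrix(2, 2, lambda: rng.uniform(0.0, 100.0))

# Normal: values cluster around the mean (10.0) with spread set by stddev (1.0).
normal_floats = random_matrix(2, 3, lambda: rng.gauss(10.0, 1.0))

print(uniform_ints)
print(uniform_floats)
print(normal_floats)
```

With a standard deviation of 1.0, most of the normally distributed values land within a few units of 10.0, whereas the uniform values are spread across the whole range.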
Tensor Operations
So far, we have only looked at what tensors are and how to create them using TensorFlow. A big part of what makes TensorFlow so powerful is the huge number of built-in operations that can be used to manipulate data stored in tensors.
In this section, we will look at some of the most common tensor operations.
Element-wise Arithmetic
TensorFlow provides functions to perform basic arithmetic with tensors. For example, the tf.add() function can be used to perform addition. This function computes an “element-wise” sum of two tensors, meaning that, to compute the result, each element in the first tensor is added to the corresponding element in the second tensor.
tensor_a = tf.constant([1, 2, 3, 4])
tensor_b = tf.constant([2, 4, 6, 8])
tensor_sum = tf.add(tensor_a, tensor_b)
print("Sum:")
print(tensor_sum)
Sum:
tf.Tensor([ 3 6 9 12], shape=(4,), dtype=int32)
Similarly, the tf.subtract(), tf.multiply(), and tf.divide() functions can be used to perform element-wise subtraction, multiplication, and division, respectively.
As a shorthand, we can also use Python’s +, -, *, and / operators to perform element-wise arithmetic with tensors. For example, the following code shows element-wise multiplication using the * operator:
tensor_a = tf.constant([1, 2, 3, 4])
tensor_b = tf.constant([2, 4, 6, 8])
tensor_product = tensor_a * tensor_b
print("Product:")
print(tensor_product)
Product:
tf.Tensor([ 2 8 18 32], shape=(4,), dtype=int32)
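To make "element-wise" concrete, the sketch below reproduces the sum and product above using plain Python lists. The helper elementwise is a hypothetical name used only for this illustration; it pairs up elements by position and applies the operation to each pair.

```python
import operator


def elementwise(op, left, right):
    """Apply op to each pair of elements at matching positions."""
    if len(left) != len(right):
        raise ValueError("both lists must have the same length")
    return [op(a, b) for a, b in zip(left, right)]


list_a = [1, 2, 3, 4]
list_b = [2, 4, 6, 8]

print(elementwise(operator.add, list_a, list_b))  # [3, 6, 9, 12]
print(elementwise(operator.mul, list_a, list_b))  # [2, 8, 18, 32]
```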
Broadcasting
In the previous examples of element-wise arithmetic, both of the tensors involved in the operations had the same shape. It is also possible to perform operations on tensors with differing shapes via a feature of TensorFlow called broadcasting.
Broadcasting works by “stretching” the smaller tensor to match the shape of the larger tensor. A common use for broadcasting is to perform some constant operation on every element in a tensor.
Consider the example below, which halves the value of each element in tensor_a using element-wise division with broadcasting. Note that the result is a floating-point tensor even though both inputs contain integers:
tensor_a = tf.constant([2, 4, 6, 8])
tensor_b = tf.constant(2)
tensor_half = tensor_a / tensor_b
print("Halved tensor:")
print(tensor_half)
Halved tensor:
tf.Tensor([1. 2. 3. 4.], shape=(4,), dtype=float64)
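The "stretching" idea can be sketched in plain Python by explicitly repeating the scalar until it lines up with the list. This is a conceptual model only: broadcast_scalar is a hypothetical helper, and TensorFlow may well avoid building the stretched copy in memory.

```python
def broadcast_scalar(scalar, length):
    """Repeat a scalar so that it lines up with a list of the given length."""
    return [scalar] * length


values = [2, 4, 6, 8]
stretched = broadcast_scalar(2, len(values))
halved = [a / b for a, b in zip(values, stretched)]

print(stretched)  # [2, 2, 2, 2]
print(halved)     # [1.0, 2.0, 3.0, 4.0]
```

Once the scalar has been stretched, the division is an ordinary element-wise operation between two lists of the same length.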
Changing Shape
Sometimes it is necessary to explicitly modify the shape of a tensor in order to use it in a subsequent operation. TensorFlow provides a few different ways to change the shape of a tensor, the simplest of which is to call the tf.reshape() function and pass in the desired shape.
tensor_a = tf.constant([1, 2, 3, 4, 5, 6, 7, 8])
tensor_reshaped_2_4 = tf.reshape(tensor_a, (2, 4))
print("Reshaped to (2,4):")
print(tensor_reshaped_2_4)
tensor_reshaped_4_2 = tf.reshape(tensor_a, (4, 2))
print("\nReshaped to (4,2):")
print(tensor_reshaped_4_2)
Reshaped to (2,4):
tf.Tensor(
[[1 2 3 4]
[5 6 7 8]], shape=(2, 4), dtype=int32)
Reshaped to (4,2):
tf.Tensor(
[[1 2]
[3 4]
[5 6]
[7 8]], shape=(4, 2), dtype=int32)
Tensor reshaping only works, however, if the new shape provided has the same total number of elements as the original shape. For example, tensor_a cannot be reshaped to (3, 3) because tensor_a has 8 elements, while a tensor with shape (3, 3) must have 9 elements.
tensor_a = tf.constant([1, 2, 3, 4, 5, 6, 7, 8])
tensor_reshaped_3_3 = tf.reshape(tensor_a, (3,3))
---------------------------------------------------------------------------
InvalidArgumentError: Input to reshape is a tensor with 8 values, but the
requested shape has 9 [Op:Reshape]
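The element-count rule can be illustrated with a plain-Python sketch that reshapes a flat list into rows. The helper reshape_2d is an assumption made for this page; it only handles 2-dimensional targets and raises a ValueError, not TensorFlow's InvalidArgumentError, when the counts do not match.

```python
def reshape_2d(flat, rows, cols):
    """Split a flat list into rows of equal length, filling row by row."""
    if rows * cols != len(flat):
        raise ValueError(
            f"cannot reshape {len(flat)} values into shape ({rows}, {cols})"
        )
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


values = [1, 2, 3, 4, 5, 6, 7, 8]

print(reshape_2d(values, 2, 4))  # [[1, 2, 3, 4], [5, 6, 7, 8]]
print(reshape_2d(values, 4, 2))  # [[1, 2], [3, 4], [5, 6], [7, 8]]

try:
    reshape_2d(values, 3, 3)
except ValueError as error:
    print(f"Error: {error}")
```

The values keep their original order in both results; only the points at which a new row begins change.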
Two other functions commonly used to change the shape of a tensor are tf.squeeze() and tf.expand_dims().
The tf.squeeze() function removes any dimensions with a size of 1. Consider the following 3-dimensional tensor with shape (2, 1, 3). Calling tf.squeeze() on this tensor will remove the second dimension, which has size 1, resulting in a tensor with shape (2, 3):
tensor_a = tf.constant([[[1, 2, 3]], [[4, 5, 6]]])
print("Squeezed:")
print(tf.squeeze(tensor_a))
Squeezed:
tf.Tensor(
[[1 2 3]
[4 5 6]], shape=(2, 3), dtype=int32)
The tf.expand_dims() function adds a new dimension of size 1 at the specified axis. Here, axis refers to the index of a dimension in the tensor’s shape tuple.
Consider the following 2-dimensional tensor with shape (3, 2). Calling tf.expand_dims() on this tensor with axis=0 will add a new dimension at the beginning of the tensor, resulting in a tensor with shape (1, 3, 2), whereas calling tf.expand_dims() with axis=-1 will add a new dimension at the end of the tensor, resulting in a tensor with shape (3, 2, 1):
tensor_a = tf.constant([[1, 2], [3, 4], [5, 6]])
print("Expanded Dims with axis=0:")
print(tf.expand_dims(tensor_a, axis=0))
print("\nExpanded Dims with axis=-1:")
print(tf.expand_dims(tensor_a, axis=-1))
Expanded Dims with axis=0:
tf.Tensor(
[[[1 2]
[3 4]
[5 6]]], shape=(1, 3, 2), dtype=int32)
Expanded Dims with axis=-1:
tf.Tensor(
[[[1]
[2]]
[[3]
[4]]
[[5]
[6]]], shape=(3, 2, 1), dtype=int32)
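Because squeezing and expanding only add or remove dimensions of size 1, their effect can be sketched purely in terms of shape tuples. The two helpers below are hypothetical and operate on tuples only; they do not touch any data and perform no bounds checking on axis.

```python
def squeeze_shape(shape):
    """Drop every dimension of size 1 from a shape tuple."""
    return tuple(size for size in shape if size != 1)


def expand_dims_shape(shape, axis):
    """Insert a dimension of size 1 at the given axis of a shape tuple."""
    if axis < 0:
        # A negative axis counts from the end of the *new* shape.
        axis += len(shape) + 1
    return shape[:axis] + (1,) + shape[axis:]


print(squeeze_shape((2, 1, 3)))        # (2, 3)
print(expand_dims_shape((3, 2), 0))    # (1, 3, 2)
print(expand_dims_shape((3, 2), -1))   # (3, 2, 1)
```

In every case the total number of elements stays the same, since multiplying by a dimension of size 1 does not change the product of the sizes.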
Tensor Immutability
In all of the examples shown so far, you might have noticed that whenever an operation is performed, its output is always assigned to a new variable. For example:
#  new variable   operation   existing variable
#       ↓             ↓           ↓
tensor_results = tf.reshape(tensor_a, (2, 4))
This is because all tensors in TensorFlow are immutable. It is not possible to actually modify the values or the shape of an existing tensor; each operation simply returns a new tensor containing the result of the operation.
The immutability of tensors is evident if you examine any tensor after performing an operation on it:
tensor_a = tf.constant([1, 2, 3, 4, 5, 6, 7, 8])
tf.reshape(tensor_a, (2, 4))
print(tensor_a)
tf.Tensor([1 2 3 4 5 6 7 8], shape=(8,), dtype=int32)
The tf.reshape() operation returns the reshaped tensor; however, tensor_a itself remains unchanged.
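Python's built-in tuples offer a loose analogy for this behavior. The sketch below is not TensorFlow code, and the function first_half is a hypothetical helper: it shows an operation returning a new object while the original stays intact, and an in-place modification being rejected.

```python
original = (1, 2, 3, 4, 5, 6, 7, 8)


def first_half(values):
    """Return a new tuple; the input tuple is left untouched."""
    return values[:4]


result = first_half(original)
print(result)    # (1, 2, 3, 4)
print(original)  # (1, 2, 3, 4, 5, 6, 7, 8)

try:
    original[0] = 99  # tuples do not support item assignment
except TypeError as error:
    print(f"Cannot modify in place: {error}")
```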
Matrix Multiplication
When building neural networks, one of the most commonly used mathematical operations is Matrix Multiplication. Not to be confused with the element-wise multiplication mentioned earlier, Matrix Multiplication is performed by computing the Dot Product of each row in the first matrix with each column of the second.
Dot Product
The Dot Product of two 1-dimensional tensors, or vectors, is computed by performing an element-wise multiplication operation between both tensors and then computing the sum of the results. In mathematical notation, \(A \cdot B\) is used to indicate the Dot Product of vectors \(A\) and \(B\). The formula for computing the Dot Product can be written as:
$$ A \cdot B = A_{1}B_{1} + A_{2}B_{2} + \dots + A_{n}B_{n} $$
The function tf.tensordot() is used to compute the Dot Product of two tensors:
tensor_a = tf.constant([2, 4, 6, 8])
tensor_b = tf.constant([1, 2, 3, 4])
tensor_dot_product = tf.tensordot(tensor_a, tensor_b, axes=1)
print(tensor_dot_product)
tf.Tensor(60, shape=(), dtype=int32)
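The formula above translates almost directly into plain Python. The function dot_product below is a hypothetical helper written for this page; it multiplies matching elements and sums the results, using the same two vectors as the TensorFlow example.

```python
def dot_product(a, b):
    """Multiply matching elements of two equal-length lists and sum the results."""
    if len(a) != len(b):
        raise ValueError("both vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


vector_a = [2, 4, 6, 8]
vector_b = [1, 2, 3, 4]

# 2*1 + 4*2 + 6*3 + 8*4 = 2 + 8 + 18 + 32
print(dot_product(vector_a, vector_b))  # 60
```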
Matrix Multiplication Example
The Dot Product of two vectors is a single value. The result of Matrix Multiplication, by contrast, is itself a matrix, where each element \([i,j]\) in the result contains the value of the Dot Product of the \(i^{th}\) row of the first matrix and the \(j^{th}\) column of the second matrix. This requires the number of columns in the first matrix to equal the number of rows in the second.
The tf.linalg.matmul() function in TensorFlow is used to perform Matrix Multiplication.
tensor_a = tf.constant([
[1, 2, 3],
[4, 5, 6]
])
tensor_b = tf.constant([
[2, 3],
[3, 4],
[2, 3]
])
result = tf.linalg.matmul(tensor_a, tensor_b)
print(result)
tf.Tensor(
[[14 20]
[35 50]], shape=(2, 2), dtype=int32)
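To show where each value in the result comes from, here is a plain-Python sketch of the row-by-column rule, applied to the same two matrices. The function matmul is a hypothetical helper for this page, written for clarity rather than speed, and is not how TensorFlow computes the product internally.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists, using row-by-column dot products."""
    inner = len(b)  # rows in b must match columns in a
    if any(len(row) != inner for row in a):
        raise ValueError("columns of the first matrix must match rows of the second")
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


matrix_a = [
    [1, 2, 3],
    [4, 5, 6],
]
matrix_b = [
    [2, 3],
    [3, 4],
    [2, 3],
]

# Top-left entry: row [1, 2, 3] with column [2, 3, 2] gives 2 + 6 + 6 = 14
print(matmul(matrix_a, matrix_b))  # [[14, 20], [35, 50]]
```

A matrix with shape (2, 3) multiplied by a matrix with shape (3, 2) produces a result with shape (2, 2): one row per row of the first matrix and one column per column of the second.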
Next Steps
Learn about Gradients in TensorFlow