PyTorch by deeplizard Part 1: PyTorch and Tensors


These are my notes for Part 1, which mainly cover how tensors are used in PyTorch.

Section 1: Introducing PyTorch

PyTorch Prerequisites - Syllabus for Neural Network Programming Course

PyTorch Explained - Python Deep Learning Neural Network API

img

PyTorch Install - Quick and Easy

A quick check that the installation works:

In [1]: import torch                                                
In [2]: torch.__version__         
Out[2]: '1.1.0'

In [3]: torch.cuda.is_available()     
Out[3]: False

In [4]: torch.version.cuda 

CUDA Explained - Why Deep Learning uses GPUs

GPUs excel at parallel computation; for other workloads the CPU is usually the better choice.

GPU is only faster for particular (specialized) tasks. One issue that we can run into is bottlenecks that slow our performance. For example, moving data from the CPU to the GPU is costly, so in this case, the overall performance might be slower if the computation task is a simple one.
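
As a rough sketch of that cost (the sizes here are arbitrary assumptions), moving a tensor to the GPU is an explicit copy, and it only pays off when the computation that follows is heavy enough:

import torch

# Fall back to the CPU when no GPU is present (as in the session above).
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

t = torch.rand(1000, 1000)   # created in CPU memory
t = t.to(device)             # host-to-device copy: this transfer is the overhead
result = t.matmul(t)         # the large matrix multiply is where a GPU shines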

img

You can see how each layer of the stack builds on top of the one below it.

Section 2: Introducing Tensors

Tensors Explained - Data Structures of Deep Learning

  • number
  • scalar
  • array
  • vector
  • 2d-array
  • matrix

All of these are tensors, and they fall into the two groups below:

img

One set of names is common in computer science, the other in mathematics.

Another way to classify them is by the number of indices required to access a single element:

img

Instead of names like vector and matrix, we simply say multidimensional array, or nd-array:

img

Rank, Axes, and Shape Explained - Tensors for Deep Learning

Rank is the number of dimensions of a tensor

The rank of a tensor tells us how many dimensions (axes) it has:

img

The rank also tells us how many indices are needed to locate a specific element:

img

img

An axis is a specific dimension of a tensor

img

For example, if the first axis of t has length 3, we can index 3 elements along it; if the second axis has length 4, we can index 4 elements along it.

img
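
As a small illustration (the 3x4 tensor here is an assumption standing in for the figures above), the rank is the length of the shape, each axis has its own length, and locating one element takes one index per axis:

> t = torch.tensor([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])
> len(t.shape)   # rank: the tensor has two axes
2
> t[0]           # one index moves along the first axis (length 3)
tensor([1, 2, 3, 4])
> t[0][3]        # two indices pin down a single element
tensor(4)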

Shape gives the length of each axis

The shape tells us how many elements lie along each axis.

img

img

The shape also tells us what the rank of a tensor is:

In [8]: n.reshape(2, 5)                     
Out[8]: 
tensor([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

In [12]: len(n.reshape(2,5))                
Out[12]: 2

The shape matters a lot: its length equals the rank, and its entries give the length of each axis. (Note that len(n.reshape(2, 5)) above returns the length of the first axis, which only happens to equal the rank here; the rank itself is len(t.shape).)

CNN Tensor Shape Explained - Convolutional Neural Networks and Feature Maps

The input to a CNN is a rank-4 tensor, i.e. it has 4 axes. Each index in the shape corresponds to a specific axis, and each value is the length of that axis:

img

The last two axes are the height and width:

img

The second axis is the color channel; for grayscale images its length is 1.

img

The first axis is the batch size, i.e. how many samples are in the batch.

img

From the shape below we can tell that the batch holds 3 samples, each with 1 color channel and a 28x28 image.

img

Suppose a [1, 1, 28, 28] image is fed into a CNN. In the output, the number of channels is determined by the number of filters in the convolutional layer; the example below uses 3 filters:

img

The animation in this part is really good; it's best to watch the video directly.

The result is 3 feature maps:

img
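
A minimal sketch of that idea (the kernel size and layer settings are assumptions, not taken from the video): a convolutional layer with 3 filters maps a [1, 1, 28, 28] input to 3 feature maps.

import torch
import torch.nn as nn

t = torch.rand(1, 1, 28, 28)   # [batch, channels, height, width]
conv = nn.Conv2d(in_channels=1, out_channels=3, kernel_size=5)   # 3 filters
out = conv(t)
print(out.shape)               # torch.Size([1, 3, 24, 24]): one sample, 3 feature maps of 24x24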

Section 3: PyTorch Tensors

PyTorch Tensors Explained - Neural Network Programming

Instances of the torch.Tensor class

PyTorch tensors are instances of the torch.Tensor Python class. We can create a torch.Tensor object using the class constructor like so:

> t = torch.Tensor()
> type(t)
torch.Tensor

This creates an empty tensor (tensor with no data), but we’ll get to adding data in just a moment.

Tensor attributes

First, let’s look at a few tensor attributes. Every torch.Tensor has these attributes:

  • torch.dtype
  • torch.device
  • torch.layout

Looking at our Tensor t, we can see the following default attribute values:

> print(t.dtype)
> print(t.device)
> print(t.layout)
torch.float32
cpu
torch.strided

These three attributes matter: the data type, the device, and the memory layout (stride).

Tensors have a torch.dtype

The dtype, which is torch.float32 in our case, specifies the type of the data that is contained within the tensor. Tensors contain uniform (of the same type) numerical data with one of these types:

Data type                | dtype         | CPU tensor         | GPU tensor
-------------------------|---------------|--------------------|------------------------
32-bit floating point    | torch.float32 | torch.FloatTensor  | torch.cuda.FloatTensor
64-bit floating point    | torch.float64 | torch.DoubleTensor | torch.cuda.DoubleTensor
16-bit floating point    | torch.float16 | torch.HalfTensor   | torch.cuda.HalfTensor
8-bit integer (unsigned) | torch.uint8   | torch.ByteTensor   | torch.cuda.ByteTensor
8-bit integer (signed)   | torch.int8    | torch.CharTensor   | torch.cuda.CharTensor
16-bit integer (signed)  | torch.int16   | torch.ShortTensor  | torch.cuda.ShortTensor
32-bit integer (signed)  | torch.int32   | torch.IntTensor    | torch.cuda.IntTensor
64-bit integer (signed)  | torch.int64   | torch.LongTensor   | torch.cuda.LongTensor

Notice how each type has a CPU and GPU version. One thing to keep in mind about tensor data types is that tensor operations between tensors must happen between tensors with the same type of data.

Tensors have a torch.device

This specifies the device on which the tensor's computations run; the index below identifies which device.

The device, cpu in our case, specifies the device (CPU or GPU) where the tensor’s data is allocated. This determines where tensor computations for the given tensor will be performed.

PyTorch supports the use of multiple devices, and they are specified using an index like so:

> device = torch.device('cuda:0')
> device
device(type='cuda', index=0)

If we have a device like above, we can create a tensor on the device by passing the device to the tensor’s constructor. One thing to keep in mind about using multiple devices is that tensor operations between tensors must happen between tensors that exist on the same device.
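
For example, a minimal sketch (this assumes the machine actually has a CUDA GPU at index 0):

> device = torch.device('cuda:0')
> t = torch.tensor([1, 2, 3], device=device)   # the data is allocated on the GPU
> t.device
device(type='cuda', index=0)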

Using multiple devices is typically something we will do as we become more advanced users, so there’s no need to worry about that now.

Tensors have a torch.layout

The layout tells us how the tensor is stored in memory; in this example the layout is strided.

The layout, strided in our case, specifies how the tensor is stored in memory. To learn more about stride check here.

For now, this is all we need to know.

Take away from the tensor attributes

Two things to remember about these attributes:

  1. Tensors contain data of a uniform type.
  2. Tensor computations between tensors depend on the dtype and the device.

For example, tensors with different dtypes cannot be combined in an operation:
img

Likewise, tensors on different devices cannot be combined:
img

As neural network programmers, we need to be aware of the following:

  1. Tensors contain data of a uniform type (dtype).
  2. Tensor computations between tensors depend on the dtype and the device.
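
A hedged sketch of both failure modes (exact behavior depends on the PyTorch version: older releases raise a dtype-mismatch error, while newer ones promote some dtype combinations instead):

> t1 = torch.tensor([1, 2, 3])                         # int64
> t2 = torch.tensor([1., 2., 3.])                      # float32
> t1 + t2      # error on older versions; newer versions promote to float32

> t3 = torch.tensor([1., 2., 3.])                      # lives on the CPU
> t4 = torch.tensor([1., 2., 3.], device='cuda:0')     # lives on the GPU (needs CUDA)
> t3 + t4      # raises a RuntimeError: the tensors are on different devices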

Let’s look now at the common ways of creating tensors using data in PyTorch.

Creating tensors using data

Creating tensors from existing data.

These are the primary ways of creating tensor objects (instances of the torch.Tensor class), with data (array-like) in PyTorch:

  1. torch.Tensor(data)
  2. torch.tensor(data)
  3. torch.as_tensor(data)
  4. torch.from_numpy(data)

Let’s look at each of these. They all accept some form of data and give us an instance of the torch.Tensor class. Sometimes when there are multiple ways to achieve the same result, things can get confusing, so let’s break this down.

img

torch.Tensor is a class, while the other three are functions. Note that the class constructor returns a float tensor, whereas the functions return tensors with the same dtype as the input numpy data (int in this case), as the figure above shows.
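
A small sketch of that difference (the array values are an assumption; numpy's default integer type is assumed to be int64 here, which is typical on Linux):

import numpy as np
import torch

data = np.array([1, 2, 3])

> torch.Tensor(data).dtype      # class constructor: uses the global default dtype
torch.float32
> torch.tensor(data).dtype      # factory functions infer the dtype from the data
torch.int64
> torch.as_tensor(data).dtype
torch.int64
> torch.from_numpy(data).dtype
torch.int64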

Creation options without data

Here are some other creation options that are available.

img

We have the torch.eye() function which returns a 2-D tensor with ones on the diagonal and zeros elsewhere. The name eye() is connected to the idea of an identity matrix, which is a square matrix with ones on the main diagonal and zeros everywhere else.

> print(torch.eye(2))
tensor([
    [1., 0.],
    [0., 1.]
])

> print(torch.zeros([2,2]))
tensor([
    [0., 0.],
    [0., 0.]
])

> print(torch.ones([2,2]))
tensor([
    [1., 1.],
    [1., 1.]
])

> print(torch.rand([2,2]))
tensor([
    [0.0465, 0.4557],
    [0.6596, 0.0941]
])

Creating PyTorch Tensors for Deep Learning - Best Options

This section explains how the four tensor-creation methods from the previous section differ.

img

  1. torch.Tensor(data): class constructor
  2. torch.tensor(data): factory function
  3. torch.as_tensor(data): factory function
  4. torch.from_numpy(data): factory function

img

Factory functions are now the recommended choice. Notice that the class constructor creates float32 tensors by default.

img

The difference here arises in the fact that the torch.Tensor() constructor uses the default dtype when building the tensor. We can verify the default dtype using the torch.get_default_dtype() method:

> torch.get_default_dtype()
torch.float32

To verify with code, we can do this:

> o1.dtype == torch.get_default_dtype()
True

The other calls choose a dtype based on the incoming data. This is called type inference. The dtype is inferred based on the incoming data. Note that the dtype can also be explicitly set for these calls by specifying the dtype as an argument:

> torch.tensor(data, dtype=torch.float32)
> torch.as_tensor(data, dtype=torch.float32)

Type inference deduces the dtype from the incoming data:

img

One advantage of the factory functions over the constructor is that they let us specify the dtype explicitly:

img

With torch.Tensor(), we are unable to pass a dtype to the constructor. This is an example of the torch.Tensor() constructor lacking in configuration options. This is one of the reasons to go with the torch.tensor() factory function for creating our tensors.

Let’s look at the last hidden difference between these alternative creation methods.

Sharing memory for performance: copy vs share

The third difference is lurking behind the scenes or underneath the hood. To reveal the difference, we need to make a change to the original input data in the numpy.ndarray after using the ndarray to create our tensors.

Tensor() and tensor() store a new copy of the data in memory, while as_tensor() and from_numpy() do not create a new copy; they keep a reference to (share memory with) the original numpy data.

img

Sharing is more efficient than copying and uses less memory.

img

This happens because torch.Tensor() and torch.tensor() copy their input data while torch.as_tensor() and torch.from_numpy() share their input data in memory with the original input object.

Share Data         | Copy Data
-------------------|----------------
torch.as_tensor()  | torch.tensor()
torch.from_numpy() | torch.Tensor()

This sharing just means that the actual data in memory exists in a single place. As a result, any changes that occur in the underlying data will be reflected in both objects, the torch.Tensor and the numpy.ndarray.

Sharing data is more efficient and uses less memory than copying data because the data is not written to two locations in memory.
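
A minimal sketch of the copy-versus-share behavior (the names data and o1 through o4 are reconstructions of the original session that the outputs below refer to):

import numpy as np
import torch

data = np.array([1, 2, 3])

o1 = torch.Tensor(data)       # copies
o2 = torch.tensor(data)       # copies
o3 = torch.as_tensor(data)    # shares memory with data
o4 = torch.from_numpy(data)   # shares memory with data

data[0] = 0                   # mutate the original ndarray

> print(o1, o2)               # the copies are unaffected
tensor([1., 2., 3.]) tensor([1, 2, 3])
> print(o3, o4)               # the shared tensors see the change
tensor([0, 2, 3]) tensor([0, 2, 3])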

If we have a torch.Tensor and we want to convert it to a numpy.ndarray, we do it like so:

> print(o3.numpy())
> print(o4.numpy())
[0 2 3]
[0 2 3]

This gives:

> print(type(o3.numpy()))
> print(type(o4.numpy()))
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>

This establishes that torch.as_tensor() and torch.from_numpy() both share memory with their input data. However, which one should we use, and how are they different?

The torch.from_numpy() function only accepts numpy.ndarrays, while the torch.as_tensor() function accepts a wide variety of Python array-like objects including other PyTorch tensors. For this reason, torch.as_tensor() is the winning choice in the memory sharing game.

Best options for creating tensors in PyTorch

Best practice for creating tensors: prefer torch.tensor() (which copies) as the default, and use torch.as_tensor() (which shares) when performance matters.

Given all of these details, these two are the best options:

  • torch.tensor()
  • torch.as_tensor()

The torch.tensor() call is the sort of go-to call, while torch.as_tensor() should be employed when tuning our code for performance.

img

Four points to remember about memory sharing:

  1. numpy arrays live on the CPU, so when a GPU is used, as_tensor() still has to copy the data from the CPU to the GPU.
  2. The memory sharing of as_tensor() does not work with built-in Python lists.
  3. Using as_tensor() requires awareness of the sharing mechanism: watch out for whether the original data may change underneath you.
  4. as_tensor() pays off most when there is a lot of back-and-forth between numpy arrays and tensors; for a single load operation, copying has little impact on performance.

Some things to keep in mind about memory sharing (it works where it can):

  1. Since numpy.ndarray objects are allocated on the CPU, the as_tensor() function must copy the data from the CPU to the GPU when a GPU is being used.
  2. The memory sharing of as_tensor() doesn’t work with built-in Python data structures like lists.
  3. The as_tensor() call requires developer knowledge of the sharing feature. This is necessary so we don’t inadvertently make an unwanted change in the underlying data without realizing the change impacts multiple objects.
  4. The as_tensor() performance improvement will be greater when there are a lot of back and forth operations between numpy.ndarray objects and tensor objects. However, if there is just a single load operation, there shouldn’t be much impact from a performance perspective.

Section 4: Tensor Operations

Flatten, Reshape, and Squeeze Explained - Tensors for Deep Learning with PyTorch

Tensor operation types

Tensor operations fall into the following four categories:

  1. Reshaping operations
  2. Element-wise operations
  3. Reduction operations
  4. Access operations

This section and the next cover reshaping operations.

Tensor shape review

Take the following tensor as an example:

> t = torch.tensor([
    [1,1,1,1],
    [2,2,2,2],
    [3,3,3,3]
], dtype=torch.float32)

This is a 3x4 tensor of rank 2 (rank is a term you will see often). PyTorch gives us two ways to get the shape:

> t.size()
torch.Size([3, 4])

> t.shape
torch.Size([3, 4])

A tensor's size and shape mean the same thing.

The rank of a tensor equals the length of its shape:

> len(t.shape)
2

The number of elements in a tensor equals the product of the shape's values:

> torch.tensor(t.shape).prod()
tensor(12)

Or use t.numel() to get the number of elements directly:

> t.numel()
12

Reshaping a tensor in PyTorch

Use reshape() to change the shape; the new shape can be passed as a list or directly as separate dimension sizes:

img

PyTorch also has another function, view(), which achieves the same effect as reshape().
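
For example, using the 3x4 tensor t defined above (any target shape works as long as the product of its entries stays 12):

> t.reshape(6, 2)
tensor([[1., 1.],
        [1., 1.],
        [2., 2.],
        [2., 2.],
        [3., 3.],
        [3., 3.]])

> t.view(2, 6).shape
torch.Size([2, 6])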

Changing shape by squeezing and unsqueezing

Another way to change a tensor's shape is by squeezing and unsqueezing it.

  • Squeezing a tensor removes the dimensions or axes that have a length of one.
  • Unsqueezing a tensor adds a dimension with a length of one.

These functions allow us to expand or shrink the rank (number of dimensions) of our tensor. Let’s see this in action.

img
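
A small sketch with the same 12-element tensor (assuming t is still the 3x4 tensor from above):

> t.reshape(1, 12).shape
torch.Size([1, 12])

> t.reshape(1, 12).squeeze().shape                      # the length-one axis is removed
torch.Size([12])

> t.reshape(1, 12).squeeze().unsqueeze(dim=0).shape     # a length-one axis is added back
torch.Size([1, 12])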

Flatten a tensor

Flattening creates a 1d array of the tensor's elements. Flattening a tensor means removing all of the dimensions except one.

Below is a flatten() function. Note that the flatten function returns a 1d array, whereas reshape(1, -1) alone returns a 2d array with a leading axis of length one.

img
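
A sketch of such a flatten() function (this is the reshape-then-squeeze idea; the built-in torch.flatten() achieves the same result):

def flatten(t):
    # reshape(1, -1) packs every element into one row (still a 2d tensor),
    # squeeze() then drops the length-one axis, leaving a 1d tensor
    return t.reshape(1, -1).squeeze()

> flatten(t)
tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])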

Concatenating tensors

Concatenation combines two tensors. We combine tensors using the cat() function, and the resulting tensor has a shape that depends on the shapes of the two input tensors.

Suppose we have two tensors:

> t1 = torch.tensor([
    [1,2],
    [3,4]
])
> t2 = torch.tensor([
    [5,6],
    [7,8]
])

We can combine t1 and t2 row-wise (axis-0) in the following way:

> torch.cat((t1, t2), dim=0)
tensor([[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]])

We can combine them column-wise (axis-1) like this:

> torch.cat((t1, t2), dim=1)
tensor([[1, 2, 5, 6],
        [3, 4, 7, 8]])

When we concatenate tensors, we increase the number of elements contained within the resulting tensor. This causes the component values within the shape (lengths of the axes) to adjust to account for the additional elements.

> torch.cat((t1, t2), dim=0).shape
torch.Size([4, 2])

> torch.cat((t1, t2), dim=1).shape
torch.Size([2, 4])

CNN Flatten Operation Visualized - Tensor Batch Processing for Deep Learning

Flattening an entire tensor

To flatten a tensor, it needs at least two dimensions. If the height and width are 28x28, the flattened result has 784 elements. Flattening is typically used when passing a tensor from a convolutional layer to a fully connected layer.

img

Below is a 3x4x4 example:

img

A CNN's input has the form (batch_size, color_channels, height, width), so we use reshape to add a dimension:

> t = t.reshape(3,1,4,4)

img

We have the first image.

> t[0]
tensor([[[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]])

We have the first color channel in the first image.

> t[0][0]
tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]])

We have the first row of pixels in the first color channel of the first image.

> t[0][0][0]
tensor([1, 1, 1, 1])

We have the first pixel value in the first row of the first color channel of the first image.

> t[0][0][0][0]
tensor(1)

Some other ways to flatten the entire tensor:

> t.reshape(1,-1)[0] # Thank you Mick!
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
    2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

> t.reshape(-1) # Thank you Aamir!
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
    2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

> t.view(t.numel()) # Thank you Ulm!
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
    2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

> t.flatten() # Thank you PyTorch!
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
    2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

Flattening specific axes of a tensor

Flattening the tensor batch

When working with CNNs we compute over batches, so we don't want to flatten the batch axis. These axes need to be flattened: (C, H, W). We can use the built-in flatten() function for this:

> t.flatten(start_dim=1).shape
torch.Size([3, 16])

> t.flatten(start_dim=1)
tensor(
[
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
]
)

Flattening an RGB Image

An RGB image has 3 color channels:

r = torch.ones(1,2,2)
g = torch.ones(1,2,2) + 1
b = torch.ones(1,2,2) + 2

img = torch.cat(
    (r,g,b)
    ,dim=0
)

The shape is 3x2x2, where 3 is the number of color channels:

> img.shape
torch.Size([3, 2, 2])

> img
tensor([
    [
        [1., 1.]
        ,[1., 1.]
    ]
    ,[
        [2., 2.]
        , [2., 2.]
    ],
    [
        [3., 3.]
        ,[3., 3.]
    ]
])

First, flatten the entire tensor:

> img.flatten(start_dim=0)
tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])

Keep the color-channel axis and flatten only the height and width:

> img.flatten(start_dim=1)
tensor([
    [1., 1., 1., 1.],
    [2., 2., 2., 2.],
    [3., 3., 3., 3.]
])

In other words, with a batch of four 3x2x2 images, flattening everything except the batch axis gives a 4x12 tensor; within each length-12 row, the image's color channels are laid out one after another, as sketched below.
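
A minimal sketch of that case (the batch here is simply four copies of the img tensor above, stacked for illustration):

batch = torch.stack((img, img, img, img))   # shape: [4, 3, 2, 2]

> batch.flatten(start_dim=1).shape
torch.Size([4, 12])

> batch.flatten(start_dim=1)[0]             # one image: its channels laid out back to back
tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])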

Tensors for Deep Learning - Broadcasting and Element-wise Operations with PyTorch

Tensor operations fall into the following four categories:

  1. Reshaping operations
  2. Element-wise operations
  3. Reduction operations
  4. Access operations

This section covers element-wise operations.

What does element-wise mean?

An element-wise operation operates on corresponding elements between tensors.

img

Arithmetic operations are element-wise operations
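
The snippets in this subsection operate on a 2x2 float tensor; its values are an assumption reconstructed from the outputs shown (the first block calls it t, the second t1):

t1 = torch.tensor([
    [1., 2.],
    [3., 4.]
], dtype=torch.float32)
t = t1   # both names refer to the same example tensor below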

(1) Using these symbolic operations:

> print(t + 2)
tensor([[3., 4.],
        [5., 6.]])

> print(t - 2)
tensor([[-1.,  0.],
        [ 1.,  2.]])

> print(t * 2)
tensor([[2., 4.],
        [6., 8.]])

> print(t / 2)
tensor([[0.5000, 1.0000],
        [1.5000, 2.0000]])

or equivalently, (2) these built-in tensor object methods:

> print(t1.add(2))
tensor([[3., 4.],
        [5., 6.]])

> print(t1.sub(2))
tensor([[-1.,  0.],
        [ 1.,  2.]])

> print(t1.mul(2))
tensor([[2., 4.],
        [6., 8.]])

> print(t1.div(2))
tensor([[0.5000, 1.0000],
        [1.5000, 2.0000]])

Broadcasting tensors

Broadcasting is the concept whose implementation allows us to add scalars to higher dimensional tensors.

We can use numpy's broadcast_to() to see what broadcasting does:

> np.broadcast_to(2, t1.shape)
array([[2, 2],
        [2, 2]])

We can see that the scalar 2 is broadcast to the same shape as t1. Below, t2 is likewise broadcast to the same shape as t1.

img
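
A small sketch of broadcasting between tensors of different rank (t1 and t2 are redefined here for illustration; the values are assumptions):

> t1 = torch.tensor([
    [1., 1.],
    [1., 1.]
])
> t2 = torch.tensor([2., 4.])
> t1 + t2       # t2 is broadcast across the rows of t1
tensor([[3., 5.],
        [3., 5.]])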

Comparison operations are element-wise

Comparison operations are element-wise too.

Comparison operations are also element-wise. For a given comparison operations between tensors, a new tensor of the same shape is returned with each element containing either a 0 or a 1.

  • 0 if the comparison between corresponding elements is False.
  • 1 if the comparison between corresponding elements is True.

Suppose we have the following tensor:

img

Under the hood, these comparison operations also rely on broadcasting:

img
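
A sketch of such comparisons (the tensor values below are inferred from the abs()/sqrt() outputs that follow; note that PyTorch 1.x returns 0/1 uint8 tensors, while newer versions return bool tensors):

t = torch.tensor([
    [0., 5., 0.],
    [6., 0., 7.],
    [0., 8., 0.]
], dtype=torch.float32)

> t.eq(0)       # element-wise "equal to zero"
tensor([[1, 0, 1],
        [0, 1, 0],
        [1, 0, 1]], dtype=torch.uint8)

> t.ge(0)       # element-wise "greater than or equal to zero"
tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]], dtype=torch.uint8)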

Element-wise operations using functions

Some functions are also element-wise:

> t.abs() 
tensor([[0., 5., 0.],
        [6., 0., 7.],
        [0., 8., 0.]])


> t.sqrt()
tensor([[0.0000, 2.2361, 0.0000],
        [2.4495, 0.0000, 2.6458],
        [0.0000, 2.8284, 0.0000]])

> t.neg()
tensor([[-0., -5., -0.],
        [-6., -0., -7.],
        [-0., -8., -0.]])

> t.neg().abs()
tensor([[0., 5., 0.],
        [6., 0., 7.],
        [0., 8., 0.]])

Some terminology

The following three terms mean the same thing:

  • Element-wise
  • Component-wise
  • Point-wise

Code for Deep Learning - ArgMax and Reduction Tensor Ops

  • Reshaping operations
  • Element-wise operations
  • Reduction operations
  • Access operations

This time the focus is mainly on the argmax() function.

Tensor reduction operations

First, the definition of a reduction operation: A reduction operation on a tensor is an operation that reduces the number of elements contained within the tensor.

Common tensor reduction operations

The functions below all reduce a tensor to a single scalar.

img

Reducing tensors by axes

More commonly, though, reductions are performed along a specific axis:

img

I personally find the explanation below a bit hard to follow. A better way to look at it: the axis we sum over disappears, and the result keeps the length of the remaining axis. With dim=0 (an axis of length 3) the rows are added column by column, leaving 4 values; with dim=1 (an axis of length 4) each row is summed, leaving 3 values, one per group.

To break it down: with dim=0 we add together all the elements along the first axis, which amounts to an element-wise addition of the rows:

When we sum across the first axis, we are taking the summation of all the elements of the first axis

img

With dim=1:

The second axis in this tensor contains numbers that come in groups of four. Since we have three groups of four numbers, we get three sums.

img
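
A sketch with the 3x4 tensor from the reshape section (t = [[1,1,1,1],[2,2,2,2],[3,3,3,3]], float32):

> t.sum(dim=0)    # the three rows are added element-wise
tensor([6., 6., 6., 6.])

> t.sum(dim=1)    # each row (group of four) collapses to a single sum
tensor([ 4.,  8., 12.])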

Argmax tensor reduction operation

Argmax returns the index location of the maximum value inside a tensor.

img

The max() function below returns two results: the maximum values and the indices where they occur. argmax() returns only the indices:

img

Below is a detailed look at where exactly the maximum values returned by max() come from:

img
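
A sketch of max() versus argmax() (the tensor values are an assumption chosen so the maxima sit at distinct positions):

t = torch.tensor([
    [1., 0., 0., 2.],
    [0., 3., 3., 0.],
    [4., 0., 0., 5.]
], dtype=torch.float32)

> t.max()                 # the single largest value
tensor(5.)
> t.argmax()              # its index into the flattened tensor
tensor(11)

> values, indices = t.max(dim=0)   # per-column maxima and the row index of each
> values
tensor([4., 3., 3., 5.])
> indices
tensor([2, 1, 1, 2])
> t.argmax(dim=0)
tensor([2, 1, 1, 2])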

Accessing elements inside tensors

  • Reshaping operations
  • Element-wise operations
  • Reduction operations
  • Access operations

The last category of operations is accessing the values inside a tensor. For a scalar tensor, use the item() method.

img

For multiple values, use the tolist() or numpy() methods.

img
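
A sketch of these accessors (assuming t is again the [[1,1,1,1],[2,2,2,2],[3,3,3,3]] float tensor used earlier):

> t.mean()
tensor(2.)
> t.mean().item()            # a plain Python number from a scalar tensor
2.0

> t.mean(dim=0).tolist()     # a Python list from a multi-value tensor
[2.0, 2.0, 2.0, 2.0]
> t.mean(dim=0).numpy()      # a numpy array
array([2., 2., 2., 2.], dtype=float32)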
