一. PyTorch Basics
(一) Introduction
PyTorch is an open-source machine learning library for Python, similar to TensorFlow and Keras, developed by Facebook's AI research team. It is used in applications such as natural language processing, and with CUDA it can use a GPU to accelerate computation.
1. Tensor basics
A tensor can be viewed simply as a container for multi-dimensional data: a 0-dimensional tensor is a scalar, a 1-dimensional tensor is a vector, a 2-dimensional tensor is a matrix, and anything with more than two dimensions is generally just called a tensor.
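For intuition, here is a small sketch (my own example, not part of the original post) showing one tensor of each kind and its number of dimensions:
import torch

s = torch.tensor(3.14)                   # 0-dim tensor: a scalar
v = torch.tensor([1.0, 2.0, 3.0])        # 1-dim tensor: a vector
m = torch.tensor([[1., 2.], [3., 4.]])   # 2-dim tensor: a matrix
print(s.dim(), v.dim(), m.dim())         # 0 1 2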
1.1 create a torch tensor
import torch
# create a torch tensor
t = torch.tensor([[1,2,3],[4,5,6]])
t
output:
tensor([[1, 2, 3],
        [4, 5, 6]])
1.2 two ways to transpose
Method 1: t.t()
t.t()
output:
tensor([[1, 4],
        [2, 5],
        [3, 6]])
Method 2: t.permute(-1,0)
t.permute(-1,0)
output:
tensor([[1, 4],
        [2, 5],
        [3, 6]])
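Note that t.t() only works on 2-D tensors, while permute generalizes to any number of dimensions. A small sketch (my own example) with a 3-D tensor:
x = torch.randn(2, 3, 4)       # a 3-D tensor
y = x.permute(2, 0, 1)         # reorder the dimensions: (2, 3, 4) -> (4, 2, 3)
print(y.shape)                 # torch.Size([4, 2, 3])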
1.3 reshape a tensor with view()
t.view(3,2)
output:
tensor([[1, 2],
        [3, 4],
        [5, 6]])
try another one:
a = t.view(6,1)
a
output:
tensor([[1],
        [2],
        [3],
        [4],
        [5],
        [6]])
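view() can also infer one dimension for you if you pass -1 for it; a small sketch (my own example) using the same 2x3 tensor t:
print(t.view(-1).shape)        # torch.Size([6])    flatten to 1-D
print(t.view(2, -1).shape)     # torch.Size([2, 3]) second dimension inferred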
1.4 create tensor of zeros
t = torch.zeros(3,3)
t
output:
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
1.5 create tensor from normal distribution randoms
t = torch.randn(3,3)
t
output:
tensor([[ 0.2511, -0.7670, -1.2358],
        [-0.9764, -0.1060,  0.4308],
        [-2.2955,  0.3311, -1.0970]])
1.6 some tensor information
print('tensor shape:',t.shape)
print('number of dimension:',t.dim())
print('tensor type:',t.type())
output:
tensor shape: torch.Size([3, 3])
number of dimension: 2
tensor type: torch.FloatTensor
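Related to type(): a tensor can be converted to another dtype when needed. A small sketch (my own example):
t = torch.randn(2, 2)              # default dtype is torch.float32
print(t.dtype)                     # torch.float32
print(t.long().dtype)              # torch.int64
print(t.to(torch.float64).dtype)   # torch.float64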
1.7 slicing like numpy
t = torch.tensor([[1,2,3],[4,5,6],[7,8,9]])
# every row, only the last column
print(t[:,-1])
# first 2 rows, all columns
print(t[:2,:])
# lower right most corner
print(t[-1:,-1:])
output:
tensor([3, 6, 9])
tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([[9]])
1.8 pytorch tensor to and from numpy ndarray
1) ndarray to tensor
import numpy as np
# ndarray to tensor
a = np.random.randn(2,3)
t = torch.from_numpy(a)
print(a)
print(t)
print(type(a))
print(type(t))
output:
[[ 0.65463612 -1.85520278  0.28951441]
 [-1.11854953  0.92410894  1.71107649]]
tensor([[ 0.6546, -1.8552,  0.2895],
        [-1.1185,  0.9241,  1.7111]], dtype=torch.float64)
<class 'numpy.ndarray'>
<class 'torch.Tensor'>
2) tensor to ndarray
# tensor to ndarray
t = torch.randn(2,3)
a = t.numpy()
print(t)
print(a)
print(type(t))
print(type(a))
output:
tensor([[ 0.1747, -0.2457,  2.4347],
        [ 1.5476,  0.5925, -2.5421]])
[[ 0.17465861 -0.24565548  2.434704  ]
 [ 1.5475734   0.59250295 -2.5421169 ]]
<class 'torch.Tensor'>
<class 'numpy.ndarray'>
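One detail worth knowing (not shown above): on the CPU, torch.from_numpy() and .numpy() share the same underlying memory, so changing one side is visible on the other. A small sketch:
a = np.zeros(3)
t = torch.from_numpy(a)
a[0] = 5.0          # modify the ndarray in place
print(t)            # tensor([5., 0., 0.], dtype=torch.float64) -- the tensor sees the change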
1.9 basic tensor operations
1) Cross product (vector/outer product), with magnitude |a × b| = |a||b|·sin θ
# compute cross product
t1 = torch.tensor([[1,2,3],[1,2,3]])
t2 = torch.tensor([[1,2,3],[4,5,6]])
t1.cross(t2)
output:
tensor([[ 0,  0,  0],
        [-3,  6, -3]])
2) Matrix product
# compute matrix product
t1 = torch.tensor([[2,4],[5,10]])
t2 = torch.tensor([[10],[20]])
t1.mm(t2)
output:
tensor([[100],
        [250]])
3) Element-wise multiplication
# elementwise multiplication
t = torch.tensor([[1,2],[3,4]])
t.mul(7)
output1:
tensor([[ 7, 14],
        [21, 28]])
t.mul(t)
output2:
tensor([[ 1, 4],
        [ 9, 16]])
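The same operations also have operator forms, which are often more readable; a quick sketch (my own example):
t1 = torch.tensor([[2, 4], [5, 10]])
t2 = torch.tensor([[10], [20]])
print(t1 @ t2)      # matrix product, same as t1.mm(t2)
print(t1 * t1)      # element-wise multiplication, same as t1.mul(t1)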
1.10 GPU support
1) is a CUDA GPU available: torch.cuda.is_available()
2) how many CUDA devices: torch.cuda.device_count()
3) move a tensor to the GPU: t.cuda()
torch.cuda.is_available()
output:
False
Explanation: my machine has no NVIDIA GPU, so CUDA computation cannot be used here. When the workload is heavy and running time becomes too long, a graphics card is worth considering to speed things up.
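A common device-agnostic pattern (a small sketch of typical usage; it runs whether or not a GPU is present):
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
t = torch.randn(3, 3).to(device)   # moved to the GPU only if one is available
print(t.device)                    # cpu on my machine, cuda:0 with an NVIDIA GPU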
二. Back Propagation
Backpropagation is a convenient algorithm for computing gradients; its basic idea is the chain rule. In the previous post the model was very simple: a linear prediction ŷ = x·w with loss (ŷ − y)².
Differentiating by hand: ∂loss/∂w = 2x·(x·w − y).
Each update: w := w − α·∂loss/∂w, with learning rate α.
But when the model gets complicated, computing gradients by hand is clearly tedious, and we want the machine to do the work instead.
This is where PyTorch's automatic differentiation comes in: it computes and stores gradients for us, and a single call to the backward method gives us the gradient, ready to use.
With this we can improve the simple study-time-vs-GPA example from the previous post.
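As a quick sanity check (a minimal sketch of my own, assuming a single sample x = 2, y = 4 and w = 1), the gradient that backward() produces matches the manual chain-rule result ∂loss/∂w = 2x·(x·w − y):
import torch

x, y = 2.0, 4.0
w = torch.tensor([1.0], requires_grad=True)
loss = (x*w - y)**2          # loss = (x·w − y)²
loss.backward()              # autograd applies the chain rule
print(w.grad.item())         # -8.0
print(2*x*(x*1.0 - y))       # -8.0, the manual formula 2x·(x·w − y) with w = 1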
(一) Using back propagation to compute the gradient
1. Complete program
import torch
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
w = torch.tensor([1.0]) # initial w = 1
w.requires_grad = True # requires_grad = False by default
# our model forward pass
def forward(x):
    return x*w
# loss function
def loss(x,y):
    y_pred = forward(x)
    return (y_pred-y)*(y_pred-y)
# before training
print("predict (before training)", "x = 4 ", "y =", forward(4).item())
# training loop
for epoch in range(10):
    for x_val,y_val in zip(x_data,y_data):
        l = loss(x_val,y_val)
        l.backward()
        print("\tgrad:",x_val,y_val,w.grad.item())
        w.data = w.data - 0.01*w.grad.item()
        # Manually zero the gradients after updating weights
        w.grad.data.zero_()
    print("progress: epoch:",epoch, "loss =", l.item())
#after training
print("predict(after training)", "x = 4 ", "y =", forward(4).item())
Output:
predict (before training) x = 4 y = 4.0
grad: 1.0 2.0 -2.0
grad: 2.0 4.0 -7.840000152587891
grad: 3.0 6.0 -16.228801727294922
progress: epoch: 0 loss = 7.315943717956543
grad: 1.0 2.0 -1.478623867034912
grad: 2.0 4.0 -5.796205520629883
grad: 3.0 6.0 -11.998146057128906
progress: epoch: 1 loss = 3.9987640380859375
:
:
:
grad: 1.0 2.0 -0.1319713592529297
grad: 2.0 4.0 -0.5173273086547852
grad: 3.0 6.0 -1.070866584777832
progress: epoch: 9 loss = 0.03185431286692619
predict(after training) x = 4 y = 7.804864406585693
2. Step-by-step walkthrough
1) Input data and parameter initialization.
Initially w = 1. By default requires_grad = False, but this model needs the gradient of w, so we set requires_grad = True.
The commented-out lines create w with Variable, which is how older versions of PyTorch did it; in newer versions there is no need for Variable, we simply create a tensor and set requires_grad = True.
At this point we can also print w, w.data, w.data[0] and w.item() and compare them, which makes the later code easier to follow.
import torch
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
w = torch.tensor([1.0]) # initial w = 1
w.requires_grad = True # requires_grad = False by default
# from torch.autograd import Variable
# w = Variable(torch.Tensor([1.0]),requires_grad = True)
# print(w) # tensor([1.], requires_grad=True)
# print(w.data)    # tensor([1.])  the data part, without grad tracking
# print(w.data[0]) # tensor(1.)    a 0-dim tensor holding 1.0
# print(w.item())  # 1.0           a Python float
2) Model definition
A simple forward model:
# our model forward pass
def forward(x):
    return x*w
3) Loss function
A simple MSE (mean squared error) loss function:
# loss function
def loss(x,y):
    y_pred = forward(x)
    return (y_pred-y)*(y_pred-y)
4) Training loop
Clearly, with the initial w = 1, x = 4 predicts y = 4.
In each step we first compute the loss, then call backward() to obtain the gradient, and then update the weight. Once the update is done, the gradient is reset to zero. Repeating this for 10 epochs gives a reasonably good w.
After training, testing with x = 4 predicts y = 7.8048...
The result is acceptable.
# before training
print("predict (before training)", "x = 4 ", "y =", forward(4).item())
# training loop
for epoch in range(10):
    for x_val,y_val in zip(x_data,y_data):
        l = loss(x_val,y_val)
        l.backward()
        print("\tgrad:",x_val,y_val,w.grad.item())
        w.data = w.data - 0.01*w.grad.item()
        # Manually zero the gradients after updating weights
        w.grad.data.zero_()
    print("progress: epoch:",epoch, "loss =", l.item())
#after training
print("predict(after training)", "x = 4 ", "y =", forward(4).item())
(二) Building the model the standard PyTorch way
The example above only used PyTorch's backward method to compute the gradient; everything else, such as the model and the loss function, was written by hand. PyTorch provides a set of modules and methods that help us build models quickly and in a uniform way.
The process breaks down into three steps:
Step 1: design the model as a class
Step 2: construct the loss function and optimizer (selected from the PyTorch API)
Step 3: the training loop (forward, backward, update)
The complete code is as follows:
import torch
x_data = torch.Tensor([[1.0],[2.0],[3.0]])
y_data = torch.Tensor([[2.0],[4.0],[6.0]])
#################################################
# 01 design your model with class
class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(1, 1) # One in and one out
    def forward(self, x):
        y_pred = self.linear(x)
        return y_pred
# our model
model = Model()
###################################################
# 02 construct loss and optimizer (select from Pytorch API)
criterion = torch.nn.MSELoss(reduction='mean') # MSE loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01) # stochastic gradient descent
###################################################
# 03 training loop (forward, backward, update)
# Training loop
for epoch in range(500):
    # Forward pass: compute predicted y by passing x to the model
    y_pred = model(x_data)
    # Compute and print loss
    loss = criterion(y_pred, y_data)
    print(epoch, loss.item())
    # Zero gradients, perform a backward pass, and update the weights
    optimizer.zero_grad() # reset the gradients to zero
    loss.backward() # compute gradient w.grad
    optimizer.step() # update w using w.grad, e.g. w = w - 0.01*w.grad
# After training
hour_var = torch.Tensor([[4.0]])
print("predict(after training) ", 4.0, model.forward(hour_var).item())