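This is my first time working with PyTorch. This post only records my learning process; if it infringes on any rights, I will remove it.
I have finished P10 of the video series on Bilibili: "10. Convolutional Neural Networks (Basics)".
1. Each convolution kernel has n channels, and n must equal the number of input channels;
2. The number of such kernels, m, equals the number of output channels;
3. The kernel size, kernel_size (width) × kernel_size (height), is up to you (it does not depend on the image size).
An example in PyTorch: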
import torch

in_channels, out_channels = 5, 10
width, height = 100, 100
kernel_size = 3
batch_size = 1

# Conv2d expects input of shape (batch, channels, height, width)
input = torch.randn(batch_size,
                    in_channels,
                    height,
                    width)
conv_layer = torch.nn.Conv2d(in_channels,
                             out_channels,
                             kernel_size=kernel_size)
output = conv_layer(input)

print(input.shape)              # torch.Size([1, 5, 100, 100])
print(output.shape)             # torch.Size([1, 10, 98, 98])
print(conv_layer.weight.shape)  # torch.Size([10, 5, 3, 3])
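Here the input has shape 1×5×100×100, the kernel weights have shape 10×5×3×3 (output channels × input channels × 3 × 3), and the output has shape 1×10×98×98.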
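The 98 in the output shape follows from the usual output-size formula for a convolution, floor((n + 2×padding − kernel_size) / stride) + 1. Below is a minimal sketch of that arithmetic, checked against Conv2d itself. The helper name conv_out_size is made up for illustration, and dilation is assumed to be 1.

```python
import torch


def conv_out_size(n, kernel_size, padding=0, stride=1):
    # Hypothetical helper: output length along one spatial dimension,
    # assuming dilation 1.
    return (n + 2 * padding - kernel_size) // stride + 1


expected = conv_out_size(100, kernel_size=3)  # (100 - 3) // 1 + 1

conv = torch.nn.Conv2d(5, 10, kernel_size=3)
out = conv(torch.randn(1, 5, 100, 100))

print(expected)          # 98
print(tuple(out.shape))  # (1, 10, 98, 98)
```

The same helper also covers the padding and stride settings discussed later in this post, since both appear in the formula.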
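Some common parameters and operations in convolutional layers:
1. padding
When we need the output to have the same size as the input, we can pad the input using padding.
Example: padding=1
For this input (with stride 1), convolving with a 3×3 kernel only needs padding=1 (integer division: 3/2=1); convolving with a 5×5 kernel needs padding=2 (5/2=2).
Example: padding=2
Let's try out the padding=1 example: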
import torch

input = [3, 4, 6, 5, 7,
         2, 4, 6, 8, 2,
         1, 6, 7, 8, 4,
         9, 7, 4, 6, 2,
         3, 7, 5, 4, 1]
input = torch.Tensor(input).view(1, 1, 5, 5)
# The view arguments are (batch_size, C, H, W)

conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
# 1, 1 are the number of input channels and output channels

kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
# The weight shape is (output channels, input channels, kernel H, kernel W)
conv_layer.weight.data = kernel.data

output = conv_layer(input)
print(output)  # shape (1, 1, 5, 5), same spatial size as the input
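The padding rule above (padding = kernel_size / 2 with integer division) can be checked for a few odd kernel sizes. This is only a sketch: the 28×28 input size is an arbitrary choice for illustration, and stride is left at its default of 1.

```python
import torch

x = torch.randn(1, 1, 28, 28)  # arbitrary input size, chosen for illustration

shapes = {}
for k in (3, 5, 7):
    # padding = k // 2 keeps the spatial size for odd k and stride 1
    conv = torch.nn.Conv2d(1, 1, kernel_size=k, padding=k // 2)
    shapes[k] = tuple(conv(x).shape)
    print(k, k // 2, shapes[k])
```

Each line printed should show the same 28×28 spatial size as the input.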
2. stride
Example with stride=2:
import torch

input = [3, 4, 6, 5, 7,
         2, 4, 6, 8, 2,
         1, 6, 7, 8, 4,
         9, 7, 4, 6, 2,
         3, 7, 5, 4, 1]
input = torch.Tensor(input).view(1, 1, 5, 5)

conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2, bias=False)
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data

output = conv_layer(input)
print(output)  # shape (1, 1, 2, 2), values [[211., 262.], [251., 169.]]
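One way to see what stride=2 does: it gives the same values as the stride-1 convolution with every second row and column kept. A small sketch of that comparison, using the functional form F.conv2d instead of a Conv2d layer (my choice, to avoid overwriting layer weights):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([3, 4, 6, 5, 7,
                  2, 4, 6, 8, 2,
                  1, 6, 7, 8, 4,
                  9, 7, 4, 6, 2,
                  3, 7, 5, 4, 1], dtype=torch.float32).view(1, 1, 5, 5)
w = torch.arange(1, 10, dtype=torch.float32).view(1, 1, 3, 3)

dense = F.conv2d(x, w)              # stride 1, output is 3x3
strided = F.conv2d(x, w, stride=2)  # stride 2, output is 2x2

print(strided)
# Keeping every second row/column of the stride-1 result gives the same values
print(torch.allclose(strided, dense[:, :, ::2, ::2]))  # True
```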
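3. Downsampling
A commonly used form of downsampling is the Max Pooling Layer.
A Max Pooling Layer has no weights.
For example, a 2×2 MaxPooling uses stride=2 by default (the stride defaults to the kernel size).
A 4×4 input is split into 2×2 groups, the maximum of each group is taken, and the maxima are assembled into a new 2×2 output.
Implementation: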
import torch

input = [3, 4, 6, 5,
         2, 4, 6, 8,
         1, 6, 7, 8,
         9, 7, 4, 6]
input = torch.Tensor(input).view(1, 1, 4, 4)

maxpooling_layer = torch.nn.MaxPool2d(kernel_size=2)
output = maxpooling_layer(input)
print(output)
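The grouping described above can also be written by hand, which makes it easier to see what MaxPool2d computes. This is a sketch for this specific 4×4 input, not how PyTorch implements pooling internally; it also checks that the layer has no parameters.

```python
import torch

x = torch.tensor([3, 4, 6, 5,
                  2, 4, 6, 8,
                  1, 6, 7, 8,
                  9, 7, 4, 6], dtype=torch.float32).view(1, 1, 4, 4)

pool = torch.nn.MaxPool2d(kernel_size=2)
pooled = pool(x)

# Manual version: split each spatial axis into (block index, offset in block),
# then take the maximum over both offsets.
manual = x.view(1, 1, 2, 2, 2, 2).amax(dim=(3, 5))

print(pooled)                        # values [[4., 8.], [9., 8.]]
print(torch.equal(pooled, manual))   # True
print(len(list(pool.parameters())))  # 0, the layer has no weights
```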
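A simple convolutional neural network for MNIST:
When building a neural network, we must make sure the data dimensions of each layer line up. Convolutional and pooling layers do not care about the spatial size of their input, but the final classifier (the fully connected layer) does. So we need to work out how many elements each sample has by the time it reaches the classifier, after passing through these convolutional layers.
Because we use the cross-entropy loss, which applies softmax internally, the last layer is not followed by an activation.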
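To illustrate why the last layer outputs raw scores: PyTorch's cross-entropy is equivalent to log-softmax followed by the negative log-likelihood loss. A small sketch with made-up logits and labels (the values carry no meaning):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)          # made-up raw scores: 4 samples, 10 classes
target = torch.tensor([1, 0, 9, 3])  # made-up class labels

loss_ce = F.cross_entropy(logits, target)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(loss_ce.item(), loss_manual.item())
print(torch.allclose(loss_ce, loss_manual))  # True
```

Applying softmax or ReLU to the network output before this loss would therefore change what the loss sees.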
Code:
import torch
import torch.nn.functional as F


class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
        self.pooling = torch.nn.MaxPool2d(2)
        self.fc = torch.nn.Linear(320, 10)

    def forward(self, x):
        # x: (batch, 1, 28, 28)
        batch_size = x.size(0)
        x = F.relu(self.pooling(self.conv1(x)))
        x = F.relu(self.pooling(self.conv2(x)))
        x = x.view(batch_size, -1)  # flatten to (batch, 320)
        x = self.fc(x)              # raw scores, no activation
        return x


model = Net()
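The 320 in the Linear layer can be found by pushing a dummy input through the convolution and pooling layers and reading off the shapes. A sketch, assuming the input is a 1×28×28 MNIST image; the layers are recreated here as standalone objects so the snippet runs on its own:

```python
import torch
import torch.nn.functional as F

conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
pooling = torch.nn.MaxPool2d(2)

x = torch.randn(1, 1, 28, 28)  # one dummy MNIST-sized image

x = F.relu(pooling(conv1(x)))
shape1 = tuple(x.shape)        # 28 -> 24 (conv) -> 12 (pool)

x = F.relu(pooling(conv2(x)))
shape2 = tuple(x.shape)        # 12 -> 8 (conv) -> 4 (pool)

n_features = x.view(1, -1).size(1)  # 20 * 4 * 4

print(shape1)      # (1, 10, 12, 12)
print(shape2)      # (1, 20, 4, 4)
print(n_features)  # 320
```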
How do we run the computation on a GPU?
1. Move Model to GPU
(1) Define device as the first visible CUDA device if CUDA is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
(2) Convert the parameters and buffers of all modules to CUDA tensors.
model.to(device)
Putting it together:
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
        self.pooling = torch.nn.MaxPool2d(2)
        self.fc = torch.nn.Linear(320, 10)

    def forward(self, x):
        batch_size = x.size(0)
        x = F.relu(self.pooling(self.conv1(x)))
        x = F.relu(self.pooling(self.conv2(x)))
        x = x.view(batch_size, -1)
        x = self.fc(x)
        return x


model = Net()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)
2. Move Tensors to GPU
Send the inputs and targets to the GPU at every step.
(1) In train():
inputs, target = inputs.to(device), target.to(device)
(2) In test():
inputs, target = inputs.to(device), target.to(device)
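One detail that is easy to miss: model.to(device) moves the module's parameters in place, while tensor.to(device) returns a new tensor, which is why the result is assigned back in the lines above. A small sketch, using a tiny Linear layer as a stand-in for the model; it falls back to the CPU when CUDA is not available:

```python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

layer = torch.nn.Linear(4, 2)  # stand-in for the model
layer.to(device)               # moves the parameters in place

inputs = torch.randn(3, 4)
inputs = inputs.to(device)     # returns a tensor on device, so reassign

outputs = layer(inputs)
print(outputs.device)
print(next(layer.parameters()).device)
```

If the inputs and the model end up on different devices, the forward pass raises a RuntimeError.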
Finally, here is the MNIST implementation built with what this lesson covered:
import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision import datasets
import torch.nn.functional as F
import torch.optim as optim

batch_size = 64
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

train_dataset = datasets.MNIST(root='../dataset/mnist/',
                               train=True,
                               download=True,
                               transform=transform)
train_loader = DataLoader(train_dataset,
                          shuffle=True,
                          batch_size=batch_size)
test_dataset = datasets.MNIST(root='../dataset/mnist/',
                              train=False,
                              download=True,
                              transform=transform)
# Evaluate on the test split (the original passed train_dataset here)
test_loader = DataLoader(test_dataset,
                         shuffle=False,
                         batch_size=batch_size)


class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
        self.pooling = torch.nn.MaxPool2d(2)
        self.fc = torch.nn.Linear(320, 10)

    def forward(self, x):
        batch_size = x.size(0)
        x = F.relu(self.pooling(self.conv1(x)))
        x = F.relu(self.pooling(self.conv2(x)))
        x = x.view(batch_size, -1)
        x = self.fc(x)
        return x


model = Net()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)


def train(epoch):
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data
        inputs, target = inputs.to(device), target.to(device)
        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()

        # Report the average loss every 300 batches
        running_loss += loss.item()
        if batch_idx % 300 == 299:
            print('[%d, %5d] loss:%.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0


def test():
    correct = 0
    total = 0
    with torch.no_grad():
        for data in test_loader:
            images, labels = data
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            # Index of the largest score along dim=1 is the predicted class
            _, predicted = torch.max(outputs.data, dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print('Accuracy on test set:%d %%' % (100 * correct / total))


if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        test()
Homework: