PyTorch: LeNet and AlexNet Implementations
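1 The LeNet model
1.1 Basic structure
LeNet consists of two parts: a convolutional block and a fully connected block.
- The basic unit of the convolutional block is a convolutional layer followed by a max-pooling layer. The convolutional layer picks up spatial patterns in the image, and the max-pooling layer that follows makes the result less sensitive to position. The whole convolutional part is built by stacking this unit repeatedly, and its output has shape (batch, channels, height, width).
- When the output of the convolutional block is passed to the fully connected block, each sample in the batch is flattened. The input to the fully connected block is therefore a 2-D matrix: the first dimension is batch_size, and the second is the flattened vector of one sample, whose length is channels × height × width.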
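To make the flattening step concrete, here is a minimal sketch that traces the spatial size through a LeNet-style stack with the usual output-size formula floor((n + 2p - k) / s) + 1. The helper names `out_size` and `lenet_flat_length` and the 28×28 input (the Fashion-MNIST image size) are assumptions for illustration; the 5×5 kernels, 2×2 pooling with stride 2, and 16 output channels match the code in section 1.2.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output length along one spatial axis of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


def lenet_flat_length(side=28, channels=16):
    """Trace a square input through conv5 -> pool2 -> conv5 -> pool2."""
    side = out_size(side, kernel=5)            # 28 -> 24
    side = out_size(side, kernel=2, stride=2)  # 24 -> 12
    side = out_size(side, kernel=5)            # 12 -> 8
    side = out_size(side, kernel=2, stride=2)  # 8 -> 4
    return channels * side * side              # length of one flattened sample


print(lenet_flat_length())  # 256, i.e. 16 * 4 * 4
```

Under these assumptions each sample becomes a vector of length 256, which is why the first fully connected layer in section 1.2 takes 16*4*4 inputs.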
1.2 Code
#encoding=utf-8
import time
import torch
import torch.nn as nn
import torchvision.datasets
import torchvision.transforms as transforms
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
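class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        # Convolutional block: two (conv -> sigmoid -> max-pool) units
        self.conv = nn.Sequential(
            nn.Conv2d(1, 6, 5),    # in_channels, out_channels, kernel_size
            nn.Sigmoid(),
            nn.MaxPool2d(2, 2),    # kernel_size, stride
            nn.Conv2d(6, 16, 5),
            nn.Sigmoid(),
            nn.MaxPool2d(2, 2)
        )
        # Fully connected block; 16*4*4 is the flattened size for 28x28 inputs
        self.fc = nn.Sequential(
            nn.Linear(16*4*4, 120),
            nn.Sigmoid(),
            nn.Linear(120, 84),
            nn.Sigmoid(),  # without this, the last two linear layers collapse into one
            nn.Linear(84, 10)
        )

    def forward(self, img):
        feature = self.conv(img)
        # Flatten each sample to a vector before the fully connected block
        output = self.fc(feature.view(img.shape[0], -1))
        return output
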
# Estimate the classification accuracy of net on a dataset
def evaluate_accuracy(data_iter, net, device=None):
    if device is None and isinstance(net, torch.nn.Module):
        # Default to the device the model's parameters live on
        device = list(net.parameters())[0].device
    acc_sum, n = 0.0, 0
    with torch.no_grad():  # no gradients are needed during evaluation
        for X, y in data_iter:
            if isinstance(net, torch.nn.Module):  # net is an nn.Module
                net.eval()   # evaluation mode (disables dropout)
                acc_sum += (net(X.to(device)).argmax(dim=1) == y.to(device)).float().sum().cpu().item()
                net.train()  # switch back to training mode
            else:
                # net is a plain function rather than an nn.Module
                if 'is_training' in net.__code__.co_varnames:
                    # It accepts is_training, so pass is_training=False
                    acc_sum += (net(X, is_training=False).argmax(dim=1) == y).float().sum().item()
                else:
                    acc_sum += (net(X).argmax(dim=1) == y).float().sum().item()
            n += y.shape[0]
    return acc_sum / n

# Training function
def train(net, train_iter, test_iter, batch_size, optimizer, device, num_epochs):
    net = net.to(device)
    print("training on ", device)
    loss = torch.nn.CrossEntropyLoss()
    for epoch in range(num_epochs):
        train_loss_sum, train_acc_sum, n, batch_count, start = 0.0, 0.0, 0, 0, time.time()
        for X, y in train_iter:
            X = X.to(device)
            y = y.to(device)
            y_hat = net(X)
            loss_value = loss(y_hat, y)
            optimizer.zero_grad()
            loss_value.backward()
            optimizer.step()
            # .item() stores a Python float, so the autograd graph is not kept alive
            train_loss_sum += loss_value.cpu().item()
            train_acc_sum += (y_hat.argmax(dim=1) == y).sum().cpu().item()
            n += y.shape[0]
            batch_count += 1
        # Evaluate on the test set once per epoch
        test_acc = evaluate_accuracy(test_iter, net)
        # Report the mean loss over the epoch, not the loss of the last batch
        print("epoch %d,loss %.4f, train acc %.3f, test acc %.3f, time %.1f sec"
              % (epoch + 1, train_loss_sum / batch_count, train_acc_sum / n, test_acc, time.time() - start))

# Load the dataset
batch_size = 256
mnist_train = torchvision.datasets.FashionMNIST(root='~/Datasets/FashionMNIST', train=True, download=True, transform=transforms.ToTensor())
mnist_test = torchvision.datasets.FashionMNIST(root='~/Datasets/FashionMNIST', train=False, download=True, transform=transforms.ToTensor())
train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=4)
test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=4)

# Hyperparameters
lr = 0.001
num_epoches = 5
net = LeNet()
optimizer = torch.optim.Adam(net.parameters(), lr=lr)

if __name__ == '__main__':
    train(net, train_iter, test_iter, batch_size, optimizer, device, num_epoches)
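The accuracy reported by evaluate_accuracy is simply the fraction of samples whose highest-scoring class equals the label. The following dependency-free sketch shows the same computation on plain Python lists; the helper names `argmax` and `accuracy` and the toy scores are made up for illustration and are not part of the training script.

```python
def argmax(scores):
    """Index of the largest score (the predicted class)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy(batch_scores, labels):
    """Fraction of samples whose predicted class matches the label."""
    correct = sum(argmax(s) == y for s, y in zip(batch_scores, labels))
    return correct / len(labels)


# Toy batch: three samples, three classes (values are illustrative only)
scores = [[0.1, 2.0, -1.0], [1.5, 0.2, 0.3], [0.0, 0.1, 0.9]]
labels = [1, 0, 1]
print(accuracy(scores, labels))  # two of the three predictions are correct
```

In the PyTorch version, `argmax(dim=1)` does this for the whole batch at once, and the per-batch counts are summed before dividing by the number of samples.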
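2 AlexNet
2.1 Basic structure
AlexNet follows a design very similar to LeNet's, but the two differ in several structural respects.
- AlexNet is considerably larger than LeNet. It has eight layers of transformations: five convolutional layers, two fully connected hidden layers, and one fully connected output layer.
- In AlexNet, the first convolutional layer uses 11×11 kernels, the second uses 5×5 kernels, and all later ones use 3×3 kernels. In addition, the first, second, and fifth convolutional layers are each followed by a 3×3 max-pooling layer with stride 2.
- The last convolutional layer is followed by two fully connected layers with 4096 outputs each.
- AlexNet uses dropout to control the model complexity of the fully connected layers, whereas LeNet does not use dropout.
- AlexNet applies extensive image augmentation, such as flipping, cropping, and color changes, which further enlarges the dataset and mitigates overfitting.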
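The kernel, stride, and padding choices above determine how large the feature map is when it reaches the fully connected layers. The sketch below traces the spatial size layer by layer, using the kernel sizes, strides, and paddings from the code in section 2.2. The 224×224 input side is an assumption: the article never states the input size, but it is a value for which the trace ends at 5×5, matching the 256*5*5 inputs of the first fully connected layer. The names `out_size`, `ALEXNET_LAYERS`, and `trace` are hypothetical.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output length along one spatial axis of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for each conv / pool layer, in order
ALEXNET_LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]


def trace(side=224):
    """Return [(layer name, output side length), ...] for a square input."""
    sizes = []
    for name, kernel, stride, padding in ALEXNET_LAYERS:
        side = out_size(side, kernel, stride, padding)
        sizes.append((name, side))
    return sizes


for name, side in trace():
    print(name, side)
```

With a 224×224 input the side length goes 54, 26, 26, 12, 12, 12, 12, 5, so the final feature map is 256 channels of 5×5. A 28×28 Fashion-MNIST image would be too small for this stack without first being resized.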
2.2 Code implementation
#encoding=utf-8
import time
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
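class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 96, 11, 4),       # in_channels, out_channels, kernel_size, stride
            nn.ReLU(),
            nn.MaxPool2d(3, 2),            # kernel_size, stride
            nn.Conv2d(96, 256, 5, 1, 2),   # ..., kernel_size, stride, padding
            nn.ReLU(),
            nn.MaxPool2d(3, 2),
            # Three consecutive 3x3 convolutions; only the last is followed by pooling
            nn.Conv2d(256, 384, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(384, 384, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(384, 256, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(3, 2))
        self.fc = nn.Sequential(
            nn.Linear(256*5*5, 4096),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096, 4096),  # second 4096-unit hidden layer, missing in the original listing
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096, 10)     # output layer: 10 classes
        )

    def forward(self, img):
        feature = self.conv(img)
        # Flatten each sample to a vector before the fully connected block
        output = self.fc(feature.view(img.shape[0], -1))
        return output

net = AlexNet()

if __name__ == '__main__':
    print(net)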
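To put a number on how much larger AlexNet is than LeNet, the learnable parameters of both networks can be counted by hand. This is a rough sketch: the layer sizes are read off the two model definitions in this article (single-channel input, 10 output classes), the AlexNet head is taken to be the two 4096-unit hidden layers described in section 2.1, and the helper names `conv_params` and `linear_params` are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution."""
    return c_in * c_out * k * k + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


lenet_total = (
    conv_params(1, 6, 5)
    + conv_params(6, 16, 5)
    + linear_params(16 * 4 * 4, 120)
    + linear_params(120, 84)
    + linear_params(84, 10)
)

alexnet_total = (
    conv_params(1, 96, 11)
    + conv_params(96, 256, 5)
    + conv_params(256, 384, 3)
    + conv_params(384, 384, 3)
    + conv_params(384, 256, 3)
    + linear_params(256 * 5 * 5, 4096)
    + linear_params(4096, 4096)
    + linear_params(4096, 10)
)

print(lenet_total, alexnet_total)
```

Under these assumptions LeNet has roughly 44 thousand parameters and AlexNet roughly 47 million, most of them in the first two fully connected layers, which is where dropout is applied.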
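3 Summary
This post presented the basic structure of LeNet and AlexNet together with a PyTorch implementation of each.
4 References
- Dive into Deep Learning (动手学深度学习), PyTorch edition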