Table of Contents

  • Preface
  • 1. Introduction
  • 2. Related Work
  • 2.1 Analyzing importance of depth
  • 2.2 Scaling DNNs
  • 2.3 Shallow networks
  • 2.4 Multi-stream networks
  • 3. METHOD
  • 3.1 PARNET BLOCK
  • 3.2 DOWNSAMPLING AND FUSION BLOCK
  • 3.3 NETWORK ARCHITECTURE
  • 4. RESULTS
  • Code Demo
  • 1. Import libraries
  • 2. Set hyperparameters
  • 3. Data preprocessing
  • 4. Build ParNet
  • 5. Set the loss function and optimizer
  • 6. Train the model
  • 7. Predict on an image


Preface

Depth is the hallmark of deep neural networks, but greater depth means more sequential computation and thus higher latency. This raises a question: is it possible to build high-performing "non-deep" neural networks? The authors construct a network only 12 layers deep that achieves over 80% top-1 accuracy on ImageNet. They also analyze scaling rules for network design and show how performance can be improved without increasing depth.

Let's see what the authors say in the paper!

Paper: https://arxiv.org/abs/2110.07641

1. Introduction

It is widely believed that large depth is an essential ingredient of high-performing networks, since depth increases a network's representational capacity and helps it learn increasingly abstract features. But is large depth always necessary? This question is worth asking because large depth is not without drawbacks: a deeper network entails more sequential processing and higher latency, is hard to parallelize, and is less suitable for applications that require fast responses.


To address this, the authors propose ParNet. ParNet can be parallelized effectively and outperforms ResNet in both speed and accuracy. Notably, this is achieved despite the extra latency introduced by communication between processing units. If communication latency can be reduced further, ParNet-like architectures could be used to build very fast recognition systems.


Moreover, ParNet can be scaled effectively by increasing its width, resolution, and number of branches while keeping the depth constant. The authors observe that ParNet's performance does not saturate but keeps increasing with computational throughput. This suggests that even higher performance can be reached by adding compute, while keeping the depth small (~10) and the latency low.

The figure below, from the paper, compares ParNet with other networks.

[Figure from the paper: ParNet compared with other networks]


The authors' contributions:

  1. First demonstration that a neural network only 12 layers deep can achieve high performance on a highly competitive benchmark (80.7% top-1 on ImageNet).
  2. They show how the parallel structure of ParNet can be exploited for fast, low-latency inference.
  3. They study scaling rules for ParNet and demonstrate effective scaling at a constant, low depth.

2. Related Work

2.1 Analyzing importance of depth

A large body of work has confirmed the advantages of deep networks. A single-layer network with sigmoid activations can approximate any function with arbitrarily small error, but only if it is wide enough. To approximate a function, a deep network with non-linearities needs far fewer parameters than a shallow one, and under a fixed parameter budget deep networks outperform shallow ones; this is commonly regarded as one of the main advantages of large depth.

However, in such analyses prior work only examined shallow networks with a linear, sequential structure, and it is unclear whether the conclusion holds for other designs. In this work, the authors show that shallow networks can also perform very well; the key is to have parallel substructures.

2.2 Scaling DNNs

Prior work has shown that increasing depth, width, and resolution leads to effective scaling of convolutional networks. The authors also study scaling rules, but focus on the low-depth regime: ParNet can be scaled effectively by increasing the number of branches, the width, and the resolution while keeping the depth constant and low.

2.3 Shallow networks

Shallow networks have received a lot of attention in theoretical machine learning. In the infinite-width limit, a single-layer neural network behaves like a Gaussian process, and training can be understood in terms of kernel methods. However, such models are not competitive with state-of-the-art networks, whereas this paper provides empirical evidence that non-deep networks can compete with deep ones.

2.4 Multi-stream networks

Multi-stream neural networks have been used in various computer vision tasks such as segmentation, detection, and video classification. ParNet also uses streams at different resolutions, but the network is much shallower, and the streams are fused only once, at the end, which makes parallelization easier.

3. METHOD

3.1 PARNET BLOCK

RepVGG introduced the idea of structural re-parameterization: put simply, a 3×3 convolution branch and a 1×1 convolution branch can be merged algebraically into a single 3×3 convolution at inference time.
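As a minimal sketch of the idea (toy tensors, not the authors' code; the full RepVGG procedure also folds each branch's BatchNorm into its convolution first), the 1×1 kernel can be zero-padded to 3×3 and added to the 3×3 kernel; by linearity of convolution, the fused kernel reproduces the sum of the two branches:

import torch
import torch.nn.functional as F

w3 = torch.randn(8, 4, 3, 3)   # 3x3 branch weights
w1 = torch.randn(8, 4, 1, 1)   # 1x1 branch weights
x = torch.randn(2, 4, 16, 16)

# two parallel branches, summed
y_branches = F.conv2d(x, w3, padding=1) + F.conv2d(x, w1)

# fuse: zero-pad the 1x1 kernel to 3x3, then add the kernels
w_fused = w3 + F.pad(w1, [1, 1, 1, 1])
y_fused = F.conv2d(x, w_fused, padding=1)

print(torch.allclose(y_branches, y_fused, atol=1e-5))  # True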


The authors borrow the block design of RepVGG and modify it to better suit a non-deep architecture. The challenge for a non-deep network built only from 3×3 convolutions is that its receptive field is rather limited. The authors therefore extend the block, as shown below:

[Figure from the paper: the RepVGG-SSE block]


The authors call the block above RepVGG-SSE; the extra branch is a Skip-Squeeze-Excitation (SSE) module.

On a large-scale dataset like ImageNet, a non-deep network may not have enough non-linearity, which limits its representational power. The authors therefore replace the ReLU activation with SiLU, defined as SiLU(x) = x · sigmoid(x).
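For intuition (a small illustration, not from the paper's code): SiLU is smooth and keeps a small signal for negative inputs instead of zeroing them out:

import torch
import torch.nn as nn

x = torch.linspace(-3, 3, 7)
print(nn.ReLU()(x))   # negatives are clipped to zero
print(nn.SiLU()(x))   # x * sigmoid(x): smooth, slightly negative for x < 0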

The code is as follows:

# Note: the torch/nn imports and the conv_bn / GlobalAveragePool2D helpers used
# here are defined in the full listing in the code demo section below.

class SSEBlock(nn.Module):
    # Skip-Squeeze-Excitation branch: BatchNorm, then channel-wise gating
    # computed from a global average of the features
    def __init__(self, in_channels, out_channels):
        super(SSEBlock, self).__init__()
        self.in_channels, self.out_channels = in_channels, out_channels
        self.norm = nn.BatchNorm2d(self.in_channels)
        self.globalAvgPool = GlobalAveragePool2D()
        self.conv = nn.Conv2d(self.in_channels, self.out_channels, kernel_size=(1, 1))
        self.sigmoid = nn.Sigmoid()

    def forward(self, inputs):
        bn = self.norm(inputs)
        x = self.globalAvgPool(bn)
        x = self.conv(x)
        x = self.sigmoid(x)
        z = torch.mul(bn, x)   # scale the normalized features channel-wise
        return z


class FuseBlock(nn.Module):
    # RepVGG-style pair: parallel 1x1 and 3x3 conv-bn branches, summed
    def __init__(self, in_channels, out_channels) -> None:
        super().__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.conv1 = conv_bn(self.in_channels, self.out_channels, kernel_size=1)
        self.conv2 = conv_bn(self.in_channels, self.out_channels, kernel_size=3, stride=1)

    def forward(self, inputs):
        a = self.conv1(inputs)
        b = self.conv2(inputs)
        c = a + b
        return c


class Stream(nn.Module):
    # One RepVGG-SSE block: SSE branch + fused conv branches, then SiLU
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.sse = nn.Sequential(SSEBlock(self.in_channels, self.out_channels))
        self.fuse = nn.Sequential(FuseBlock(self.in_channels, self.out_channels))
        self.act = nn.SiLU(inplace=True)

    def forward(self, inputs):
        a = self.sse(inputs)
        b = self.fuse(inputs)
        c = a + b
        d = self.act(c)
        return d
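A quick shape check (a sketch; it assumes the conv_bn and GlobalAveragePool2D helpers from the listing below are in scope) confirms that the block keeps the spatial size, so it can be stacked freely inside a stream:

x = torch.randn(1, 64, 32, 32)
block = Stream(64, 64)   # note: in_channels must equal out_channels here,
                         # since SSEBlock multiplies its gates into the BN output
print(block(x).shape)    # torch.Size([1, 64, 32, 32])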

3.2 DOWNSAMPLING AND FUSION BLOCK

The RepVGG-SSE block keeps the input and output sizes the same. In addition, the ParNet architecture contains downsampling blocks and fusion blocks: the downsampling block reduces the resolution and increases the width to enable multi-scale processing, while the fusion block merges information coming from multiple resolutions.

Compared with the RepVGG-SSE block:

  1. The downsampling block adds a single-layer SE module in parallel with the convolution.
  2. 2D average pooling is added to the 1×1 convolution branch.
  3. The fusion block additionally contains a concatenation layer; because of the concatenation, the fusion block has twice as many input channels as the downsampling block.

The structure is illustrated in the paper's figure (left: Fusion, right: Downsampling_block). The code is as follows:
class Fusion(nn.Module):
    # Merges two equally-sized streams: BN + concat -> channel shuffle ->
    # downsample by 2 via three parallel branches (pool+1x1, strided 3x3, SE)
    def __init__(self, in_channels, out_channels):
        super(Fusion, self).__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.mid_channels = 2 * self.in_channels  # channel count after concatenation
        self.avgpool = nn.AvgPool2d(kernel_size=(2, 2))
        self.conv1 = conv_bn(self.mid_channels, self.out_channels, kernel_size=1, stride=1, groups=2)
        self.conv2 = conv_bn(self.mid_channels, self.out_channels, kernel_size=3, stride=2, groups=2)
        self.conv3 = nn.Conv2d(in_channels=self.mid_channels, out_channels=self.out_channels, kernel_size=1, groups=2)
        self.globalAvgPool = GlobalAveragePool2D()
        self.act = nn.SiLU(inplace=True)
        self.sigmoid = nn.Sigmoid()
        self.bn = nn.BatchNorm2d(self.in_channels)
        self.group = in_channels

    def channel_shuffle(self, x):
        # Interleave the channels of the two concatenated streams so the
        # grouped convolutions mix information from both
        batchsize, num_channels, height, width = x.size()
        assert num_channels % self.group == 0
        group_channels = num_channels // self.group
        x = x.reshape(batchsize, group_channels, self.group, height, width)
        x = x.permute(0, 2, 1, 3, 4)
        x = x.reshape(batchsize, num_channels, height, width)
        return x

    def forward(self, input1, input2):
        a = torch.cat([self.bn(input1), self.bn(input2)], dim=1)
        a = self.channel_shuffle(a)

        x = self.avgpool(a)        # branch 1: average pool + 1x1 conv
        x = self.conv1(x)
        y = self.conv2(a)          # branch 2: strided 3x3 conv

        z = self.globalAvgPool(a)  # branch 3: single-layer SE (channel gates)
        z = self.conv3(z)
        z = self.sigmoid(z)

        b = torch.mul(x + y, z)
        out = self.act(b)
        return out


class Downsampling_block(nn.Module):
    # Halves the resolution and widens the network via the same three branches
    def __init__(self, in_channels, out_channels):
        super(Downsampling_block, self).__init__()
        self.in_channels, self.out_channels = in_channels, out_channels
        self.avgpool = nn.AvgPool2d(kernel_size=(2, 2))
        self.conv1 = conv_bn(self.in_channels, self.out_channels, kernel_size=1)
        self.conv2 = conv_bn(self.in_channels, self.out_channels, kernel_size=3, stride=2)
        self.conv3 = nn.Conv2d(in_channels=self.in_channels, out_channels=self.out_channels, kernel_size=1)
        self.globalAvgPool = GlobalAveragePool2D()
        self.act = nn.SiLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, inputs):
        x = self.avgpool(inputs)        # branch 1: average pool + 1x1 conv
        x = self.conv1(x)
        y = self.conv2(inputs)          # branch 2: strided 3x3 conv

        z = self.globalAvgPool(inputs)  # branch 3: single-layer SE
        z = self.conv3(z)
        z = self.sigmoid(z)

        b = torch.mul(x + y, z)
        out = self.act(b)
        return out
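As a quick sanity check (a sketch, again assuming conv_bn and GlobalAveragePool2D from the listing below are in scope), both blocks halve the spatial resolution:

x = torch.randn(1, 64, 32, 32)
down = Downsampling_block(64, 128)
print(down(x).shape)      # torch.Size([1, 128, 16, 16])

a = torch.randn(1, 128, 16, 16)
b = torch.randn(1, 128, 16, 16)
fuse = Fusion(128, 256)   # concatenates to 256 channels, then downsamples
print(fuse(a, b).shape)   # torch.Size([1, 256, 8, 8])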

3.3 NETWORK ARCHITECTURE

A schematic of the ParNet architecture:

[Figure from the paper: ParNet architecture schematic]


The overall network configuration is as follows:

[Figure from the paper: detailed ParNet structure]

4. RESULTS

[Figures from the paper: experimental results]



Code Demo

Reference code: https://github.com/murufeng/awesome_lightweight_networks/blob/main/light_cnns/mobile_real_time_network/parnet.py
Dataset download:
Link: https://pan.baidu.com/s/1zs9U76OmGAIwbYr91KQxgg
Extraction code: bhjx

Create a new train.py file.

1. Import libraries

import os
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, Dataset
from torchvision import models, transforms
from PIL import Image
from torch.autograd import Variable  # legacy API; plain tensors work the same in modern PyTorch

2. Set hyperparameters

EPOCH = 100
IMG_SIZE = 256
BATCH_SIZE = 6
IMG_MEAN = [0.485, 0.456, 0.406]  # ImageNet channel means
IMG_STD = [0.229, 0.224, 0.225]   # ImageNet channel stds
CUDA = torch.cuda.is_available()
DEVICE = torch.device("cuda" if CUDA else "cpu")
train_path = './data1_dog_cat/train'
test_path = './data1_dog_cat/test'
classes_name = os.listdir(train_path)  # each class is a sub-folder (cat, dog)

3. Data preprocessing

train_transforms = transforms.Compose([
    transforms.Resize(IMG_SIZE),
    transforms.RandomResizedCrop(IMG_SIZE),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(30),
    transforms.ToTensor(),
    transforms.Normalize(IMG_MEAN, IMG_STD)
])

val_transforms = transforms.Compose([
    transforms.Resize(IMG_SIZE),
    transforms.CenterCrop(IMG_SIZE),
    transforms.ToTensor(),
    transforms.Normalize(IMG_MEAN, IMG_STD)
])

class DogDataset(Dataset):
    def __init__(self, paths, classes_name, transform=None):
        self.paths = self.make_path(paths, classes_name)
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        image = self.paths[idx].split(';')[0]
        img = Image.open(image).convert('RGB')  # guard against grayscale/RGBA files
        label = self.paths[idx].split(';')[1]
        if self.transform:
            img = self.transform(img)
        return img, int(label)

    def make_path(self, path, classes_name):
        # Build a list of "filepath;label" strings,
        # e.g. for path = './data1_dog_cat/train'
        path_list = []
        for class_name in classes_name:
            names = os.listdir(path + '/' + class_name)
            for name in names:
                p = os.path.join(path + '/' + class_name, name)
                label = str(classes_name.index(class_name))
                path_list.append(p + ';' + label)
        return path_list


train_dataset = DogDataset(train_path, classes_name, train_transforms)
val_dataset = DogDataset(test_path, classes_name, val_transforms)
image_dataset = {'train': train_dataset, 'valid': val_dataset}

image_dataloader = {x: DataLoader(image_dataset[x], batch_size=BATCH_SIZE, shuffle=True) for x in ['train', 'valid']}
dataset_sizes = {x: len(image_dataset[x]) for x in ['train', 'valid']}
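As an optional sanity check (assuming the dataset folders exist at the paths above), pull one batch and confirm the shapes:

imgs, labels = next(iter(image_dataloader['train']))
print(imgs.shape)            # torch.Size([6, 3, 256, 256]) for a full batch
print(labels, classes_name)  # integer labels and the folder names they index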

4. Build ParNet

def conv_bn(in_channels, out_channels, kernel_size, stride=1, groups=1):
    # Conv2d followed by BatchNorm; "same" padding via kernel_size // 2
    return nn.Sequential(
        nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride,
                  padding=kernel_size // 2, groups=groups, bias=False),
        nn.BatchNorm2d(out_channels)
    )


class GlobalAveragePool2D():
    def __init__(self, keepdim=True):
        self.keepdim = keepdim

    def __call__(self, inputs):
        # Average over the spatial dimensions (H, W), keeping them as 1x1
        return torch.mean(inputs, dim=[2, 3], keepdim=self.keepdim)


class SSEBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(SSEBlock, self).__init__()

        self.in_channels, self.out_channels = in_channels, out_channels
        self.norm = nn.BatchNorm2d(self.in_channels)
        self.globalAvgPool = GlobalAveragePool2D()
        self.conv = nn.Conv2d(self.in_channels, self.out_channels, kernel_size=(1, 1))
        self.sigmoid = nn.Sigmoid()

    def forward(self, inputs):
        bn = self.norm(inputs)
        x = self.globalAvgPool(bn)
        x = self.conv(x)
        x = self.sigmoid(x)

        z = torch.mul(bn, x)
        return z

class Downsampling_block(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Downsampling_block, self).__init__()
        self.in_channels, self.out_channels = in_channels, out_channels

        self.avgpool = nn.AvgPool2d(kernel_size=(2, 2))
        self.conv1 = conv_bn(self.in_channels, self.out_channels, kernel_size=1)
        self.conv2 = conv_bn(self.in_channels, self.out_channels, kernel_size=3, stride=2)
        self.conv3 = nn.Conv2d(in_channels=self.in_channels, out_channels=self.out_channels, kernel_size=1)
        self.globalAvgPool = GlobalAveragePool2D()
        self.act = nn.SiLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, inputs):
        x = self.avgpool(inputs)
        x = self.conv1(x)

        y = self.conv2(inputs)

        z = self.globalAvgPool(inputs)
        z = self.conv3(z)
        z = self.sigmoid(z)

        a = x + y
        b = torch.mul(a, z)
        out = self.act(b)
        return out

class Fusion(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Fusion, self).__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.mid_channels = 2 * self.in_channels
        self.avgpool = nn.AvgPool2d(kernel_size=(2, 2))
        self.conv1 = conv_bn(self.mid_channels, self.out_channels, kernel_size=1, stride=1, groups=2)
        self.conv2 = conv_bn(self.mid_channels, self.out_channels, kernel_size=3, stride=2, groups=2)
        self.conv3 = nn.Conv2d(in_channels=self.mid_channels, out_channels=self.out_channels, kernel_size=1, groups=2)
        self.globalAvgPool = GlobalAveragePool2D()
        self.act = nn.SiLU(inplace=True)
        self.sigmoid = nn.Sigmoid()
        self.bn = nn.BatchNorm2d(self.in_channels)
        self.group = in_channels

    def channel_shuffle(self, x):
        # Interleave the channels of the two concatenated streams so the
        # grouped convolutions mix information from both
        batchsize, num_channels, height, width = x.size()
        assert num_channels % self.group == 0
        group_channels = num_channels // self.group
        x = x.reshape(batchsize, group_channels, self.group, height, width)
        x = x.permute(0, 2, 1, 3, 4)
        x = x.reshape(batchsize, num_channels, height, width)
        return x

    def forward(self, input1, input2):
        a = torch.cat([self.bn(input1), self.bn(input2)], dim=1)
        a = self.channel_shuffle(a)

        x = self.avgpool(a)        # average pool + 1x1 conv branch
        x = self.conv1(x)
        y = self.conv2(a)          # strided 3x3 conv branch

        z = self.globalAvgPool(a)  # single-layer SE branch
        z = self.conv3(z)
        z = self.sigmoid(z)

        b = torch.mul(x + y, z)
        out = self.act(b)
        return out

class Stream(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.sse = nn.Sequential(SSEBlock(self.in_channels, self.out_channels))
        self.fuse = nn.Sequential(FuseBlock(self.in_channels, self.out_channels))
        self.act = nn.SiLU(inplace=True)

    def forward(self, inputs):
        a = self.sse(inputs)
        b = self.fuse(inputs)
        c = a + b

        d = self.act(c)
        return d


class FuseBlock(nn.Module):
    def __init__(self, in_channels, out_channels) -> None:
        super().__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.conv1 = conv_bn(self.in_channels, self.out_channels, kernel_size=1)
        self.conv2 = conv_bn(self.in_channels, self.out_channels, kernel_size=3, stride=1)

    def forward(self, inputs):
        a = self.conv1(inputs)
        b = self.conv2(inputs)

        c = a + b
        return c


class ParNetEncoder(nn.Module):
    def __init__(self, in_channels, block_channels, depth) -> None:
        super().__init__()
        self.in_channels = in_channels
        self.block_channels = block_channels
        self.depth = depth  # number of Stream (RepVGG-SSE) blocks in each of the three streams
        self.d1 = Downsampling_block(self.in_channels, self.block_channels[0])
        self.d2 = Downsampling_block(self.block_channels[0], self.block_channels[1])
        self.d3 = Downsampling_block(self.block_channels[1], self.block_channels[2])
        self.d4 = Downsampling_block(self.block_channels[2], self.block_channels[3])
        self.d5 = Downsampling_block(self.block_channels[3], self.block_channels[4])
        self.stream1 = nn.Sequential(
            *[Stream(self.block_channels[1], self.block_channels[1]) for _ in range(self.depth[0])]
        )

        self.stream1_downsample = Downsampling_block(self.block_channels[1], self.block_channels[2])

        self.stream2 = nn.Sequential(
            *[Stream(self.block_channels[2], self.block_channels[2]) for _ in range(self.depth[1])]
        )

        self.stream3 = nn.Sequential(
            *[Stream(self.block_channels[3], self.block_channels[3]) for _ in range(self.depth[2])]
        )

        self.stream2_fusion = Fusion(self.block_channels[2], self.block_channels[3])
        self.stream3_fusion = Fusion(self.block_channels[3], self.block_channels[3])

    def forward(self, inputs):
        x = self.d1(inputs)
        x = self.d2(x)

        y = self.stream1(x)
        y = self.stream1_downsample(y)

        x = self.d3(x)

        z = self.stream2(x)
        z = self.stream2_fusion(y, z)

        x = self.d4(x)

        a = self.stream3(x)
        b = self.stream3_fusion(z, a)

        x = self.d5(b)
        return x


class ParNetDecoder(nn.Module):
    def __init__(self, in_channels, n_classes) -> None:
        super().__init__()
        self.avg = nn.AdaptiveAvgPool2d((1, 1))
        self.decoder = nn.Linear(in_channels, n_classes)

    def forward(self, x):
        x = self.avg(x)
        x = x.view(x.size(0), -1)
        # Return raw logits: nn.CrossEntropyLoss applies log-softmax internally,
        # so adding a Softmax here would distort the loss
        return self.decoder(x)


class ParNet(nn.Module):
    def __init__(self, in_channels, n_classes, block_channels=[64, 128, 256, 512, 2048], depth=[4, 5, 5]) -> None:
        super().__init__()
        self.encoder = ParNetEncoder(in_channels, block_channels, depth)
        self.decoder = ParNetDecoder(block_channels[-1], n_classes)

    def forward(self, inputs):
        x = self.encoder(inputs)
        x = self.decoder(x)

        return x

def parnet_s(in_channels, n_classes):
    return ParNet(in_channels, n_classes, block_channels=[64, 96, 192, 384, 1280])


def parnet_m(in_channels, n_classes):
    model = ParNet(in_channels, n_classes, block_channels=[64, 128, 256, 512, 2048])
    return model


def parnet_l(in_channels, n_classes):
    return ParNet(in_channels, n_classes, block_channels=[64, 160, 320, 640, 2560])


def parnet_xl(in_channels, n_classes):
    return ParNet(in_channels, n_classes, block_channels=[64, 200, 400, 800, 3200])

model_ft = parnet_s(3, len(classes_name))
model_ft.to(DEVICE)
print(model_ft)
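Optionally (a sketch), verify the forward pass and get a rough parameter count before training:

model_ft.eval()                    # use BN running stats for the dry run
with torch.no_grad():
    dummy = torch.randn(1, 3, IMG_SIZE, IMG_SIZE).to(DEVICE)
    print(model_ft(dummy).shape)   # torch.Size([1, 2]) for the cat/dog setup
model_ft.train()
print(sum(p.numel() for p in model_ft.parameters()) / 1e6, 'M parameters')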

5. Set the loss function and optimizer

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model_ft.parameters(), lr=1e-3)  # one learning rate for all parameters

cosine_schedule = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer=optimizer, T_max=20, eta_min=1e-9)

6. Train the model

We train the parnet_s variant.

def train(model, device, train_loader, optimizer, epoch):
    model.train()
    sum_loss = 0
    total_accuracy = 0
    total_num = len(train_loader.dataset)
    print(total_num, len(train_loader))
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device, non_blocking=True), target.to(device, non_blocking=True)

        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        lr = optimizer.state_dict()['param_groups'][0]['lr']
        print_loss = loss.data.item()
        sum_loss += print_loss
        # batch accuracy: argmax over class scores vs. ground truth
        accuracy = torch.mean((torch.argmax(F.softmax(output, dim=-1), dim=-1) == target).type(torch.FloatTensor))
        total_accuracy += accuracy.item()
        if (batch_idx + 1) % 10 == 0:
            ave_loss = sum_loss / (batch_idx + 1)
            acc = total_accuracy / (batch_idx + 1)
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}\tLR:{:.9f}'.format(
                epoch, (batch_idx + 1) * len(data), len(train_loader.dataset),
                       100. * (batch_idx + 1) / len(train_loader), loss.item(), lr))
            print('epoch:%d,loss:%.4f,train_acc:%.4f' % (epoch, ave_loss, acc))

ACC = 0
# Validation loop: evaluates on the test set and saves the best checkpoint
def val(model, device, test_loader):
    global ACC
    model.eval()
    test_loss = 0
    correct = 0
    total_num = len(test_loader.dataset)
    print(total_num, len(test_loader))
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            loss = criterion(output, target)
            _, pred = torch.max(output.data, 1)
            correct += torch.sum(pred == target)
            print_loss = loss.data.item()
            test_loss += print_loss
        correct = correct.data.item()
        acc = correct / total_num
        avgloss = test_loss / len(test_loader)
        print('\nVal set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            avgloss, correct, len(test_loader.dataset), 100 * acc))
        if acc > ACC:
            # save the whole model whenever validation accuracy improves
            torch.save(model_ft, 'model_' + 'epoch_' + str(epoch) + '_' + 'ACC-' + str(round(acc, 3)) + '.pth')
            ACC = acc


# Train

for epoch in range(1, EPOCH + 1):
    train(model_ft, DEVICE, image_dataloader['train'], optimizer, epoch)
    cosine_schedule.step()
    val(model_ft, DEVICE, image_dataloader['valid'])

7. Predict on an image

Create a new predict.py file.
Mind the input image path and the weight file path (the training script saves checkpoints named like model_epoch_..._ACC-....pth).

import torch
import torchvision.transforms as transforms
from PIL import Image, ImageFont, ImageDraw

classes = ['cat', 'dog']
transform_test = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print('Loading model...')
# Note: loading a fully pickled model requires the ParNet class definitions
# to be available in this script as well
model = torch.load("./model.pth")  # path to the saved weights
print('Model loaded.')
model.eval()
model.to(DEVICE)

path = './data1_dog_cat/test/cat/cat.10000.jpg'  # path of the image to predict
img = Image.open(path)
image = transform_test(img)
image.unsqueeze_(0)          # add the batch dimension
image = image.to(DEVICE)
out = model(image)
_, pred = torch.max(out.data, 1)

# Draw the predicted class name on the image
draw = ImageDraw.Draw(img)
font = ImageFont.truetype("arial.ttf", 30)  # requires arial.ttf on the system
content = classes[pred.data.item()]
draw.text((0, 0), content, font=font)
img.show()

The result looks like this:

[Screenshot: the predicted class drawn on the test image]