How to build a nuscenes dataset: building the MNIST dataset

Reposted

mob64ca1402a190 2024-05-03 07:07:42

Tags: how to build a nuscenes dataset, deep learning, caffe, hands-on, dataset. Category: architecture, back-end development

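Contents

Background
Preparation
Practice 1

Training directly with the scripts shipped with Caffe
Source analysis (a look at the script sources)

get_mnist.sh
create_mnist.sh
Inspecting the LMDB dataset
train_lenet.sh
lenet_train_test.prototxt

Practice 2

Training LeNet with Python

A guided reading of the Caffe MNIST example README (essentially a translation)

Training LeNet on MNIST with Caffe

Preparing the dataset
LeNet: the MNIST classification model (one of the earliest convolutional networks)
Defining the MNIST network
Writing the data layer
Writing the convolution layer
Writing the pooling layer
Writing the fully connected layer
Writing the ReLU layer
Writing layer rules
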
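Background

I have been learning ncnn recently, and since ncnn has good support for Caffe models, Caffe is a natural place to start.
My other posts already cover installing Caffe. Here we use Caffe for a "Hello world"-level exercise: training on the MNIST dataset.

Preparation

Install Ubuntu
Build Caffe
Build pycaffe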

Practice 1

Training directly with the scripts shipped with Caffe

cd caffe-1.0
# Download the dataset
./data/mnist/get_mnist.sh

# Convert the dataset to LMDB format
./examples/mnist/create_mnist.sh

# Switch training to CPU mode
vi examples/mnist/lenet_solver.prototxt
Change solver_mode on the last line to CPU; otherwise a machine without a GPU cannot run the training.

# Start training
./examples/mnist/train_lenet.sh

# Iteration (iteration count), loss (current training loss)
I0806 15:38:46.031091  3219 solver.cpp:218] Iteration 9300 (13.1579 iter/s, 7.6s/100 iters), loss = 0.00583307
I0806 15:38:46.031162  3219 solver.cpp:237]     Train net output #0: loss = 0.00583305 (* 1 = 0.00583305 loss)
# Iteration (iteration count), lr (learning rate)
I0806 15:38:46.031173  3219 sgd_solver.cpp:105] Iteration 9300, lr = 0.00610706
I0806 15:38:51.319743  3223 data_layer.cpp:73] Restarting data prefetching from start.
I0806 15:38:53.602530  3219 solver.cpp:218] Iteration 9400 (13.2083 iter/s, 7.571s/100 iters), loss = 0.0199533
I0806 15:38:53.602596  3219 solver.cpp:237]     Train net output #0: loss = 0.0199533 (* 1 = 0.0199533 loss)
I0806 15:38:53.602607  3219 sgd_solver.cpp:105] Iteration 9400, lr = 0.00608343
I0806 15:39:01.034157  3219 solver.cpp:330] Iteration 9500, Testing net (#0)
I0806 15:39:05.505321  3224 data_layer.cpp:73] Restarting data prefetching from start.
I0806 15:39:05.685834  3219 solver.cpp:397]     Test net output #0: accuracy = 0.9891
I0806 15:39:05.685916  3219 solver.cpp:397]     Test net output #1: loss = 0.0344745 (* 1 = 0.0344745 loss)
I0806 15:39:05.759836  3219 solver.cpp:218] Iteration 9500 (8.22571 iter/s, 12.157s/100 iters), loss = 0.0026411
I0806 15:39:05.759903  3219 solver.cpp:237]     Train net output #0: loss = 0.00264109 (* 1 = 0.00264109 loss)
I0806 15:39:05.759914  3219 sgd_solver.cpp:105] Iteration 9500, lr = 0.00606002
I0806 15:39:13.262712  3219 solver.cpp:218] Iteration 9600 (13.3298 iter/s, 7.502s/100 iters), loss = 0.00266751
I0806 15:39:13.262789  3219 solver.cpp:237]     Train net output #0: loss = 0.0026675 (* 1 = 0.0026675 loss)
I0806 15:39:13.262800  3219 sgd_solver.cpp:105] Iteration 9600, lr = 0.00603682
I0806 15:39:20.686713  3219 solver.cpp:218] Iteration 9700 (13.4716 iter/s, 7.423s/100 iters), loss = 0.00365233
I0806 15:39:20.686792  3219 solver.cpp:237]     Train net output #0: loss = 0.00365232 (* 1 = 0.00365232 loss)
I0806 15:39:20.686803  3219 sgd_solver.cpp:105] Iteration 9700, lr = 0.00601382
I0806 15:39:28.188238  3219 solver.cpp:218] Iteration 9800 (13.3316 iter/s, 7.501s/100 iters), loss = 0.0165108
I0806 15:39:28.188352  3219 solver.cpp:237]     Train net output #0: loss = 0.0165108 (* 1 = 0.0165108 loss)
I0806 15:39:28.188362  3219 sgd_solver.cpp:105] Iteration 9800, lr = 0.00599102
I0806 15:39:35.664347  3219 solver.cpp:218] Iteration 9900 (13.3779 iter/s, 7.475s/100 iters), loss = 0.00303775
I0806 15:39:35.664424  3219 solver.cpp:237]     Train net output #0: loss = 0.00303773 (* 1 = 0.00303773 loss)
I0806 15:39:35.664435  3219 sgd_solver.cpp:105] Iteration 9900, lr = 0.00596843
I0806 15:39:43.017205  3219 solver.cpp:447] Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel
# A snapshot is saved at iteration 10000
I0806 15:39:43.025692  3219 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate
# Final training loss
I0806 15:39:43.063850  3219 solver.cpp:310] Iteration 10000, loss = 0.00397047
I0806 15:39:43.063910  3219 solver.cpp:330] Iteration 10000, Testing net (#0)
I0806 15:39:47.536815  3224 data_layer.cpp:73] Restarting data prefetching from start.
# Final test accuracy
I0806 15:39:47.723743  3219 solver.cpp:397]     Test net output #0: accuracy = 0.991
# Final test loss
I0806 15:39:47.723821  3219 solver.cpp:397]     Test net output #1: loss = 0.0291581 (* 1 = 0.0291581 loss)
I0806 15:39:47.723832  3219 solver.cpp:315] Optimization Done.
I0806 15:39:47.723839  3219 caffe.cpp:259] Optimization Done.
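
The log lines above follow a regular pattern, so the loss and accuracy values can be pulled out with a few regular expressions. The following is only a sketch and is not part of Caffe: the function name parse_caffe_log and the patterns are assumptions written against the lines shown in this post, and other Caffe versions may format their logs differently.

```python
import re

# Patterns written against the log lines shown above (an assumption:
# other Caffe versions may format these lines differently).
TRAIN_RE = re.compile(r"Iteration (\d+) \(.*\), loss = ([\d.eE+-]+)")
TEST_ITER_RE = re.compile(r"Iteration (\d+), Testing net")
ACC_RE = re.compile(r"Test net output #\d+: accuracy = ([\d.eE+-]+)")


def parse_caffe_log(lines):
    """Return (train_loss, test_acc): lists of (iteration, value) pairs."""
    train_loss, test_acc = [], []
    current_test_iter = None
    for line in lines:
        m = TRAIN_RE.search(line)
        if m:
            train_loss.append((int(m.group(1)), float(m.group(2))))
            continue
        m = TEST_ITER_RE.search(line)
        if m:
            # Remember which iteration the following test outputs belong to.
            current_test_iter = int(m.group(1))
            continue
        m = ACC_RE.search(line)
        if m and current_test_iter is not None:
            test_acc.append((current_test_iter, float(m.group(1))))
    return train_loss, test_acc


sample = [
    "I0806 15:39:01.034157  3219 solver.cpp:330] Iteration 9500, Testing net (#0)",
    "I0806 15:39:05.685834  3219 solver.cpp:397]     Test net output #0: accuracy = 0.9891",
    "I0806 15:39:05.759836  3219 solver.cpp:218] Iteration 9500 (8.22571 iter/s, 12.157s/100 iters), loss = 0.0026411",
]
train_loss, test_acc = parse_caffe_log(sample)
print(train_loss)
print(test_acc)
```

To use it on a real run, redirect the training output to a file and pass the opened file object as lines.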

That is the training output. Next, run the test phase:

./build/tools/caffe test -model examples/mnist/lenet_train_test.prototxt -weights examples/mnist/lenet_iter_10000.caffemodel -iterations 100

The final accuracy on the test set is 99.1%.

Source analysis (a look at the script sources)

get_mnist.sh

#!/usr/bin/env sh
# This scripts downloads the mnist data and unzips it.

DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd "$DIR"

echo "Downloading..."

for fname in train-images-idx3-ubyte train-labels-idx1-ubyte t10k-images-idx3-ubyte t10k-labels-idx1-ubyte
do
    if [ ! -e $fname ]; then
        # Fetch the raw files directly with wget; no conversion is done here
        wget --no-check-certificate http://yann.lecun.com/exdb/mnist/${fname}.gz
        gunzip ${fname}.gz
    fi
done
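
The four files fetched by the script are in the IDX format: a 4-byte header (two zero bytes, a type code, the number of dimensions), followed by one big-endian 32-bit size per dimension, followed by the raw data. As a rough sketch of how such a file can be read before any conversion, the hypothetical helper parse_idx below handles only the unsigned-byte case that MNIST uses; it is an illustration and not code from Caffe.

```python
import struct

import numpy as np


def parse_idx(raw):
    """Decode an IDX byte string (unsigned-byte data only) into a NumPy array."""
    # Header: two zero bytes, a type code (0x08 = unsigned byte), the number
    # of dimensions, then one big-endian 32-bit size per dimension.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    data = np.frombuffer(raw, dtype=np.uint8, offset=4 + 4 * ndim)
    return data.reshape(dims)


# Tiny synthetic example: two 2x3 "images" in the 3-dimensional layout.
fake = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 3) + bytes(range(12))
images = parse_idx(fake)
print(images.shape)
```

For the real data, read train-images-idx3-ubyte in binary mode and pass its contents to the function.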

create_mnist.sh

#!/usr/bin/env sh
# This script converts the mnist data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND.
set -e

EXAMPLE=examples/mnist
DATA=data/mnist
BUILD=build/examples/mnist

BACKEND="lmdb"

echo "Creating ${BACKEND}..."

# Remove any previous output before converting
rm -rf $EXAMPLE/mnist_train_${BACKEND}
rm -rf $EXAMPLE/mnist_test_${BACKEND}

# Convert the training data
$BUILD/convert_mnist_data.bin $DATA/train-images-idx3-ubyte \
  $DATA/train-labels-idx1-ubyte $EXAMPLE/mnist_train_${BACKEND} --backend=${BACKEND}
# Convert the test data
$BUILD/convert_mnist_data.bin $DATA/t10k-images-idx3-ubyte \
  $DATA/t10k-labels-idx1-ubyte $EXAMPLE/mnist_test_${BACKEND} --backend=${BACKEND}
  
echo "Done."

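Inspecting the LMDB dataset

# coding: utf8

import os

import caffe
import lmdb
import matplotlib.pyplot as plt
import numpy as np
from caffe.proto import caffe_pb2

lmdb_test_path = r'/home/wang/work_code/caffe-1.0/examples/mnist/mnist_test_lmdb'

save_dir = r'/home/wang/work_code/mnist_image'
# Make sure the output directory exists before saving images
if not os.path.isdir(save_dir):
    os.makedirs(save_dir)

# Open the LMDB database read-only
env = lmdb.open(lmdb_test_path, readonly=True)
# Begin a read transaction (loosely, think of it as a connection for now)
txn = env.begin()
# Get a cursor for iterating over the records
cur = txn.cursor()
# Datum is the protobuf message Caffe uses to store each LMDB record
datum = caffe_pb2.Datum()
# Walk the cursor
for k, v in cur:
    # Keys are bytes; decode them so they print and format cleanly
    key = k.decode('ascii')
    print(key)
    # Parse the serialized record into the Datum
    datum.ParseFromString(v)
    # Label of the original sample
    print(datum.label)
    # Convert the Datum to a NumPy array
    data = caffe.io.datum_to_array(datum)
    # The original shape is (1, 28, 28)
    print(np.shape(data))
    # Reshape to (28, 28) so matplotlib can display it
    data = np.reshape(data, (28, -1))
    print(np.shape(data))
    # Show the image (close the window to move on to the next record)
    plt.imshow(data)
    plt.show()
    # Save the image to disk
    plt.imsave(os.path.join(save_dir, '{}.png'.format(key)), data)

cur.close()
env.close()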

train_lenet.sh

#!/usr/bin/env sh
set -e
# Call caffe train with the solver file (the contents of lenet_solver.prototxt are shown next)
./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt $@

lenet_solver.prototxt

# The train/test net protocol buffer definition
# Unless specified otherwise, training and testing share this one network file
# (lenet_train_test.prototxt is shown further below)
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
# Initial learning rate
base_lr: 0.01
# Momentum of the SGD update (not a parameter of the learning rate policy)
momentum: 0.9
# Weight decay (regularization against overfitting)
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
# Parameter used by the learning rate policy
gamma: 0.0001
# Parameter used by the learning rate policy
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
# Snapshot interval, in iterations
snapshot: 5000
# Filename prefix for snapshots
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
# Use CPU if the machine has no GPU
solver_mode: GPU
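
To make the roles of base_lr, gamma and power concrete: the "inv" policy computes the learning rate as base_lr * (1 + gamma * iter) ^ (-power). The small sketch below (the function name inv_lr is just for illustration) plugs in the values from this solver file; the results can be compared with the lr values printed in the training log earlier in this post.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the "inv" policy: base_lr * (1 + gamma * iter) ** (-power)."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# Iterations 9300 and 9900 both appear in the log above.
for it in (0, 9300, 9900):
    print(it, round(inv_lr(it), 8))
```

The rate starts at base_lr and decays smoothly, which is why the log shows it dropping by a small amount every 100 iterations.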

lenet_train_test.prototxt

The full file is shown below.

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  # Phase in which this layer is included
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

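Let's go through it piece by piece.

name: the name of the network (your choice)
layer: declares a layer
name: the name of the layer (your choice)
type: the layer type

Data layers

Data: reads from LMDB or LevelDB; batch_size is required, and source is the path to the database
MemoryData: reads from memory; batch_size, channels, width and height are required
HDF5Data: reads from HDF5; batch_size and source are required
ImageData: reads from image files; source and batch_size are required
WindowData: reads from a window file (a list of image regions), not from the Windows operating system

Vision layers

Convolution: convolution layer
Pooling: pooling layer

Other layers

ReLU: activation layer
SoftmaxWithLoss: loss layer (softmax followed by the multinomial logistic loss), not an activation layer
InnerProduct: fully connected layer
Others: Accuracy, Reshape, Dropout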

prototxt configuration reference:

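Practice 2

Training LeNet with Python

git clone https://gitee.com/simple_projects/caffe_learning.git

Open 01-learning-lenet-mine.ipynb.
It trains LeNet from Python step by step, with comments throughout (the comments reflect my own understanding and may not all be correct).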

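A guided reading of the Caffe MNIST example README (essentially a translation)

Source: https://github.com/BVLC/caffe/blob/04ab089db018a292ae48d51732dd6c66766b36b6/examples/mnist/readme.md

Training LeNet on MNIST with Caffe

Caffe must be built first. If it is not, build it and set the CAFFE_ROOT environment variable before continuing.

Preparing the dataset

First download the MNIST dataset and convert it: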

cd $CAFFE_ROOT
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh

If the scripts complain that wget or gunzip is not installed, install them first. After the scripts finish, two datasets appear: mnist_train_lmdb and mnist_test_lmdb.

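LeNet: the MNIST classification model (one of the earliest convolutional networks)

Before running the training program, let's look at what is about to happen. We will use LeNet, a network known to work well on handwritten digit classification. Our version differs slightly from the original LeNet implementation: the sigmoid activations are replaced with ReLU activations.

The LeNet design contains the essence of CNNs that is still used in larger models, such as those trained on ImageNet. In general, it consists of a convolution layer followed by a pooling layer, a second convolution layer followed by a pooling layer, and then two fully connected layers (similar to a conventional multilayer perceptron). All of this is defined in $CAFFE_ROOT/examples/mnist/lenet_train_test.prototxt.

Defining the MNIST network

This section explains the lenet_train_test.prototxt definition of the LeNet model for MNIST handwritten digit classification. We assume you are familiar with Google Protobuf and have read the protobuf definitions used by Caffe, which can be found in $CAFFE_ROOT/src/caffe/proto/caffe.proto.

Specifically, we will write a caffe::NetParameter (in Python, caffe.proto.caffe_pb2.NetParameter) protobuf. We start by giving the network a name: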

name: "LeNet"

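Writing the data layer

Now we read the MNIST data from the LMDB created earlier. This is defined by a data layer (each layer in a network can be of a different type):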

layer {
  name: "mnist"
  type: "Data"
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "mnist_train_lmdb"
    backend: LMDB
    batch_size: 64
  }
  top: "data"
  top: "label"
}

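Note that this layer is named mnist, has type Data, and reads from the LMDB given by source. It uses a batch size of 64 and scales the incoming pixels into the range [0, 1). Why 0.00390625? Because it is exactly 1/256. The layer produces two blobs: data and label.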
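
The effect of the scale value is easy to check numerically. The following sketch uses NumPy rather than Caffe (so it only mimics the multiplication, under the assumption that the stored pixels are unsigned bytes in 0..255) and shows that the largest pixel value stays just below 1.

```python
import numpy as np

SCALE = 0.00390625  # the value of transform_param.scale; exactly 1/256

pixels = np.array([0, 1, 128, 255], dtype=np.uint8)
# Convert to float first, then multiply, as a scale transform would.
scaled = pixels.astype(np.float32) * SCALE
print(scaled)
```

Because 255/256 is less than 1, the upper end of the range is open, which is why the interval is written as [0, 1).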

Writing the convolution layer

layer {
  name: "conv1"
  type: "Convolution"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "data"
  top: "conv1"
}

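This layer takes the data blob (provided by the data layer) and produces the conv1 blob. It has 20 output channels, a convolution kernel size of 5 and a stride of 1.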

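The fillers initialize the weights and biases. The weight filler uses the Xavier algorithm, which automatically determines the scale of the initialization from the number of input and output neurons. The bias filler simply fills with a constant, which defaults to 0.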
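
To give a feel for what such an initialization looks like, here is a NumPy sketch of a Xavier-style uniform initializer. It is an illustration only: the function name xavier_uniform is hypothetical, and scaling by fan-in alone (the bound sqrt(3 / fan_in)) is an assumption of this sketch, so Caffe's own filler may differ in detail.

```python
import numpy as np


def xavier_uniform(shape, rng):
    """Sample weights uniformly from [-a, a] with a = sqrt(3 / fan_in).

    shape is (num_output, channels, kernel_h, kernel_w); fan_in is the number
    of inputs feeding one output unit. Using fan_in alone is an assumption of
    this sketch.
    """
    fan_in = int(np.prod(shape[1:]))
    bound = np.sqrt(3.0 / fan_in)
    weights = rng.uniform(-bound, bound, size=shape).astype(np.float32)
    return weights, bound


rng = np.random.default_rng(0)
weights, bound = xavier_uniform((20, 1, 5, 5), rng)  # conv1: 20 filters of 5x5
biases = np.zeros(20, dtype=np.float32)              # constant filler, value 0
print(weights.shape, bound)
```

The more inputs a unit has, the smaller the range its weights are drawn from, which keeps the scale of the activations roughly stable from layer to layer.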

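lr_mult is the learning rate multiplier for the layer's learnable parameters. Here the weights use the same learning rate the solver supplies at runtime, and the biases use twice that rate. This usually leads to better convergence (I don't fully understand why).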
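
As a minimal illustration of what the multiplier does, the sketch below applies one plain gradient step to a weight and to a bias. It deliberately ignores momentum and weight decay, and the function name sgd_step is hypothetical; the point is only that the effective step size is base_lr * lr_mult.

```python
def sgd_step(value, grad, base_lr, lr_mult):
    """One plain SGD step (no momentum or weight decay) with a per-parameter multiplier."""
    return value - base_lr * lr_mult * grad


base_lr = 0.01
weight = sgd_step(1.0, grad=0.5, base_lr=base_lr, lr_mult=1)  # weights: lr_mult 1
bias = sgd_step(1.0, grad=0.5, base_lr=base_lr, lr_mult=2)    # biases: lr_mult 2
print(weight, bias)
```

With the same gradient, the bias moves twice as far as the weight.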

Writing the pooling layer

layer {
  name: "pool1"
  type: "Pooling"
  pooling_param {
    kernel_size: 2
    stride: 2
    pool: MAX
  }
  bottom: "conv1"
  top: "pool1"
}

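This performs max pooling with a 2x2 kernel and a stride of 2. Because the stride equals the kernel size, neighboring pooling regions do not overlap.

The second convolution and pooling layers are written the same way; see $CAFFE_ROOT/examples/mnist/lenet_train_test.prototxt for details.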
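
It can help to trace how the spatial size of the blobs shrinks through the two convolution and pooling stages. The sketch below uses the usual output-size formula with floor division (the helper out_size is hypothetical, and floor division is an assumption that happens to be exact for the sizes here), starting from the 28x28 MNIST images and using the kernel sizes, strides and the 50 output channels of conv2 from lenet_train_test.prototxt.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window (floor division)."""
    return (size + 2 * pad - kernel) // stride + 1


trace = [28]                                          # MNIST images are 28x28
trace.append(out_size(trace[-1], kernel=5))           # conv1
trace.append(out_size(trace[-1], kernel=2, stride=2)) # pool1
trace.append(out_size(trace[-1], kernel=5))           # conv2
trace.append(out_size(trace[-1], kernel=2, stride=2)) # pool2
ip1_inputs = 50 * trace[-1] * trace[-1]               # conv2 has 50 output channels
print(trace, ip1_inputs)
```

The last number is how many values the first fully connected layer receives for each image.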

Writing the fully connected layer

layer {
  name: "ip1"
  type: "InnerProduct"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "pool2"
  top: "ip1"
}

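This defines a fully connected layer (known in Caffe as an InnerProduct layer) with 500 outputs. The other lines look familiar from the layers above.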

Writing the ReLU layer

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}

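Since ReLU is an element-wise operation, it can be done in place to save some memory. This is achieved by giving the bottom and top blobs the same name. Do not reuse blob names like this for other layer types.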
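
The same in-place idea can be shown with NumPy, as an analogy rather than Caffe's actual implementation: writing the result back into the input buffer means no second array is allocated.

```python
import numpy as np

ip1 = np.array([-1.5, 0.0, 2.0, -0.25, 3.5], dtype=np.float32)
buffer_before = ip1.__array_interface__["data"][0]  # address of the data buffer

# out=ip1 writes the result back into the same buffer: no second array is allocated.
np.maximum(ip1, 0, out=ip1)

same_buffer = ip1.__array_interface__["data"][0] == buffer_before
print(ip1, same_buffer)
```

This only works because each output element depends on the input element at the same position and nothing else.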

After the ReLU layer comes another fully connected layer:

layer {
  name: "ip2"
  type: "InnerProduct"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "ip1"
  top: "ip2"
}

Writing the loss layer (the loss is what drives the parameter optimization)

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
}

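The SoftmaxWithLoss layer implements both the softmax and the multinomial logistic loss in a single layer, which saves time. It takes two blobs: the first is the prediction (ip2) and the second is the label provided by the data layer. It produces no output for later layers; all it does is compute the loss value and start backpropagation, which is what optimizes the parameters. This is where the magic starts (the magic being the parameter optimization).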
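
To make the combined operation concrete, here is a NumPy sketch of softmax followed by the multinomial logistic loss, averaged over a batch. It is an illustration under stated assumptions (integer class labels, a hypothetical function name softmax_with_loss) and not Caffe's code.

```python
import numpy as np


def softmax_with_loss(scores, labels):
    """Mean multinomial logistic loss of softmax(scores) against integer labels."""
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the correct class for each sample.
    return -log_probs[np.arange(len(labels)), labels].mean()


# Ten equal scores mean a uniform prediction, so the loss is ln(10).
scores = np.zeros((2, 10))
labels = np.array([3, 7])
uniform_loss = softmax_with_loss(scores, labels)
print(uniform_loss)
```

A network that has learned nothing yet sits near ln(10), about 2.3, on a ten-class problem, which gives a useful reference point when reading the loss values in a training log.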

Writing layer rules

layer {
  # ...layer definition...
  include: { phase: TRAIN }
}

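The include rule specifies the phase (TRAIN or TEST) in which the layer is used. A layer without a rule is included in every phase.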
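
The selection logic can be pictured with a few lines of plain Python. This is a toy model, not how Caffe parses a prototxt: the dictionaries, the include_phase key and the names mnist_train and mnist_test are hypothetical (in the real file both data layers are named mnist).

```python
layers = [
    {"name": "mnist_train", "include_phase": "TRAIN"},
    {"name": "mnist_test", "include_phase": "TEST"},
    {"name": "conv1", "include_phase": None},   # no rule: present in every phase
    {"name": "accuracy", "include_phase": "TEST"},
    {"name": "loss", "include_phase": None},
]


def layers_for_phase(layers, phase):
    """Keep layers with no include rule or whose rule matches the phase."""
    return [layer["name"] for layer in layers
            if layer["include_phase"] is None or layer["include_phase"] == phase]


print(layers_for_phase(layers, "TRAIN"))
print(layers_for_phase(layers, "TEST"))
```

This is how a single file can describe both the training network and the test network: the two data layers and the accuracy layer are swapped in or out by phase, while everything else is shared.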

The rest is omitted here.
