Preface
There are plenty of U-Net tutorials online, but most of them do not support multi-band input (besides RGB, remote sensing images often carry near-infrared and other bands), and for multi-class tasks the class colors have to be typed in by hand when one-hot encoding the labels. This post addresses both problems.
Questions are welcome in the comments; if you find this useful, a quick upvote & like is appreciated.
If your Keras environment is not set up yet, please start with this post of mine:
馨意: Win10 Keras/TensorFlow-GPU deep learning environment setup
If you have not made your labels yet, please start with this post:
馨意: Making semantic segmentation labels for remote sensing imagery with ArcGIS
If your images have not been cropped into deep learning samples yet, please start with this post:
馨意: Cropping remote sensing images into deep learning samples with Python (multi-band supported)
If your samples have not been augmented yet, please start with this post:
馨意: Data augmentation for remote sensing semantic segmentation with NumPy (multi-band supported)
Contents
- Reading image data
- Image preprocessing
- Building the U-Net model
- Model training
- Plotting loss/accuracy curves
- Model prediction
- Predicting large remote sensing images
- Accuracy assessment
- Full code, by file
Main Text
1 Reading Image Data
First, we need to read the image's pixel matrix. To support multiple bands, we read it with GDAL:
# GDAL 3.x ships the module inside the osgeo package
try:
    from osgeo import gdal
except ImportError:
    import gdal

# Read an image's pixel matrix
# fileName  image file name
def readTif(fileName):
    dataset = gdal.Open(fileName)
    width = dataset.RasterXSize
    height = dataset.RasterYSize
    # Returned as (bands, height, width) for multi-band images
    GdalImg_data = dataset.ReadAsArray(0, 0, width, height)
    return GdalImg_data
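A minimal usage sketch, assuming a hypothetical 4-band file example.tif:

img = readTif("example.tif")
print(img.shape)  # e.g. (4, 512, 512): (bands, height, width)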
2 Image Preprocessing
After reading the image, we preprocess it:
- Normalize the image. Here we use max-value normalization, i.e. divide by the maximum value (255 for 8-bit data);
- One-hot encode the label, i.e. turn each class in the flat label into its own layer of 0s and 1s.
import numpy as np

# Data preprocessing: image normalization + label one-hot encoding
# img             image data
# label           label data
# classNum        total number of classes (background included)
# colorDict_GRAY  gray-value color dictionary
def dataPreprocess(img, label, classNum, colorDict_GRAY):
    # Normalization
    img = img / 255.0
    # Map each class color to its class index
    for i in range(colorDict_GRAY.shape[0]):
        label[label == colorDict_GRAY[i][0]] = i
    # Expand the label depth to classNum layers (background included)
    new_label = np.zeros(label.shape + (classNum,))
    # Turn each class of the flat label into its own layer
    for i in range(classNum):
        new_label[label == i, i] = 1
    label = new_label
    return (img, label)
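As a quick sanity check of the one-hot logic, here is a toy example, assuming two classes with gray values 0 and 255:

# Toy check of dataPreprocess, assuming colorDict_GRAY = ((0,), (255,))
colorDict_GRAY = np.array([[0], [255]], dtype = np.uint8)
img = np.zeros((2, 2, 4), dtype = np.uint8)              # dummy 4-band image
label = np.array([[0, 255], [255, 0]], dtype = np.uint8)
img, label = dataPreprocess(img, label, 2, colorDict_GRAY)
print(label.shape)    # (2, 2, 2): one 0/1 layer per class
print(label[..., 1])  # [[0. 1.] [1. 0.]]: the mask of the 255 class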
Note the colorDict_GRAY parameter in the function: it is the color dictionary of our classes. With only two classes, for example, colorDict_GRAY = (0, 255). Typing this in by hand gets tedious as the number of classes grows, so we have the program derive colorDict_GRAY automatically:
import os
import cv2
import numpy as np

# Build the color dictionary
# labelFolder  label folder; we traverse the whole folder because a single
#              label image may not contain every class color
# classNum     total number of classes (background included)
def color_dict(labelFolder, classNum):
    colorDict = []
    # List the files in the folder
    ImageNameList = os.listdir(labelFolder)
    for i in range(len(ImageNameList)):
        ImagePath = labelFolder + "/" + ImageNameList[i]
        img = cv2.imread(ImagePath).astype(np.uint32)
        # If grayscale, convert to 3 channels
        if(len(img.shape) == 2):
            img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB).astype(np.uint32)
        # Pack the three channels into a single number so unique values can be
        # extracted (channels are in BGR order, as read by OpenCV)
        img_new = img[:,:,0] * 1000000 + img[:,:,1] * 1000 + img[:,:,2]
        unique = np.unique(img_new)
        # Add the unique values of the i-th image to colorDict
        for j in range(unique.shape[0]):
            colorDict.append(unique[j])
        # Deduplicate the values gathered from the first i images
        colorDict = sorted(set(colorDict))
        # Once the number of unique values equals classNum (background
        # included), stop traversing the remaining images
        if(len(colorDict) == classNum):
            break
    # Color dictionary in RGB form, used to render the prediction results
    colorDict_RGB = []
    for k in range(len(colorDict)):
        # Left-pad the packed number to nine digits
        # (e.g. 5201111 -> 005201111, i.e. the color (5, 201, 111))
        color = str(colorDict[k]).rjust(9, '0')
        # First 3 digits, middle 3 digits, last 3 digits are the 3 channels
        color_RGB = [int(color[0 : 3]), int(color[3 : 6]), int(color[6 : 9])]
        colorDict_RGB.append(color_RGB)
    # Convert to numpy
    colorDict_RGB = np.array(colorDict_RGB)
    # Color dictionary in gray form, used for one-hot encoding in preprocessing
    colorDict_GRAY = colorDict_RGB.reshape((colorDict_RGB.shape[0], 1, colorDict_RGB.shape[1])).astype(np.uint8)
    colorDict_GRAY = cv2.cvtColor(colorDict_GRAY, cv2.COLOR_BGR2GRAY)
    return colorDict_RGB, colorDict_GRAY
Besides colorDict_GRAY, color_dict also returns colorDict_RGB, which is used to render the predictions in color.
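A minimal usage sketch, using the Data/train/label folder from the project layout in section 9:

colorDict_RGB, colorDict_GRAY = color_dict("Data/train/label", 2)
print(colorDict_RGB)   # e.g. [[  0   0   0]
                       #       [255 255 255]] for a black/white binary label
print(colorDict_GRAY)  # e.g. [[  0]
                       #       [255]]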
We train with keras.Model.fit_generator(), so we need a training data generator. The generator built into Keras does not support multi-band images, so we write our own:
import random

# Training data generator
# batch_size        batch size
# train_image_path  training image path
# train_label_path  training label path
# classNum          total number of classes (background included)
# colorDict_GRAY    gray-value color dictionary
# resize_shape      target size for resizing
def trainGenerator(batch_size, train_image_path, train_label_path, classNum, colorDict_GRAY, resize_shape = None):
    imageList = os.listdir(train_image_path)
    labelList = os.listdir(train_label_path)
    img = readTif(train_image_path + "/" + imageList[0])
    # GDAL reads data as (BandNum, Height, Width); convert to (Height, Width, BandNum)
    img = img.swapaxes(1, 0)
    img = img.swapaxes(1, 2)
    # Generate data indefinitely
    while(True):
        img_generator = np.zeros((batch_size, img.shape[0], img.shape[1], img.shape[2]), np.uint8)
        label_generator = np.zeros((batch_size, img.shape[0], img.shape[1]), np.uint8)
        if(resize_shape is not None):
            img_generator = np.zeros((batch_size, resize_shape[0], resize_shape[1], resize_shape[2]), np.uint8)
            label_generator = np.zeros((batch_size, resize_shape[0], resize_shape[1]), np.uint8)
        # Pick a random starting point for this batch
        rand = random.randint(0, len(imageList) - batch_size)
        for j in range(batch_size):
            img = readTif(train_image_path + "/" + imageList[rand + j])
            img = img.swapaxes(1, 0)
            img = img.swapaxes(1, 2)
            # Resize the image to the target size
            # (resize is rarely needed, so I used OpenCV here; it does not
            # support multi-band input -- use numpy to resize if you need that)
            if(resize_shape is not None):
                img = cv2.resize(img, (resize_shape[0], resize_shape[1]))
            img_generator[j] = img
            label = readTif(train_label_path + "/" + labelList[rand + j]).astype(np.uint8)
            # If the label is in color, convert it to grayscale
            if(len(label.shape) == 3):
                label = label.swapaxes(1, 0)
                label = label.swapaxes(1, 2)
                label = cv2.cvtColor(label, cv2.COLOR_RGB2GRAY)
            if(resize_shape is not None):
                label = cv2.resize(label, (resize_shape[0], resize_shape[1]))
            label_generator[j] = label
        img_generator, label_generator = dataPreprocess(img_generator, label_generator, classNum, colorDict_GRAY)
        yield (img_generator, label_generator)
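Before training, it is worth pulling one batch from the generator to confirm the shapes; a minimal sketch, assuming 4-band 512x512 samples in the Data/ layout from section 9:

gen = trainGenerator(2, "Data/train/image", "Data/train/label", 2, colorDict_GRAY)
img_batch, label_batch = next(gen)
print(img_batch.shape)    # e.g. (2, 512, 512, 4): (batch, height, width, bands)
print(label_batch.shape)  # e.g. (2, 512, 512, 2): (batch, height, width, classNum)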
3 Building the U-Net Model
Here we add BatchNormalization and Dropout layers to U-Net, use the Adam optimizer, and use cross-entropy as the loss function. Implemented with Keras:
from keras.models import Model
from keras.layers import Input, BatchNormalization, Conv2D, MaxPooling2D, Dropout, UpSampling2D
# concatenate exists in Keras 2.x; merge only in Keras 1.x -- import whichever is available
try:
    from keras.layers import concatenate
except ImportError:
    concatenate = None
try:
    from keras.layers import merge
except ImportError:
    merge = None
from keras.optimizers import Adam

def unet(pretrained_weights = None, input_size = (256, 256, 4), classNum = 2, learning_rate = 1e-5):
    inputs = Input(input_size)
    # 2D convolution layers
    conv1 = BatchNormalization()(Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs))
    conv1 = BatchNormalization()(Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1))
    # Max pooling over the spatial dimensions
    pool1 = MaxPooling2D(pool_size = (2, 2))(conv1)
    conv2 = BatchNormalization()(Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1))
    conv2 = BatchNormalization()(Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2))
    pool2 = MaxPooling2D(pool_size = (2, 2))(conv2)
    conv3 = BatchNormalization()(Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2))
    conv3 = BatchNormalization()(Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3))
    pool3 = MaxPooling2D(pool_size = (2, 2))(conv3)
    conv4 = BatchNormalization()(Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3))
    conv4 = BatchNormalization()(Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4))
    # Dropout regularization to reduce overfitting
    drop4 = Dropout(0.5)(conv4)
    pool4 = MaxPooling2D(pool_size = (2, 2))(drop4)
    conv5 = BatchNormalization()(Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4))
    conv5 = BatchNormalization()(Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5))
    drop5 = Dropout(0.5)(conv5)
    # Convolution after upsampling, similar to a transposed convolution
    up6 = Conv2D(512, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2, 2))(drop5))
    # Skip connection: concatenate in Keras 2.x, merge in Keras 1.x
    try:
        merge6 = concatenate([drop4, up6], axis = 3)
    except:
        merge6 = merge([drop4, up6], mode = 'concat', concat_axis = 3)
    conv6 = BatchNormalization()(Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge6))
    conv6 = BatchNormalization()(Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv6))
    up7 = Conv2D(256, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2, 2))(conv6))
    try:
        merge7 = concatenate([conv3, up7], axis = 3)
    except:
        merge7 = merge([conv3, up7], mode = 'concat', concat_axis = 3)
    conv7 = BatchNormalization()(Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge7))
    conv7 = BatchNormalization()(Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv7))
    up8 = Conv2D(128, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2, 2))(conv7))
    try:
        merge8 = concatenate([conv2, up8], axis = 3)
    except:
        merge8 = merge([conv2, up8], mode = 'concat', concat_axis = 3)
    conv8 = BatchNormalization()(Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge8))
    conv8 = BatchNormalization()(Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv8))
    up9 = Conv2D(64, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2, 2))(conv8))
    try:
        merge9 = concatenate([conv1, up9], axis = 3)
    except:
        merge9 = merge([conv1, up9], mode = 'concat', concat_axis = 3)
    conv9 = BatchNormalization()(Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge9))
    conv9 = BatchNormalization()(Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9))
    conv9 = Conv2D(2, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    # 1x1 convolution + softmax to map to classNum class probabilities
    conv10 = Conv2D(classNum, 1, activation = 'softmax')(conv9)
    model = Model(inputs = inputs, outputs = conv10)
    # Configure the model for training (optimizer, loss, metrics)
    model.compile(optimizer = Adam(lr = learning_rate), loss = 'categorical_crossentropy', metrics = ['accuracy'])
    # Load pretrained weights if provided
    if(pretrained_weights):
        model.load_weights(pretrained_weights)
    return model
(Figure: the modified U-Net architecture)
4 Model Training
With everything above in place, we can write the training code:
import os
import datetime
from keras.callbacks import ModelCheckpoint

'''
Dataset parameters
'''
# Training image path
train_image_path = "Data/train/image"
# Training label path
train_label_path = "Data/train/label"
# Validation image path
validation_image_path = "Data/validation/image"
# Validation label path
validation_label_path = "Data/validation/label"
'''
Model parameters
'''
# Batch size
batch_size = 2
# Number of classes (background included)
classNum = 2
# Model input image size
input_size = (512, 512, 3)
# Total number of training epochs
epochs = 100
# Initial learning rate
learning_rate = 1e-4
# Pretrained model path
premodel_path = None
# Where to save the trained model
model_path = "Model/unet_model.hdf5"
# Number of training samples
train_num = len(os.listdir(train_image_path))
# Number of validation samples
validation_num = len(os.listdir(validation_image_path))
# Batches per epoch for the training set
steps_per_epoch = train_num // batch_size
# Batches per epoch for the validation set
validation_steps = validation_num // batch_size
# Label color dictionary, used for one-hot encoding
colorDict_RGB, colorDict_GRAY = color_dict(train_label_path, classNum)
# Generator that yields training data one batch at a time
train_Generator = trainGenerator(batch_size,
                                 train_image_path,
                                 train_label_path,
                                 classNum,
                                 colorDict_GRAY,
                                 input_size)
# Generator that yields validation data one batch at a time
validation_data = trainGenerator(batch_size,
                                 validation_image_path,
                                 validation_label_path,
                                 classNum,
                                 colorDict_GRAY,
                                 input_size)
# Build the model
model = unet(pretrained_weights = premodel_path,
             input_size = input_size,
             classNum = classNum,
             learning_rate = learning_rate)
# Print the model structure
model.summary()
# Callback: save the best model by training loss
model_checkpoint = ModelCheckpoint(model_path,
                                   monitor = 'loss',
                                   verbose = 1,  # log mode: 0 -> silent, 1 -> progress bar, 2 -> one line per epoch
                                   save_best_only = True)
# Record the start time
start_time = datetime.datetime.now()
# Train the model
history = model.fit_generator(train_Generator,
                              steps_per_epoch = steps_per_epoch,
                              epochs = epochs,
                              callbacks = [model_checkpoint],
                              validation_data = validation_data,
                              validation_steps = validation_steps)
# Total training time
end_time = datetime.datetime.now()
log_time = "Total training time: " + str((end_time - start_time).total_seconds() / 60) + "m"
print(log_time)
with open('TrainTime.txt', 'w') as f:
    f.write(log_time)
5 Plotting the Loss/Accuracy Curves
model.fit_generator() returns a History object containing the loss and accuracy records, which we save with xlwt and plot with matplotlib:
import xlwt
import matplotlib.pyplot as plt

# Save and plot loss and accuracy
# (note: newer Keras versions use the key 'accuracy' instead of 'acc')
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
# Write the per-epoch values to an Excel sheet
book = xlwt.Workbook(encoding = 'utf-8', style_compression = 0)
sheet = book.add_sheet('test', cell_overwrite_ok = True)
for i in range(len(acc)):
    sheet.write(i, 0, acc[i])
    sheet.write(i, 1, val_acc[i])
    sheet.write(i, 2, loss[i])
    sheet.write(i, 3, val_loss[i])
book.save(r'AccAndLoss.xls')
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, 'r', label = 'Training acc')
plt.plot(epochs, val_acc, 'b', label = 'Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.savefig("accuracy.png", dpi = 300)
plt.figure()
plt.plot(epochs, loss, 'r', label = 'Training loss')
plt.plot(epochs, val_loss, 'b', label = 'Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.savefig("loss.png", dpi = 300)
plt.show()
(Figures: training/validation loss and accuracy curves)
6 Model Prediction
At prediction time, the data format must match what the model saw during training, so we again use a generator:
# Test data generator
# test_image_path  test image path
# resize_shape     target size for resizing
def testGenerator(test_image_path, resize_shape = None):
    imageList = os.listdir(test_image_path)
    for i in range(len(imageList)):
        img = readTif(test_image_path + "/" + imageList[i])
        img = img.swapaxes(1, 0)
        img = img.swapaxes(1, 2)
        # Normalization
        img = img / 255.0
        if(resize_shape is not None):
            # Resize the image to the target size
            img = cv2.resize(img, (resize_shape[0], resize_shape[1]))
        # Add a batch dimension to match the training input [batch_size, img.shape]
        img = np.reshape(img, (1,) + img.shape)
        yield img
The model's predictions need to be one-hot decoded, rendered with the class colors, and saved:
# Save the prediction results
# test_image_path    test image path
# test_predict_path  path for the predicted results
# model_predict      the model's predictions
# color_dict         color dictionary
# output_size        output image size
def saveResult(test_image_path, test_predict_path, model_predict, color_dict, output_size):
    imageList = os.listdir(test_image_path)
    for i, img in enumerate(model_predict):
        # One-hot decoding: take the class with the highest score per pixel
        channel_max = np.argmax(img, axis = -1)
        img_out = np.uint8(color_dict[channel_max.astype(np.uint8)])
        # Use nearest-neighbor interpolation so no new colors are introduced
        img_out = cv2.resize(img_out, (output_size[0], output_size[1]), interpolation = cv2.INTER_NEAREST)
        # Save as losslessly compressed PNG
        cv2.imwrite(test_predict_path + "/" + imageList[i][:-4] + ".png", img_out)
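Putting the two pieces together, a minimal prediction sketch might look like this (paths and sizes follow the training script; predict_generator is the old-Keras counterpart of predict):

test_image_path = "Data/test/image"
test_predict_path = "Data/test/predict"
test_num = len(os.listdir(test_image_path))
# Build the model and load the trained weights
model = unet(pretrained_weights = "Model/unet_model.hdf5",
             input_size = (512, 512, 3),
             classNum = 2)
testGene = testGenerator(test_image_path, resize_shape = (512, 512, 3))
# One image per step; verbose = 1 shows a progress bar
results = model.predict_generator(testGene, test_num, verbose = 1)
# Render the softmax outputs back to label colors and save as PNG
saveResult(test_image_path, test_predict_path, results, colorDict_RGB, (512, 512))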
7 Predicting Large Remote Sensing Images
Feeding a large remote sensing image straight into the network would run out of memory, so the image is usually cropped into a series of smaller tiles that are predicted one by one, and the predictions are then stitched back together in cropping order into the final result. See my post below for details:
馨意: Edge-ignoring (sliding-window) deep learning prediction for large remote sensing images
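The linked post handles window overlap and edge effects properly; as a minimal illustration only, a plain non-overlapping sliding-window sketch could look like this:

import numpy as np

# Minimal non-overlapping sliding-window prediction sketch;
# real use should overlap windows to suppress edge artifacts
def predict_big_image(model, big_img, crop_size = 512):
    # big_img: (height, width, bands), already normalized to [0, 1]
    h, w = big_img.shape[0], big_img.shape[1]
    result = np.zeros((h, w), np.uint8)
    for r in range(0, h - crop_size + 1, crop_size):
        for c in range(0, w - crop_size + 1, crop_size):
            patch = big_img[r : r + crop_size, c : c + crop_size]
            # predict expects a batch dimension: (1, crop, crop, bands)
            pred = model.predict(patch[np.newaxis, ...])
            result[r : r + crop_size, c : c + crop_size] = np.argmax(pred[0], axis = -1)
    return result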
8 Accuracy Assessment
We assess accuracy with precision, recall, F1-score, intersection-over-union (IoU), mean intersection-over-union (mIoU), frequency-weighted intersection-over-union (FWIoU), and similar indicators. See my post below for details:
馨意: Common accuracy metrics for remote sensing semantic segmentation and their Python implementations
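All of these indicators can be computed from the confusion matrix; a minimal sketch of the IoU family (the linked post gives the full implementations):

import numpy as np

# Confusion matrix: rows are ground truth, columns are predictions
def confusion_matrix(label, pred, classNum):
    mask = (label >= 0) & (label < classNum)
    return np.bincount(classNum * label[mask].astype(int) + pred[mask],
                       minlength = classNum ** 2).reshape(classNum, classNum)

# Per-class IoU: TP / (TP + FP + FN)
def iou_per_class(cm):
    intersection = np.diag(cm)
    union = cm.sum(axis = 0) + cm.sum(axis = 1) - intersection
    return intersection / union

# mIoU: the mean of the per-class IoUs
def mean_iou(cm):
    return np.nanmean(iou_per_class(cm))

# FWIoU: per-class IoU weighted by class frequency
def frequency_weighted_iou(cm):
    freq = cm.sum(axis = 1) / cm.sum()
    iou = iou_per_class(cm)
    return (freq[freq > 0] * iou[freq > 0]).sum()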
9 Full Code
The project is laid out as follows:
Data
-----train
----------image
----------label
-----validation
----------image
----------label
-----test
----------image
----------label
----------predict
Model
-----seg_unet.py
-----unet_model
dataProcess.py
train.py
test.py
seg_metrics.py
https://github.com/WangZhenqing-RS/Unet_RSimage_Multi-band_Multi-class
Postscript
I will keep updating this post as post-processing and other improvements are polished. Questions are welcome in the comments; if you find this useful, a quick upvote & like is appreciated.