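I have recently been using deep learning for lane line detection; here is a summary.
Lane detection methods currently fall into two broad categories: those based on traditional machine vision and those based on deep learning.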
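I. Methods based on traditional machine vision
1. Edge detection + Hough transform
Pipeline: convert the color image to grayscale, blur, detect edges, apply the Hough transform.
In simple scenes this approach can usually detect the two lane lines of the lane the vehicle is currently driving in, and occasionally those of adjacent lanes (depending on the angle of the front-facing camera). The slopes of the lines returned by the Hough transform can then be used to separate the left and right lane lines. The method also depends on the quality of the edge detection, so parameter tuning (for both edge detection and the Hough transform) and other tricks such as ROI selection matter a great deal.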
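The slope-based filtering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact code behind the results in this post: it assumes the Hough step has already produced line segments as (x1, y1, x2, y2) endpoints in image coordinates (the kind of endpoints cv2.HoughLinesP reports), and the function name split_lanes_by_slope and the min_abs_slope cutoff are hypothetical choices.

```python
def split_lanes_by_slope(segments, min_abs_slope=0.3):
    """Split Hough segments into left and right lane candidates by slope.

    Image coordinates have y growing downward, so a left lane line
    (bottom-left toward the center) has a negative slope and a right
    lane line has a positive slope.
    """
    left, right = [], []
    for x1, y1, x2, y2 in segments:
        if x1 == x2:
            continue  # vertical segment: slope undefined, skipped in this sketch
        slope = (y2 - y1) / (x2 - x1)
        if abs(slope) < min_abs_slope:
            continue  # near-horizontal segment: unlikely to be a lane line
        if slope < 0:
            left.append((x1, y1, x2, y2))
        else:
            right.append((x1, y1, x2, y2))
    return left, right


# Hypothetical segments for a 1280 x 720 frame
segments = [
    (100, 700, 500, 400),   # left lane candidate
    (800, 400, 1200, 700),  # right lane candidate
    (0, 500, 1000, 505),    # near-horizontal clutter
    (300, 100, 300, 200),   # vertical clutter
]
left, right = split_lanes_by_slope(segments)
```

The cutoff value would need tuning for a given camera angle, just like the edge detection and Hough parameters.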
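2. Color thresholding
Pipeline: convert the image to another color space (usually HSV), set a threshold on each channel of the new color space (values above the threshold become 1, values below become 0), and take the result.
This method depends on the choice of per-channel thresholds, so only a few threshold parameters need tuning. In my view, though, its robustness is poor: for example, a vehicle ahead of the ego vehicle may be set entirely to 1.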
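A minimal NumPy sketch of the per-channel thresholding follows. It assumes the image has already been converted to HSV (for example with cv2.cvtColor) and generalizes the single threshold to a lower and upper bound per channel, which is also what cv2.inRange does (except that cv2.inRange marks matches with 255 rather than 1). The function name threshold_mask and the bound values are hypothetical and would need tuning on real data.

```python
import numpy as np


def threshold_mask(hsv, lower, upper):
    """Return a 0/1 mask that is 1 where every channel lies within [lower, upper]."""
    hsv = np.asarray(hsv)
    inside = (hsv >= np.asarray(lower)) & (hsv <= np.asarray(upper))
    # A pixel is kept only if all three channels pass their thresholds
    return inside.all(axis=-1).astype(np.uint8)


# Tiny hypothetical 2 x 2 HSV image
hsv = np.array([[[20, 200, 220], [0, 0, 255]],
                [[100, 50, 50], [25, 90, 200]]], dtype=np.uint8)
# Hypothetical bounds for yellow-ish markings
mask = threshold_mask(hsv, lower=(15, 100, 100), upper=(35, 255, 255))
```

Setting an upper bound to the channel maximum reproduces the plain "above the threshold is 1" rule from the text.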
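3. Perspective transform
Pipeline: obtain the perspective transform matrix, apply the perspective transform, then detect lane lines (with method 1 or 2).
The advantage of this method is that it converts the front-facing camera image into a bird's-eye view, in which multiple lane lines can be detected. Setting aside the lane detection that follows, the key is the accuracy of the perspective transform matrix. Either of the two methods above can then be used to detect lane lines in the bird's-eye view.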
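To make the role of the transform matrix concrete, here is a small NumPy sketch that solves for the 3 x 3 matrix from four point correspondences and maps points through it. In practice cv2.getPerspectiveTransform and cv2.warpPerspective do this job on whole images; the function names and the trapezoid coordinates below are hypothetical and only illustrate the arithmetic.

```python
import numpy as np


def homography_from_points(src, dst):
    """Solve for H (with H[2, 2] fixed to 1) that maps 4 src points onto 4 dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def warp_points(H, pts):
    """Apply H to an (N, 2) array of points, including the homogeneous division."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical road trapezoid in a 1280 x 720 frame and its bird's-eye rectangle
src = [(560, 460), (720, 460), (1100, 700), (200, 700)]
dst = [(300, 0), (980, 0), (980, 720), (300, 720)]
H = homography_from_points(src, dst)
warped = warp_points(H, src)
```

Because the matrix is determined entirely by the four chosen point pairs, small errors in picking them distort the whole bird's-eye view, which is why the accuracy of the matrix matters so much.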
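In real-world scenes the robustness of traditional methods is indeed inadequate. Besides lighting and nearby vehicles, the direction arrows painted in the middle of the lane and pedestrian crossings are also hard for these algorithms to handle.
For a fuller summary of the methods above, see the referenced blog post.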
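II. Methods based on deep learning
Here I mainly describe how to train a deep neural network to semantically segment lane lines and thereby detect them. SegNet is an interesting image segmentation architecture with an encoder-decoder structure, and its standard model structure can be used directly. The network I trained is built on SegNet.
The initial network structure is shown in the figure below: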
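1. Dataset
Original image:
Segmentation map (label):
Download the original images here (https://www.dropbox.com/s/rrh8lrdclzlnxzv/full_CNN_train.p?dl=0)
Download the dataset labels here (https://www.dropbox.com/s/ak850zqqfy6ily0/full_CNN_labels.p?dl=0)
The first version of the network model is as follows: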
""" This file contains code for a fully convolutional
(i.e. contains zero fully connected layers) neural network
for detecting lanes. This version assumes the inputs
to be road images in the shape of 80 x 160 x 3 (RGB) with
the labels as 80 x 160 x 1 (just the G channel with a
re-drawn lane). Note that in order to view a returned image,
the prediction is later stacked with zeroed R and B layers
and added back to the initial road image.
"""
import numpy as np
import pickle
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
# Import necessary items from Keras
from keras.models import Sequential
from keras.layers import Activation, Dropout, UpSampling2D, concatenate
from keras.layers import Conv2DTranspose, Conv2D, MaxPooling2D
from keras.layers import BatchNormalization
from keras.preprocessing.image import ImageDataGenerator
from keras import regularizers
# Load training images
train_images = pickle.load(open("full_CNN_train.p", "rb" ))
# Load image labels
labels = pickle.load(open("full_CNN_labels.p", "rb" ))
# Make into arrays as the neural network wants these
train_images = np.array(train_images)
labels = np.array(labels)
# Normalize labels - training images get normalized to start in the network
labels = labels / 255
# Shuffle images along with their labels, then split into training/validation sets
train_images, labels = shuffle(train_images, labels)
# Test size may be 10% or 20%
X_train, X_val, y_train, y_val = train_test_split(train_images, labels, test_size=0.1)
# Batch size, epochs and pool size below are all parameters to fiddle with for optimization
batch_size = 16
epochs = 10
pool_size = (2, 2)
input_shape = X_train.shape[1:]
### Here is the actual neural network ###
model = Sequential()
# Normalizes incoming inputs. First layer needs the input shape to work
model.add(BatchNormalization(input_shape=input_shape))
# Below layers were re-named for easier reading of model summary; this is not necessary
# Conv Layer 1
model.add(Conv2D(8, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Conv1'))
# Conv Layer 2
model.add(Conv2D(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Conv2'))
# Pooling 1
model.add(MaxPooling2D(pool_size=pool_size))
# Conv Layer 3
model.add(Conv2D(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Conv3'))
model.add(Dropout(0.2))
# Conv Layer 4
model.add(Conv2D(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Conv4'))
model.add(Dropout(0.2))
# Conv Layer 5
model.add(Conv2D(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Conv5'))
model.add(Dropout(0.2))
# Pooling 2
model.add(MaxPooling2D(pool_size=pool_size))
# Conv Layer 6
model.add(Conv2D(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Conv6'))
model.add(Dropout(0.2))
# Conv Layer 7
model.add(Conv2D(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Conv7'))
model.add(Dropout(0.2))
# Pooling 3
model.add(MaxPooling2D(pool_size=pool_size))
# Upsample 1
model.add(UpSampling2D(size=pool_size))
# Deconv 1
model.add(Conv2DTranspose(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Deconv1'))
model.add(Dropout(0.2))
# Deconv 2
model.add(Conv2DTranspose(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Deconv2'))
model.add(Dropout(0.2))
# Upsample 2
model.add(UpSampling2D(size=pool_size))
# Deconv 3
model.add(Conv2DTranspose(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Deconv3'))
model.add(Dropout(0.2))
# Deconv 4
model.add(Conv2DTranspose(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Deconv4'))
model.add(Dropout(0.2))
# Deconv 5
model.add(Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Deconv5'))
model.add(Dropout(0.2))
# Upsample 3
model.add(UpSampling2D(size=pool_size))
# Deconv 6
model.add(Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Deconv6'))
# Final layer - only including one channel so 1 filter
model.add(Conv2DTranspose(1, (3, 3), padding='valid', strides=(1,1), activation = 'relu', name = 'Final'))
### End of network ###
# Using a generator to help the model use less data
# Channel shifts help with shadows slightly
datagen = ImageDataGenerator(channel_shift_range=0.2)
datagen.fit(X_train)
# Compiling and training the model
model.compile(optimizer='Adam', loss='mean_squared_error')
# steps_per_epoch must be a whole number of batches
model.fit_generator(datagen.flow(X_train, y_train, batch_size=batch_size),
                    steps_per_epoch=len(X_train) // batch_size,
                    epochs=epochs, verbose=1, validation_data=(X_val, y_val))
# Freeze layers since training is done
model.trainable = False
model.compile(optimizer='Adam', loss='mean_squared_error')
# Save model architecture and weights
model.save('full_CNN_model.h5')
# Show summary of model
model.summary()
The model above follows this reference (https://github.com/mvirgo/MLND-Capstone)
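To see why the first model above, built entirely from padding='valid' layers, still produces an output matching the 80 x 160 label size, the spatial sizes can be traced layer by layer. The sketch below is only bookkeeping and does not use Keras; the function name trace_shapes and the layer-kind strings are hypothetical shorthand for the layers in the listing.

```python
def trace_shapes(height, width, layers):
    """Return the (height, width) after each layer of a simple encoder-decoder."""
    shapes = [(height, width)]
    for kind in layers:
        if kind == "conv":      # 3x3 Conv2D, padding='valid', stride 1: loses 2 pixels
            height, width = height - 2, width - 2
        elif kind == "pool":    # 2x2 MaxPooling2D: halves each side (rounding down)
            height, width = height // 2, width // 2
        elif kind == "up":      # UpSampling2D with size (2, 2): doubles each side
            height, width = height * 2, width * 2
        elif kind == "deconv":  # 3x3 Conv2DTranspose, padding='valid', stride 1: gains 2
            height, width = height + 2, width + 2
        else:
            raise ValueError("unknown layer kind: " + kind)
        shapes.append((height, width))
    return shapes


# Layer order of the first model, from Conv1 through Final
FIRST_MODEL = (["conv", "conv", "pool",
                "conv", "conv", "conv", "pool",
                "conv", "conv", "pool"]
               + ["up", "deconv", "deconv",
                  "up", "deconv", "deconv", "deconv",
                  "up", "deconv", "deconv"])

shapes = trace_shapes(80, 160, FIRST_MODEL)
bottleneck = shapes[10]   # size after the third pooling layer
output_size = shapes[-1]  # size after the Final layer
```

Each decoder stage mirrors an encoder stage, so the pixels lost to valid convolutions are regained by the transposed convolutions.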
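III. Improved network model
The network model above does not give stable results on lane detection, so I improved it.
1. Increased the number of layers and changed the number of convolution kernels in each layer.
2. Inspired by U-Net, added parallel skip connections that concatenate the encoder feature maps with the decoder feature maps, so that information from multiple layers is available when making the per-pixel prediction.
3. Replaced UpSampling2D with Conv2DTranspose for upsampling. UpSampling2D simply fills in by repeating the original pixel values and has nothing to learn, whereas Conv2DTranspose has trainable weights and gives better results.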
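The difference in point 3 can be illustrated with a small NumPy sketch for a single channel. It assumes a 2 x 2 kernel with stride 2 and no bias, the same kernel size and stride used for the upsampling Conv2DTranspose layers in the improved model; in that case each input pixel is simply multiplied by the kernel and written into its own 2 x 2 output block. The function names are hypothetical, and real layers also mix channels and add a bias.

```python
import numpy as np


def upsample_nearest(x, factor=2):
    """What UpSampling2D does by default: repeat every pixel, nothing to learn."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)


def transposed_conv_stride2(x, kernel):
    """Single-channel transposed convolution whose kernel size equals its stride.

    Output block (i, j) is x[i, j] * kernel, which is exactly a Kronecker product.
    """
    return np.kron(x, kernel)


x = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# With an all-ones kernel the transposed convolution reduces to plain upsampling
same_as_upsampling = transposed_conv_stride2(x, np.ones((2, 2)))

# A different (hypothetically learned) kernel gives a different interpolation
learned = transposed_conv_stride2(x, np.array([[1.0, 0.5],
                                               [0.5, 0.25]]))
```

In other words, pixel repetition is one particular kernel among the many that the transposed convolution is free to learn.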
Training
The improved network model is as follows:
import numpy as np
import pickle
#import cv2
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
# Import necessary items from Keras
from keras.models import Model
from keras.layers import Activation, Dropout, UpSampling2D, concatenate, Input
from keras.layers import Conv2DTranspose, Conv2D, MaxPooling2D
from keras.layers import BatchNormalization
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import plot_model
from keras import regularizers
# Load training images
train_images = pickle.load(open("full_CNN_train.p", "rb" ))
# Load image labels
labels = pickle.load(open("full_CNN_labels.p", "rb" ))
# Make into arrays as the neural network wants these
train_images = np.array(train_images)
labels = np.array(labels)
# Normalize labels - training images get normalized to start in the network
labels = labels / 255
# Shuffle images along with their labels, then split into training/validation sets
train_images, labels = shuffle(train_images, labels)
# Test size may be 10% or 20%
X_train, X_val, y_train, y_val = train_test_split(train_images, labels, test_size=0.1)
# Batch size, epochs and pool size below are all parameters to fiddle with for optimization
batch_size = 16
epochs = 15
pool_size = (2, 2)
#input_shape = X_train.shape[1:]
### Here is the actual neural network ###
# Input layer: 80 x 160 RGB road images (the input BatchNormalization of the first version is disabled)
#BatchNormalization(input_shape=input_shape)
Inputs = Input(batch_shape=(None, 80, 160, 3))
# Layer outputs are kept in variables so the decoder can reuse them as skip connections
# Conv Layer 1
Conv1 = Conv2D(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Inputs)
Bat1 = BatchNormalization()(Conv1)
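# Conv Layer 2 (fed from Bat1 so the first normalization layer is actually used)
Conv2 = Conv2D(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat1)
Bat2 = BatchNormalization()(Conv2)
# Pooling 1 (fed from Bat2, matching the pattern of the later blocks)
Pool1 = MaxPooling2D(pool_size=pool_size)(Bat2)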
# Conv Layer 3
Conv3 = Conv2D(32, (3, 3), padding = 'valid', strides=(1,1), activation = 'relu')(Pool1)
#Drop3 = Dropout(0.2)(Conv3)
Bat3 = BatchNormalization()(Conv3)
# Conv Layer 4
Conv4 = Conv2D(32, (3, 3), padding = 'valid', strides=(1,1), activation = 'relu')(Bat3)
#Drop4 = Dropout(0.5)(Conv4)
Bat4 = BatchNormalization()(Conv4)
# Conv Layer 5
Conv5 = Conv2D(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat4)
#Drop5 = Dropout(0.2)(Conv5)
Bat5 = BatchNormalization()(Conv5)
# Pooling 2
Pool2 = MaxPooling2D(pool_size=pool_size)(Bat5)
# Conv Layer 6
Conv6 = Conv2D(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Pool2)
#Drop6 = Dropout(0.2)(Conv6)
Bat6 = BatchNormalization()(Conv6)
# Conv Layer 7
Conv7 = Conv2D(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat6)
#Drop7 = Dropout(0.2)(Conv7)
Bat7 = BatchNormalization()(Conv7)
# Pooling 3
Pool3 = MaxPooling2D(pool_size=pool_size)(Bat7)
# Conv Layer 8
Conv8 = Conv2D(128, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Pool3)
#Drop8 = Dropout(0.2)(Conv8)
Bat8 = BatchNormalization()(Conv8)
# Conv Layer 9
Conv9 = Conv2D(128, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat8)
#Drop9 = Dropout(0.2)(Conv9)
Bat9 = BatchNormalization()(Conv9)
# Pooling 4
Pool4 = MaxPooling2D(pool_size=pool_size)(Bat9)
# Upsample 1 to Deconv 1
Deconv1 = Conv2DTranspose(128, (2, 2), padding='valid', strides=(2,2), activation = 'relu')(Pool4)
#Up1 = UpSampling2D(size=pool_size)(Pool4)
Mer1 = concatenate([Deconv1, Bat9], axis=-1)
# Deconv 2
Deconv2 = Conv2DTranspose(128, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer1)
DBat2 = BatchNormalization()(Deconv2)
# Deconv 3
Deconv3 = Conv2DTranspose(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat2)
DBat3 = BatchNormalization()(Deconv3)
# Upsample 2 to Deconv 4
Deconv4 = Conv2DTranspose(64, (2, 2), padding='valid', strides=(2,2), activation = 'relu')(DBat3)
#Up2 = UpSampling2D(size=pool_size)(DBat2)
Mer2 = concatenate([Deconv4, Bat7], axis=-1)
# Deconv 5
Deconv5 = Conv2DTranspose(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer2)
DBat5 = BatchNormalization()(Deconv5)
# Deconv 6
Deconv6 = Conv2DTranspose(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat5)
DBat6 = BatchNormalization()(Deconv6)
# Upsample 3 to Deconv 7
Deconv7 = Conv2DTranspose(32, (2, 2), padding='valid', strides=(2,2), activation = 'relu')(DBat6)
#Up3 = UpSampling2D(size=pool_size)(DBat4)
Mer3 = concatenate([Deconv7, Bat5], axis=-1)
# Deconv 8
Deconv8 = Conv2DTranspose(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer3)
DBat8 = BatchNormalization()(Deconv8)
# Deconv 9
Deconv9 = Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat8)
DBat9 = BatchNormalization()(Deconv9)
# Deconv 10
Deconv10 = Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat9)
DBat10 = BatchNormalization()(Deconv10)
# Upsample 4 to Deconv 11
Deconv11 = Conv2DTranspose(16, (2, 2), padding='valid', strides=(2,2), activation = 'relu')(DBat10)
#Up4 = UpSampling2D(size=pool_size)(DBat7)
Mer4 = concatenate([Deconv11, Bat2], axis=-1)
# Deconv 12
Deconv12 = Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer4)
DBat12 = BatchNormalization()(Deconv12)
# Deconv 13
Deconv13 = Conv2DTranspose(8, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat12)
DBat13 = BatchNormalization()(Deconv13)
# Final layer - only including one channel so 1 filter
Final = Conv2DTranspose(1, (3, 3), padding='same', strides=(1,1), activation = 'relu')(DBat13)
### End of network ###
model = Model(inputs=Inputs, outputs=Final)
# Using a generator to help the model use less data
# Channel shifts help with shadows slightly
datagen = ImageDataGenerator(channel_shift_range=0.2)
datagen.fit(X_train)
# Compiling and training the model
model.compile(optimizer='Adam', loss='mean_squared_error')
# steps_per_epoch must be a whole number of batches
model.fit_generator(datagen.flow(X_train, y_train, batch_size=batch_size),
                    steps_per_epoch=len(X_train) // batch_size,
                    epochs=epochs, verbose=1, validation_data=(X_val, y_val))
# Freeze layers since training is done
model.trainable = False
model.compile(optimizer='Adam', loss='mean_squared_error')
# Save model architecture and weights
model.save('full_CNN_model_HYe15.h5')
# Show summary of model
model.summary()
plot_model(model, to_file='model.png')
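Testing
Once the network model is trained, test it. The detection result for each frame is the average of the detection results over the most recent five frames.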
import numpy as np
import cv2
from moviepy.editor import VideoFileClip
from keras.models import load_model
import matplotlib.pyplot as plt

# Load Keras model (the training script above saves 'full_CNN_model_HYe15.h5';
# point this at whichever file you actually trained)
model = load_model('full_CNN_model_HY.h5')

# Class to average lanes with
class Lanes():
    def __init__(self):
        self.recent_fit = []
        self.avg_fit = []

def road_lines(image):
    """ Takes in a road image, re-sizes for the model,
    predicts the lane to be drawn from the model in G color,
    recreates an RGB image of a lane and merges with the
    original road image.
    """
    # Get image ready for feeding into model.
    # scipy.misc.imresize was removed in SciPy 1.3, so cv2.resize is used instead;
    # note that cv2.resize takes the target size as (width, height)
    small_img = cv2.resize(image, (160, 80))
    small_img = small_img[None,:,:,:]
    # Make prediction with neural network (un-normalize value by multiplying by 255)
    prediction = model.predict(small_img)[0] * 255
    # Add lane prediction to list for averaging
    lanes.recent_fit.append(prediction)
    # Only using last five for average
    if len(lanes.recent_fit) > 5:
        lanes.recent_fit = lanes.recent_fit[1:]
    # Calculate average detection
    lanes.avg_fit = np.mean(np.array(lanes.recent_fit), axis = 0)
    # Generate fake R & B color dimensions, stack with G
    blanks = np.zeros_like(lanes.avg_fit)
    lane_drawn = np.dstack((blanks, lanes.avg_fit, blanks))
    # Clip to the valid pixel range and convert to uint8 so it can be blended
    lane_drawn = np.clip(lane_drawn, 0, 255).astype(np.uint8)
    # Re-size to match the original image (assumes 1280 x 720 video frames)
    lane_image = cv2.resize(lane_drawn, (1280, 720))
    #plt.imshow(lane_image)
    #plt.show()
    # Merge the lane drawing onto the original image
    result = cv2.addWeighted(image, 1, lane_image, 1, 0)
    return result

lanes = Lanes()
# Where to save the output video
vid_output = 'project_video_hy.mp4'
# Location of the input video
clip1 = VideoFileClip("project_video.mp4")
vid_clip = clip1.fl_image(road_lines)
vid_clip.write_videofile(vid_output, audio=False)
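The five-frame averaging inside road_lines can also be isolated into a small helper, which makes the smoothing easier to test on its own. This is a sketch of the same idea rather than a drop-in replacement: the class name FrameAverager is hypothetical, and it uses collections.deque with maxlen so that old predictions are discarded automatically.

```python
from collections import deque

import numpy as np


class FrameAverager:
    """Keep the most recent `window` predictions and return their mean."""

    def __init__(self, window=5):
        # A deque with maxlen drops the oldest entry once the window is full
        self.recent = deque(maxlen=window)

    def update(self, prediction):
        self.recent.append(np.asarray(prediction, dtype=float))
        return np.mean(np.stack(list(self.recent)), axis=0)


averager = FrameAverager(window=5)
# Feed seven constant "predictions" with values 1..7
outputs = [averager.update(np.full((2, 2), value)) for value in range(1, 8)]
```

For the first few frames the average is taken over however many predictions are available, which matches the behavior of the list-based version above.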
Test results