Object Detection with YOLOv5 - Early Stopping
flyfish
Early Stopping but when? YOLOv5: the v5 release had no early stopping mechanism; it was added in versions released after September 5, 2021.
EarlyStopper updates #4679 (Sep 5, 2021)
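Parameters
patience: the number of epochs to wait. If the model has not improved within that many epochs, training stops early.
fitness: the monitored value, which is expected to increase, for example mAP. If mAP has not increased for patience consecutive epochs, training stops.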
How to use it
Two steps are needed
1. Declare the stopper and initialize the patience parameter
stopper = EarlyStopping(patience=3)
2. During training, check whether to stop early
stopper(epoch=epoch, fitness=<mAP value>)
The example below passes random numbers in place of a real mAP and adds some test code
import random


class EarlyStopping:
    # YOLOv5 simple early stopper
    def __init__(self, patience=30):
        self.best_fitness = 0.0  # i.e. mAP
        self.best_epoch = 0
        self.patience = patience or float('inf')  # epochs to wait after fitness stops improving to stop
        self.possible_stop = False  # possible stop may occur next epoch

    def __call__(self, epoch, fitness):
        if fitness >= self.best_fitness:  # >= 0 to allow for early zero-fitness stage of training
            self.best_epoch = epoch
            self.best_fitness = fitness
        delta = epoch - self.best_epoch  # epochs without improvement
        print("delta:", delta)  # debug output added for this test
        print("best_fitness:", self.best_fitness)
        self.possible_stop = delta >= (self.patience - 1)  # possible stop may occur next epoch
        stop = delta >= self.patience  # stop training if patience exceeded
        if stop:
            print(f'EarlyStopping patience {self.patience} exceeded, stopping training.')
        return stop


# Test code
stopper = EarlyStopping(patience=3)
epochs = 10
start_epoch = 0
for epoch in range(start_epoch, epochs):
    random.seed(epoch)  # fixed seed per epoch so the run is reproducible
    stop = stopper(epoch=epoch, fitness=random.uniform(0.1, 0.5))
    print("function:", stop)
    print("possible_stop:", stopper.possible_stop)
    if stop:  # leave the loop once early stopping triggers
        break
Output
# delta: 0
# best_fitness: 0.43776874061001925
# function: False
# possible_stop: False
# delta: 1
# best_fitness: 0.43776874061001925
# function: False
# possible_stop: False
# delta: 0
# best_fitness: 0.4824137087556998
# function: False
# possible_stop: False
# delta: 1
# best_fitness: 0.4824137087556998
# function: False
# possible_stop: False
# delta: 2
# best_fitness: 0.4824137087556998
# function: False
# possible_stop: True
# delta: 3
# best_fitness: 0.4824137087556998
# EarlyStopping patience 3 exceeded, stopping training.
# function: True
# possible_stop: True
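Possible improvements
YOLOv5's built-in early stopping can only monitor a value that is expected to increase. It could be improved as follows:
(1) Add a mode parameter: max monitors an increasing value and min monitors a decreasing value, so the stopper can track metrics that should rise, such as mAP, as well as metrics that should fall, such as loss.
(2) Add a diff parameter that sets a minimum change: a change smaller than diff is also treated as no improvement. The check is then no longer a comparison like if fitness >= self.best_fitness but a subtraction, with the difference compared against diff.
(3) When comparing several models, set a baseline: if mAP has still not exceeded the baseline after a set number of epochs, stop early and avoid wasting resources.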
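The three ideas above can be combined in one class. What follows is only a minimal sketch and not part of YOLOv5: the class name FlexibleEarlyStopping and the way mode, diff and baseline are handled are assumptions made for illustration. Unlike the original >= check, a value that merely equals the best one does not count as an improvement here, and when a baseline is given the metric has to beat it by more than diff.

```python
import math


class FlexibleEarlyStopping:
    # Hypothetical extension of the YOLOv5 early stopper (illustration only)
    def __init__(self, patience=30, mode="max", diff=0.0, baseline=None):
        if mode not in ("max", "min"):
            raise ValueError("mode must be 'max' or 'min'")
        self.patience = patience or float("inf")  # epochs to wait without improvement
        self.mode = mode  # 'max' for mAP-like metrics, 'min' for loss-like metrics
        self.diff = abs(diff)  # minimum change that counts as an improvement
        self.best_epoch = 0
        if baseline is not None:
            self.best = baseline  # the metric has to beat this value first
        else:
            # worst possible starting value, so the first epoch always improves
            self.best = -math.inf if mode == "max" else math.inf

    def _improved(self, value):
        # subtract instead of comparing directly, so tiny changes are ignored
        if self.mode == "max":
            return value - self.best > self.diff
        return self.best - value > self.diff

    def __call__(self, epoch, value):
        if self._improved(value):
            self.best = value
            self.best_epoch = epoch
        delta = epoch - self.best_epoch  # epochs without improvement
        return delta >= self.patience  # True means training should stop


# Monitor a loss: changes of 0.01 or less are treated as no improvement
stopper = FlexibleEarlyStopping(patience=2, mode="min", diff=0.01)
for epoch, loss in enumerate([1.0, 0.8, 0.795, 0.793, 0.5]):
    if stopper(epoch, loss):
        print(f"stopping at epoch {epoch}, best loss {stopper.best}")
        break
```

In this run the loss barely changes after epoch 1, so the loop stops at epoch 3 and never reaches the last value. Passing baseline=0.5 with mode="max" would instead stop training after patience epochs if mAP never rises above 0.5.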