目录
简要介绍
文件准备
代码注释
简要介绍
具体的介绍可以看这几篇文章,讲解的很详细了,本文主要参考这三篇文章并对官方给的代码做一些解释
ICDAR2013文本检测算法的衡量方法(一)Evaluation Levels
ICDAR2013文本检测算法的衡量方法(二)Rectangle Matching与DetEval
中文文字检测与识别的评测方法
文本定位分为Challenges 1、2、4三个挑战,其中
- Challenges 1(Born-Digital)的数据来源于电脑制作的,而Challenges 2和Challenges 4(Real Scene)的数据来源于摄像机的拍摄。
- Challenges 2主要是来源于用户有意识的对焦拍摄的(focused text)比如一些翻译的场景,这些场景中文字基本是对焦好的且水平的
- Challenges 4主要来源也是用户拍摄的,但是这些照片的拍摄是比较随意的(incidental text)这样会导致图片里的文字角度、清晰度、大小等情况非常的多。
针对不同的挑战,评价检测算法的方法就不相同:
- Challenges 1和2使用的是叫做 DetEval的方法,该方法来自2006年C. Wolf的一篇文章《Object Count / Area Graphs for the Evaluation of Object Detection and Segmentation Algorithms》,ICDAR自己实现了一套Deteval方法
- Challenges 4使用的是简单的通过IoU来判定算法的recall、precision的。
官方提供的三种评测方法下载地址 https://rrc.cvc.uab.es/?ch=2&com=mymethods&task=1,其中第一个ICDAR2013方法是官方根据论文实现的Deteval,与paper作者的实现有细微差别,具体见https://rrc.cvc.uab.es/?ch=2&com=faq。第二个就是Deteval原paper作者的实现,第三个是Iou的评测方法。
文件准备
本文的代码是官方实现的Deteval方法即上面链接中的第一个ICDAR2013。运行命令如下
- python script.py –g=gt.zip –s=submit.zip –o=./ -p={\"IOU_CONSTRAINT\":0.8}
其中gt.zip是标签文件,submit.zip是模型的识别结果,-o是保存result.zip的路径,result.zip中是每张图片的评测结果,-p是评测中用到的一些阈值,在script.py中的default_evaluation_params函数中定义了默认值,需要改动也可以在代码中改。因为代码中是按正则的方式读取txt文件,因此txt的名称和内容格式都需要严格按照代码中设定的格式。最方便的方法还是将需要评测的gt.zip和submit.zip转换成代码中的格式而不是去改代码。其中gt.zip和submit.zip中的txt文件名称如下所示
txt内保存水平框的xmin,ymin,xmax,ymax,icdar提供的gt中还有文本的内容
代码注释
评价识别结果考虑了三种情况,一对一、一对多、多对一
当评测一个ground truth文本实例和一个检测结果实例时,recall是它们的Intersect Area除以GT框的面积,precision是Intersection Area除以检测框的面积。
调用评测代码的入口是下面这行代码,其中default_evaluation_params是用到的一些阈值和参数,validate_data是检测传入的gt.zip和submit.zip是否按照要求的格式,evaluate_method则是评价检测结果的函数,代码如下
def evaluate_method(gtFilePath, submFilePath, evaluationParams):
"""
Method evaluate_method: evaluate method and returns the results
Results. Dictionary with the following values:
- method (required) Global method metrics. Ex: { 'Precision':0.8,'Recall':0.9 }
- samples (optional) Per sample metrics. Ex: {'sample1' : { 'Precision':0.8,'Recall':0.9 } , 'sample2' : { 'Precision':0.8,'Recall':0.9 }
"""
# for module, alias in evaluation_imports().items():
# globals()[alias] = importlib.import_module(module)
def one_to_one_match(row, col):
cont = 0
for j in range(len(recallMat[0])): # len(detRects)
if recallMat[row, j] >= evaluationParams['AREA_RECALL_CONSTRAINT'] and precisionMat[row, j] >= \
evaluationParams['AREA_PRECISION_CONSTRAINT']: # 0.8,0.4
cont = cont + 1
if cont != 1:
return False
cont = 0
for i in range(len(recallMat)):
if recallMat[i, col] >= evaluationParams['AREA_RECALL_CONSTRAINT'] and precisionMat[i, col] >= \
evaluationParams['AREA_PRECISION_CONSTRAINT']:
cont = cont + 1
if cont != 1:
return False
if recallMat[row, col] >= evaluationParams['AREA_RECALL_CONSTRAINT'] and precisionMat[row, col] >= \
evaluationParams['AREA_PRECISION_CONSTRAINT']:
return True
return False
def one_to_many_match(gtNum):
many_sum = 0
detRects = []
for detNum in range(len(recallMat[0])):
if gtRectMat[gtNum] == 0 and detRectMat[detNum] == 0 and detNum not in detDontCareRectsNum:
if precisionMat[gtNum, detNum] >= evaluationParams['AREA_PRECISION_CONSTRAINT']: # 0.4
many_sum += recallMat[gtNum, detNum]
detRects.append(detNum)
if many_sum >= evaluationParams['AREA_RECALL_CONSTRAINT']: # 0.8
return True, detRects
else:
return False, []
def many_to_one_match(detNum):
many_sum = 0
gtRects = []
for gtNum in range(len(recallMat)):
if gtRectMat[gtNum] == 0 and detRectMat[detNum] == 0 and gtNum not in gtDontCareRectsNum:
if recallMat[gtNum, detNum] >= evaluationParams['AREA_RECALL_CONSTRAINT']: # 0.8
many_sum += precisionMat[gtNum, detNum]
gtRects.append(gtNum)
if many_sum >= evaluationParams['AREA_PRECISION_CONSTRAINT']: # 0.4
return True, gtRects
else:
return False, []
def area(a, b):
dx = min(a.xmax, b.xmax) - max(a.xmin, b.xmin) + 1
dy = min(a.ymax, b.ymax) - max(a.ymin, b.ymin) + 1
if (dx >= 0) and (dy >= 0):
return dx * dy
else:
return 0.
def center(r):
x = float(r.xmin) + float(r.xmax - r.xmin + 1) / 2.
y = float(r.ymin) + float(r.ymax - r.ymin + 1) / 2.
return Point(x, y)
def point_distance(r1, r2):
distx = math.fabs(r1.x - r2.x)
disty = math.fabs(r1.y - r2.y)
return math.sqrt(distx * distx + disty * disty)
def center_distance(r1, r2):
return point_distance(center(r1), center(r2))
def diag(r):
w = (r.xmax - r.xmin + 1)
h = (r.ymax - r.ymin + 1)
return math.sqrt(h * h + w * w)
perSampleMetrics = {}
methodRecallSum = 0
methodPrecisionSum = 0
Rectangle = namedtuple('Rectangle', 'xmin ymin xmax ymax')
Point = namedtuple('Point', 'x y')
gt = rrc_evaluation_funcs.load_zip_file(gtFilePath, evaluationParams['GT_SAMPLE_NAME_2_ID']) # 'gt_img_([0-9]+).txt'
subm = rrc_evaluation_funcs.load_zip_file(submFilePath, evaluationParams['DET_SAMPLE_NAME_2_ID'], True) # 'res_img_([0-9]+).txt'
numGt = 0
numDet = 0
for resFile in gt: # 一张ground truth
# 189, b'121,0,177,12,###\r\n90,110,528,193,hellmann\r\n89,197,269,251,parcel\r\n295,204,528,254,systems\r\n'
gtFile = rrc_evaluation_funcs.decode_utf8(gt[resFile])
# 121,0,177,12,###
# 90,110,528,193,hellmann
# 89,197,269,251,parcel
# 295,204,528,254,systems
recall = 0
precision = 0
hmean = 0
recallAccum = 0.
precisionAccum = 0.
gtRects = []
detRects = []
gtPolPoints = []
detPolPoints = []
gtDontCareRectsNum = [] # Array of Ground Truth Rectangles' keys marked as don't Care
detDontCareRectsNum = [] # Array of Detected Rectangles' matched with a don't Care GT
pairs = []
evaluationLog = ""
recallMat = np.empty([1, 1])
precisionMat = np.empty([1, 1])
pointsList, _, transcriptionsList = rrc_evaluation_funcs.get_tl_line_values_from_file_contents(gtFile,
evaluationParams[
'CRLF'],
# False
True, True,
False)
for n in range(len(pointsList)): # 一张ground truth上的一个文本实例
points = pointsList[n] # list, [121.0, 0.0, 177.0, 12.0]
transcription = transcriptionsList[n]
dontCare = transcription == "###"
gtRect = Rectangle(*points) # Rectangle(xmin=121.0, ymin=0.0, xmax=177.0, ymax=12.0)
gtRects.append(gtRect)
gtPolPoints.append(points)
if dontCare:
gtDontCareRectsNum.append(len(gtRects) - 1) # 一张图中dontcare文本实例的索引
evaluationLog += "GT rectangles: " + str(len(gtRects)) + (
" (" + str(len(gtDontCareRectsNum)) + " don't care)\n" if len(gtDontCareRectsNum) > 0 else "\n")
# GT rectangles: 4 (1 don't care)
if resFile in subm: # 对应这张ground truth的识别结果
detFile = rrc_evaluation_funcs.decode_utf8(subm[resFile])
# 一张图片的检测结果
# <class 'str'>
# 121,0,177,12
# 90,110,528,193
# 89,197,269,251
# 295,204,528,254
pointsList, _, _ = rrc_evaluation_funcs.get_tl_line_values_from_file_contents(detFile,
evaluationParams['CRLF'],
True, False, False)
# [[121.0, 0.0, 177.0, 12.0], [90.0, 110.0, 528.0, 193.0], [89.0, 197.0, 269.0, 251.0], [295.0, 204.0, 528.0, 254.0]]
for n in range(len(pointsList)): # 这张识别结果中的一个文本实例
points = pointsList[n]
detRect = Rectangle(*points)
detRects.append(detRect)
detPolPoints.append(points)
if len(gtDontCareRectsNum) > 0:
for dontCareRectNum in gtDontCareRectsNum: # 遍历对应gt中的don't care文本实例
dontCareRect = gtRects[dontCareRectNum]
intersected_area = area(dontCareRect, detRect) # 拿这张图识别结果中的一个文本实例和对应gt中所有标为don't care的计算intersection area
rdDimensions = ((detRect.xmax - detRect.xmin + 1) * (detRect.ymax - detRect.ymin + 1))
if rdDimensions == 0:
precision = 0
else:
precision = intersected_area / rdDimensions
if precision > evaluationParams['AREA_PRECISION_CONSTRAINT']: # 0.4
detDontCareRectsNum.append(len(detRects) - 1)
# 识别结果中的一个文本实例和gt中标为don't care的一个文本实例匹配上了,记录这个识别的文本实例的索引
break
# 可能还与其它的don't care实例也匹配,但不关心了,break
evaluationLog += "DET rectangles: " + str(len(detRects)) + (
" (" + str(len(detDontCareRectsNum)) + " don't care)\n" if len(detDontCareRectsNum) > 0 else "\n")
# DET rectangles: 4 (1 don't care)
if len(gtRects) == 0: # 这张图上没有gt box,dont't care也没有
recall = 1
precision = 0 if len(detRects) > 0 else 1
if len(detRects) > 0:
# Calculate recall and precision matrices
outputShape = [len(gtRects), len(detRects)] # 一行是1个gt,一列是1个det
recallMat = np.empty(outputShape) # 记录了一张gt中的每个文本实例与对应det中的每个文本实例的recall
precisionMat = np.empty(outputShape) # 记录了一张gt中的每个文本实例与对应det中的每个文本实例的precision
gtRectMat = np.zeros(len(gtRects), np.int8) # 记录每个gt是否匹配,匹配则值为1
detRectMat = np.zeros(len(detRects), np.int8) # 记录每个det是否匹配,匹配则值为1
# 分别遍历gt和det中的每一个text instance
for gtNum in range(len(gtRects)):
for detNum in range(len(detRects)):
rG = gtRects[gtNum]
rD = detRects[detNum]
intersected_area = area(rG, rD)
rgDimensions = ((rG.xmax - rG.xmin + 1) * (rG.ymax - rG.ymin + 1))
rdDimensions = ((rD.xmax - rD.xmin + 1) * (rD.ymax - rD.ymin + 1))
recallMat[gtNum, detNum] = 0 if rgDimensions == 0 else intersected_area / rgDimensions
# 召回率: 一个gt和一个det的intersection面积与该gt面积的比例
precisionMat[gtNum, detNum] = 0 if rdDimensions == 0 else intersected_area / rdDimensions
# 准确率: 一个gt和一个det的intersection面积与该det面积的比例
# 在接下来判断一对一、一对多、多对一三种情况时,都忽略don't care的gt和det
# Find one-to-one matches
evaluationLog += "Find one-to-one matches\n"
for gtNum in range(len(gtRects)):
for detNum in range(len(detRects)):
if gtRectMat[gtNum] == 0 and detRectMat[detNum] == 0 and gtNum not in gtDontCareRectsNum and detNum not in detDontCareRectsNum:
match = one_to_one_match(gtNum, detNum)
# 一对一,即1个gt只与1个det满足recall和precision的阈值,同时这个det也只与这个gt满足recall和precision的阈值
# recallMat的一行代表一个gt和所有det的recall,一列代表一个det和所有gt的recall,precisionMat也是如此
# 当前gt所在的行与当前det所在的列中只有交叉点即该gt与det的recall和precision大于设定的阈值,即1对1匹配
if match is True:
rG = gtRects[gtNum]
rD = detRects[detNum]
normDist = center_distance(rG, rD)
normDist /= diag(rG) + diag(rD)
normDist *= 2.0
if normDist < evaluationParams['EV_PARAM_IND_CENTER_DIFF_THR']: # 1
gtRectMat[gtNum] = 1 # 当前gt已匹配,后面判断一对多和多对一的情况时,跳过该gt
detRectMat[detNum] = 1 # 当前det已匹配,后面判断一对多和多对一的情况时,跳过该det
recallAccum += evaluationParams['MTYPE_OO_O'] # 1
precisionAccum += evaluationParams['MTYPE_OO_O']
pairs.append({'gt': gtNum, 'det': detNum, 'type': 'OO'})
evaluationLog += "Match GT #" + str(gtNum) + " with Det #" + str(detNum) + "\n"
else:
evaluationLog += "Match Discarded GT #" + str(gtNum) + " with Det #" + str(
detNum) + " normDist: " + str(normDist) + " \n"
# Find one-to-many matches
evaluationLog += "Find one-to-many matches\n"
for gtNum in range(len(gtRects)):
if gtNum not in gtDontCareRectsNum:
match, matchesDet = one_to_many_match(gtNum)
# 一对多,即1个gt对应多个det,遍历1个gt所在的行,若有det与该gt的precision大于设定阈值,记录该det以及gt与该det的recall
# 若所有满足precision的det与gt的recall和大于设定阈值,则该gt与所有记录的det匹配上了
if match is True:
gtRectMat[gtNum] = 1
recallAccum += evaluationParams['MTYPE_OM_O'] # 0.8
precisionAccum += evaluationParams['MTYPE_OM_O'] * len(matchesDet)
pairs.append({'gt': gtNum, 'det': matchesDet, 'type': 'OM'})
for detNum in matchesDet:
detRectMat[detNum] = 1
evaluationLog += "Match GT #" + str(gtNum) + " with Det #" + str(matchesDet) + "\n"
# Find many-to-one matches
evaluationLog += "Find many-to-one matches\n"
for detNum in range(len(detRects)):
if detNum not in detDontCareRectsNum:
match, matchesGt = many_to_one_match(detNum)
# 多对一,即多个gt对应1个det,遍历1个det所在的列,若有gt与该det的recall大于设定阈值,记录该gt以及det与该gt的precision
# 若所有满足recall的gt与该det的precision和大于设定阈值,则该det与所有记录的gt匹配上了
if match is True:
detRectMat[detNum] = 1
recallAccum += evaluationParams['MTYPE_OM_M'] * len(matchesGt) # 1
precisionAccum += evaluationParams['MTYPE_OM_M']
pairs.append({'gt': matchesGt, 'det': detNum, 'type': 'MO'})
for gtNum in matchesGt:
gtRectMat[gtNum] = 1
evaluationLog += "Match GT #" + str(matchesGt) + " with Det #" + str(detNum) + "\n"
numGtCare = (len(gtRects) - len(gtDontCareRectsNum))
if numGtCare == 0:
recall = float(1)
precision = float(0) if len(detRects) > 0 else float(1)
else:
recall = float(recallAccum) / numGtCare
precision = float(0) if (len(detRects) - len(detDontCareRectsNum)) == 0 else float(
precisionAccum) / (len(detRects) - len(detDontCareRectsNum))
hmean = 0 if (precision + recall) == 0 else 2.0 * precision * recall / (precision + recall)
evaluationLog += "Recall = " + str(recall) + "\n"
evaluationLog += "Precision = " + str(precision) + "\n"
methodRecallSum += recallAccum
methodPrecisionSum += precisionAccum
numGt += len(gtRects) - len(gtDontCareRectsNum)
numDet += len(detRects) - len(detDontCareRectsNum)
perSampleMetrics[resFile] = {
'precision': precision,
'recall': recall,
'hmean': hmean,
'pairs': pairs,
'recallMat': [] if len(detRects) > 100 else recallMat.tolist(),
'precisionMat': [] if len(detRects) > 100 else precisionMat.tolist(),
'gtPolPoints': gtPolPoints,
'detPolPoints': detPolPoints,
'gtDontCare': gtDontCareRectsNum,
'detDontCare': detDontCareRectsNum,
'evaluationParams': evaluationParams,
'evaluationLog': evaluationLog
}
methodRecall = 0 if numGt == 0 else methodRecallSum / numGt
methodPrecision = 0 if numDet == 0 else methodPrecisionSum / numDet
methodHmean = 0 if methodRecall + methodPrecision == 0 else 2 * methodRecall * methodPrecision / (
methodRecall + methodPrecision)
methodMetrics = {'precision': methodPrecision, 'recall': methodRecall, 'hmean': methodHmean}
resDict = {'calculated': True, 'Message': '', 'method': methodMetrics, 'per_sample': perSampleMetrics}
return resDict