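【Algorithm Overview】
Model description
Scenarios such as identity verification and document digitization need to extract card information automatically so that it can be entered into downstream systems. Two problems are common in these scenarios: card-type recognition is easily disturbed by the background, and text distortion caused by the shooting angle lowers OCR accuracy. Because ID data is sensitive, we train on a large amount of synthetic card data and adapt SCRFD, a SOTA face detection method, to train a card detection and rectification model. The model detects, localizes, and rectifies common international cards (for example ID cards, passports, and driver's licenses) and outputs a front-view card image with the background removed, which makes later card classification or OCR extraction easier.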
Training data: (figure)
Sample results: (figure)
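Usage and scope
How to use:
- Inference: given an input image, if cards are present the model returns each card's location and corner points, plus a rectified image of each card
- Fine-tuning: optimize the model with your own data
Target scenarios:
- A basic upstream capability for card-related tasks; it can be applied to card OCR, document classification, document anti-counterfeiting, and similar scenarios
Code example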
import cv2
import matplotlib.pyplot as plt
from modelscope.pipelines import pipeline
from modelscope.preprocessors.image import LoadImage
from modelscope.utils.constant import Tasks
from modelscope.utils.cv.image_utils import draw_card_detection_result

# Build the card detection pipeline and run it on a sample image.
card_detection = pipeline(Tasks.card_detection, 'damo/cv_resnet_carddetection_scrfd34gkps')
img_path = 'https://design3d.oss-cn-qingdao.aliyuncs.com/MS_test_img/card_detection.jpg'
result = card_detection(img_path)

# Optional: visualize the result.
img = LoadImage.convert_to_ndarray(img_path)
cv2.imwrite('srcImg.jpg', img)
img_list = draw_card_detection_result('srcImg.jpg', result)
for card_img in img_list:
    plt.figure()
    plt.imshow(card_img)
plt.show()
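Before passing detections on to OCR or classification, it is often useful to drop low-confidence ones. The sketch below shows one way to do that. It assumes the pipeline result is a dict with parallel lists under the keys 'scores', 'boxes' and 'keypoints'; those key names, the helper name filter_cards, and the 0.5 threshold are assumptions made for illustration, so check the actual result on your installed ModelScope version first.

```python
from typing import Any, Dict, List


def filter_cards(result: Dict[str, Any], min_score: float = 0.5) -> List[Dict[str, Any]]:
    """Return one dict per detection whose score is at least min_score."""
    cards = []
    # Assumed layout: three parallel lists, one entry per detected card.
    for score, box, kps in zip(result['scores'], result['boxes'], result['keypoints']):
        if score >= min_score:
            cards.append({'score': score, 'box': list(box), 'keypoints': list(kps)})
    return cards


# Hand-written stand-in for a pipeline result, for illustration only.
fake_result = {
    'scores': [0.93, 0.21],
    'boxes': [[10, 20, 210, 140], [5, 5, 30, 30]],
    'keypoints': [
        [10, 140, 10, 20, 210, 20, 210, 140],
        [5, 30, 5, 5, 30, 5, 30, 30],
    ],
}
kept = filter_cards(fake_result, min_score=0.5)
print(len(kept), kept[0]['box'])
```

With a real result, filter_cards(result) could be called right after card_detection(img_path), provided the keys match.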
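Datasets
- SyntheticCards: synthetic card data composed from open-source material, already uploaded to the ModelScope DatasetHub;
- Your own data: to optimize the model with your own data, prepare the annotations in the format below. The corner order is left-bottom, left-top, right-top, right-bottom, and each corner is written as (x,y,1)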
# <image_path> image_width image_height
bbox_x1 bbox_y1 bbox_x2 bbox_y2 (<keypoint,3>*4)
...
...
# <image_path> image_width image_height
bbox_x1 bbox_y1 bbox_x2 bbox_y2 (<keypoint,3>*4)
...
...
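To check that a hand-made annotation file follows this layout, a small parser helps. The sketch below is a minimal reading of the format, under these assumptions: values are separated by whitespace, the parentheses in the template are notation rather than literal characters, and each card line therefore carries 16 numbers (4 for the box, then 4 corners of 3 values each). The function name parse_annotations and the sample path are hypothetical.

```python
from typing import Dict, List


def parse_annotations(text: str) -> List[Dict]:
    """Parse '# <image_path> width height' headers followed by card lines."""
    images: List[Dict] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith('#'):
            # Split from the right so that a path containing spaces survives.
            path, width, height = line[1:].rsplit(None, 2)
            images.append({'path': path.strip(), 'width': int(width),
                           'height': int(height), 'cards': []})
            continue
        if not images:
            raise ValueError('card line found before any image header')
        values = [float(v) for v in line.split()]
        if len(values) != 16:
            raise ValueError(f'expected 16 numbers, got {len(values)}: {line!r}')
        bbox = values[:4]
        # Four (x, y, flag) triples: left-bottom, left-top, right-top, right-bottom.
        keypoints = [tuple(values[i:i + 3]) for i in range(4, 16, 3)]
        images[-1]['cards'].append({'bbox': bbox, 'keypoints': keypoints})
    return images


sample = """\
# cards/0001.jpg 640 480
100 120 420 330 100 330 1 100 120 1 420 120 1 420 330 1
"""
parsed = parse_annotations(sample)
print(parsed[0]['path'], len(parsed[0]['cards']))
```

Raising on a wrong value count surfaces malformed lines early, before a training run fails on them.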
Model training
Train with the SyntheticCards dataset hosted on the ModelScope DatasetHub:
import os
import tempfile

from modelscope.hub.snapshot_download import snapshot_download
from modelscope.metainfo import Trainers
from modelscope.msdatasets import MsDataset
from modelscope.trainers import build_trainer

model_id = 'damo/cv_resnet_carddetection_scrfd34gkps'
# Remove '_mini' to load the full dataset.
ms_ds_cards = MsDataset.load('SyntheticCards_mini', namespace='shaoxuan')
data_path = ms_ds_cards.config_kwargs['split_config']
train_dir = data_path['train']
val_dir = data_path['validation']


def get_name(dir_name):
    # Return the first entry whose name does not start with '_'.
    names = [i for i in os.listdir(dir_name) if not i.startswith('_')]
    return names[0]


train_root = train_dir + '/' + get_name(train_dir) + '/'
val_root = val_dir + '/' + get_name(val_dir) + '/'
cache_path = snapshot_download(model_id)
# mkdtemp creates the directory and leaves it in place for the trainer to write to.
tmp_dir = tempfile.mkdtemp()


def _cfg_modify_fn(cfg):
    # Checkpoint and evaluate every epoch, log every 10 iterations.
    cfg.checkpoint_config.interval = 1
    cfg.log_config.interval = 10
    cfg.evaluation.interval = 1
    # Small loader settings so the example also runs on a modest machine.
    cfg.data.workers_per_gpu = 1
    cfg.data.samples_per_gpu = 2
    return cfg


kwargs = dict(
    cfg_file=os.path.join(cache_path, 'mmcv_scrfd.py'),
    work_dir=tmp_dir,
    train_root=train_root,
    val_root=val_root,
    total_epochs=1,  # number of epochs to run
    cfg_modify_fn=_cfg_modify_fn)

trainer = build_trainer(name=Trainers.card_detection_scrfd, default_args=kwargs)
trainer.train()
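- For more examples (such as multi-GPU training), see:
tests/trainers/test_card_detection_scrfd_trainer.py
- This model was trained on 8 V100 GPUs with the SGD optimizer and lr=0.02. The learning rate is reduced by a factor of 10 at epochs 120, 200, and 240, and the released model is taken at epoch 280. For the remaining training hyperparameters, see
mmcv_scrfd.py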
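The step schedule described above can be written down in a few lines. This is only an illustration of the stated numbers (base lr 0.02, divided by 10 at epochs 120, 200, and 240). The function name step_lr is hypothetical, the exact epoch at which each drop takes effect is an assumption, and any warmup defined in mmcv_scrfd.py is not modeled.

```python
from typing import Sequence


def step_lr(epoch: int,
            base_lr: float = 0.02,
            milestones: Sequence[int] = (120, 200, 240),
            gamma: float = 0.1) -> float:
    """Learning rate after applying one decay step per milestone already reached."""
    # Assumption: the drop applies from the milestone epoch onward.
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


for epoch in (0, 119, 120, 200, 240, 279):
    print(epoch, step_lr(epoch))
```

Under these assumptions the rate is 0.02 for the first 120 epochs and about 2e-5 for the final 40.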
【UI Design】
【Demo】
【Implementation: Calling Code】
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Diagnostics;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using OpenCvSharp;
namespace FIRC
{
    public partial class Form1 : Form
    {
        Mat src = new Mat();
        CardDetector cd = new CardDetector();

        public Form1()
        {
            InitializeComponent();
        }

        // Open an image file and show it in the first picture box.
        private void button1_Click(object sender, EventArgs e)
        {
            OpenFileDialog openFileDialog = new OpenFileDialog();
            openFileDialog.Filter = "Image files (*.jpg;*.png;*.jpeg;*.bmp)|*.jpg;*.png;*.jpeg;*.bmp";
            openFileDialog.RestoreDirectory = true;
            openFileDialog.Multiselect = false;
            if (openFileDialog.ShowDialog() == DialogResult.OK)
            {
                src = Cv2.ImRead(openFileDialog.FileName);
                if (src.Empty())
                {
                    // ImRead returns an empty Mat when the file cannot be decoded.
                    return;
                }
                pictureBox1.Image = OpenCvSharp.Extensions.BitmapConverter.ToBitmap(src);
            }
        }

        // Run detection on the loaded image, then show each rectified card and the annotated image.
        private void button2_Click(object sender, EventArgs e)
        {
            if (pictureBox1.Image == null || src.Empty())
            {
                return;
            }
            Stopwatch sw = new Stopwatch();
            sw.Start();
            var results = cd.Inference(src);
            sw.Stop();
            this.Text = "Inference time: " + sw.Elapsed.TotalSeconds + " s";
            var resultMat = cd.DrawImage(src.Clone(), results);
            var correctImages = cd.GetCorrectImage(src, results);
            for (int i = 0; i < correctImages.Count; i++)
            {
                // Each rectified card is shown in turn; press any key for the next one.
                Cv2.ImShow("result", correctImages[i]);
                Cv2.WaitKey(0);
            }
            pictureBox2.Image = OpenCvSharp.Extensions.BitmapConverter.ToBitmap(resultMat); // Mat to Bitmap
        }

        // Load the ONNX weights once when the form starts.
        private void Form1_Load(object sender, EventArgs e)
        {
            cd.LoadWeights(Application.StartupPath + "\\weights\\carddetection_scrfd34gkps.onnx");
        }

        // Run detection on the default camera until Esc is pressed.
        // Note: this loop runs on the UI thread, so the form does not respond until it exits.
        private void button3_Click(object sender, EventArgs e)
        {
            VideoCapture capture = new VideoCapture(0);
            if (!capture.IsOpened())
            {
                Console.WriteLine("video not open!");
                return;
            }
            Mat frame = new Mat();
            var sw = new Stopwatch();
            int fps = 0;
            while (true)
            {
                capture.Read(frame);
                if (frame.Empty())
                {
                    Console.WriteLine("data is empty!");
                    break;
                }
                // Time inference plus drawing for this frame only.
                sw.Restart();
                var results = cd.Inference(frame);
                cd.DrawImage(frame, results);
                sw.Stop();
                fps = Convert.ToInt32(1 / sw.Elapsed.TotalSeconds);
                Cv2.PutText(frame, "FPS=" + fps, new OpenCvSharp.Point(30, 30), HersheyFonts.HersheyComplex, 1.0, new Scalar(255, 0, 0), 3);
                // Show the result
                Cv2.ImShow("Result", frame);
                int key = Cv2.WaitKey(10);
                if (key == 27) // Esc
                    break;
            }
            capture.Release();
        }
    }
}
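The calling code relies on cd.GetCorrectImage, whose implementation is not shown here. As a rough illustration of the geometry a rectification step of this kind typically involves, the Python sketch below derives an output size from four corner points and solves for the perspective transform that maps them onto an upright rectangle. It is not the author's implementation. The corner order (left-bottom, left-top, right-top, right-bottom) is borrowed from the annotation format and may differ from what the detector returns, and all function names and sample coordinates are hypothetical. In practice OpenCV's getPerspectiveTransform and warpPerspective would do the solving and the pixel warping.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def edge(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def target_size(corners: Sequence[Point]) -> Tuple[int, int]:
    """Output width and height, taken from the longer of each pair of opposite edges."""
    lb, lt, rt, rb = corners
    width = max(edge(lt, rt), edge(lb, rb))
    height = max(edge(lb, lt), edge(rb, rt))
    return round(width), round(height)


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError('degenerate corner configuration')
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """3x3 perspective transform mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map one point through the transform, including the perspective divide."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Made-up corners of a slightly tilted card: left-bottom, left-top, right-top, right-bottom.
corners = [(40, 300), (60, 100), (380, 120), (360, 330)]
width, height = target_size(corners)
dst = [(0, height), (0, 0), (width, 0), (width, height)]
H = homography(corners, dst)
for src_pt, dst_pt in zip(corners, dst):
    print(src_pt, '->', apply_homography(H, src_pt), 'target', dst_pt)
```

Taking the longer of each pair of opposite edges keeps the rectified card from being downsampled along its more foreshortened side.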
【Test Environment】
VS2019, .NET Framework 4.7.2, OpenCvSharp 4.8.0
【Demo Video】
bilibili.com/video/BV1CDCpYJEwF/
【References】
[1] modelscope.cn/models/iic/cv_resnet_carddetection_scrfd34gkps/summary
[2] github.com/hpc203/cv_resnet_carddetection_scrfd34gkps-opencv-dnn