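I. Introduction
This article follows the official PaddleOCR documentation to deploy image text recognition in practice. Because C++ service deployment performs better than the Python-based service deployment, the steps below use the C++ deployment based on PaddleServing v0.9.0, with the Serving development image (CPU) and PaddleOCR 2.6.0.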
II. Deployment process
1. Preparing the base environment
Start the Serving development image (CPU) with docker-compose.
The docker-compose.yml file is as follows:
version: "3.7"
networks:
  paddle:
    name: server-cpu
    driver: bridge
services:
  paddle-server-cpu:
    image: registry.baidubce.com/paddlepaddle/serving:0.9.0-devel
    container_name: paddle-server-cpu
    restart: always
    ports:
      - 9292:9292
      - 8181:8181
    command: 'bash'
    stdin_open: true
    tty: true
    networks:
      paddle:
        aliases:
          - paddle-server-cpu
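# Start the docker container
docker-compose up -d
# Enter the container. All subsequent steps are performed inside the container.
docker exec -it paddle-server-cpu bash
# Download the PaddleOCR repository
git clone https://github.com/PaddlePaddle/PaddleOCR
# Enter the working directory
cd PaddleOCR/deploy/pdserving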
2. Building PaddleServing
2.1. Preparing the Serving code
# Download the Serving repository
git clone https://github.com/PaddlePaddle/Serving
# Copy the OCR text detection preprocessing code into the Serving repository
# You will be prompted with cp: overwrite 'Serving/core/general-server/op/general_detection_op.cpp'?; type y and press Enter
cp -rf general_detection_op.cpp Serving/core/general-server/op
cd Serving
git submodule update --init --recursive
2.2. Installing the OpenCV library
# The documentation states that the WITH_OPENCV option must be enabled, so build and install the OpenCV library first
wget https://github.com/opencv/opencv/archive/3.4.7.tar.gz
tar -xf 3.4.7.tar.gz
cd opencv-3.4.7
root_path=/home/PaddleOCR/deploy/pdserving/Serving/opencv-3.4.7
install_path=${root_path}/opencv3
rm -rf build
mkdir build
cd build
cmake .. \
-DCMAKE_INSTALL_PREFIX=${install_path} \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DWITH_IPP=OFF \
-DBUILD_IPP_IW=OFF \
-DWITH_LAPACK=OFF \
-DWITH_EIGEN=OFF \
-DCMAKE_INSTALL_LIBDIR=lib64 \
-DWITH_ZLIB=ON \
-DBUILD_ZLIB=ON \
-DWITH_JPEG=ON \
-DBUILD_JPEG=ON \
-DWITH_PNG=ON \
-DBUILD_PNG=ON \
-DWITH_TIFF=ON \
-DBUILD_TIFF=ON
make -j10
make install
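After installation, the OpenCV files are located under the install directory ${install_path}, that is, /home/PaddleOCR/deploy/pdserving/Serving/opencv-3.4.7/opencv3.
2.3. Configuring environment variables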
cd /home/PaddleOCR/deploy/pdserving/Serving
# Prepare the environment variables (the paths may differ between versions of the Serving development image)
export PYTHON_INCLUDE_DIR=/usr/local/include/python3.7m/
export PYTHON_LIBRARIES=/usr/local/lib/libpython3.7m.so
export PYTHON_EXECUTABLE=/usr/local/bin/python3.7
export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin
python3.7 -m pip install -i https://mirror.baidu.com/pypi/simple -r python/requirements.txt
go env -w GO111MODULE=on
go env -w GOPROXY=https://goproxy.cn,direct
go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
go install google.golang.org/grpc@v1.33.0
go env -w GO111MODULE=auto
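Before moving on to the build, it can help to confirm that the Go tools installed above are actually reachable through PATH. The following is a minimal sketch, not part of the official procedure: the function name `missing_tools` is hypothetical, and the list of binary names is an assumption derived from the `go install` commands above (the grpc module is a library and is assumed not to produce a binary).

```python
import os
import shutil

# Binary names assumed from the `go install` commands above.
EXPECTED_TOOLS = ["protoc-gen-grpc-gateway", "protoc-gen-swagger", "protoc-gen-go"]


def missing_tools(tools=EXPECTED_TOOLS, search_path=None):
    """Return the tools that cannot be found as executables on search_path.

    search_path defaults to the current PATH environment variable.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    return [tool for tool in tools if shutil.which(tool, path=search_path) is None]
```

Calling `missing_tools()` inside the container returns an empty list when every tool is found; any name it returns suggests that `$GOPATH/bin` is not on PATH or that the corresponding `go install` step failed.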
2.4. Building PaddleServing
# Build paddle-serving-server
mkdir build_server
cd build_server
OPENCV_DIR=/home/PaddleOCR/deploy/pdserving/Serving/opencv-3.4.7/opencv3
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-DOPENCV_DIR=${OPENCV_DIR} \
-DWITH_OPENCV=ON \
-DSERVER=ON \
-DWITH_GPU=OFF ..
# The build may fail because of network connection problems; retry a few times if it does
make -j10
cd ..
# Build paddle-serving-client
mkdir build_client
cd build_client
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-DCLIENT=ON ..
make -j10
If the error "ModuleNotFoundError: No module named 'requests'" appears while building paddle-serving-client, run the following command and then run make again
python3.7 -m pip install -i https://mirror.baidu.com/pypi/simple requests
cd ..
# Build paddle-serving-app
mkdir build_app
cd build_app
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-DAPP=ON ..
make -j10
cd ..
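Since the build may fail on network errors and simply needs to be repeated, the retry can be automated. The sketch below is one possible way to do that; the helper name `run_with_retries` and its default attempt count and delay are illustrative choices, not something the PaddleServing documentation prescribes.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay_s=5.0, cwd=None):
    """Run cmd (a list of arguments) until it exits with status 0.

    Returns True on success and False if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd, cwd=cwd)
        if result.returncode == 0:
            return True
        print(f"attempt {attempt}/{attempts} failed with exit code {result.returncode}")
        if attempt < attempts:
            time.sleep(delay_s)  # give a flaky network a moment before retrying
    return False
```

For example, `run_with_retries(["make", "-j10"], cwd="build_server")` repeats the server build up to three times. A failure that is not network related will fail on every attempt, so the build output is still worth reading.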
2.5. Installing PaddleServing
# Install the built whl packages
pip3.7 install -i https://mirror.baidu.com/pypi/simple build_server/python/dist/*.whl
pip3.7 install -i https://mirror.baidu.com/pypi/simple build_client/python/dist/*.whl
pip3.7 install -i https://mirror.baidu.com/pypi/simple build_app/python/dist/*.whl
export SERVING_BIN=${PWD}/build_server/core/general-server/serving
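A quick sanity check after this step is to confirm that each build directory produced a wheel and that SERVING_BIN points to an executable file. The following sketch assumes the directory layout used by the commands above; the function name `check_serving_install` is hypothetical.

```python
import glob
import os


def check_serving_install(serving_root, serving_bin=None):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    # Each build directory is expected to hold at least one wheel in python/dist.
    for build_dir in ("build_server", "build_client", "build_app"):
        pattern = os.path.join(serving_root, build_dir, "python", "dist", "*.whl")
        if not glob.glob(pattern):
            problems.append(f"no wheel matches {pattern}")
    # Fall back to the SERVING_BIN environment variable when no path is given.
    if serving_bin is None:
        serving_bin = os.environ.get("SERVING_BIN", "")
    if not (os.path.isfile(serving_bin) and os.access(serving_bin, os.X_OK)):
        problems.append(f"SERVING_BIN is not an executable file: {serving_bin!r}")
    return problems
```

Run from the Serving directory, `check_serving_install(".")` reads SERVING_BIN from the environment, so it has to be called from a process started in the same shell that ran the `export`.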
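3. Model conversion
When deploying as a service with PaddleServing, the saved inference model must first be converted into a model format that Serving can deploy easily.
# Enter the working directory
cd /home/PaddleOCR/deploy/pdserving
# Download and extract the OCR text detection model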
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar -O ch_PP-OCRv3_det_infer.tar && tar -xf ch_PP-OCRv3_det_infer.tar
# Download and extract the OCR text recognition model
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar -O ch_PP-OCRv3_rec_infer.tar && tar -xf ch_PP-OCRv3_rec_infer.tar
# Convert the detection model
python3.7 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv3_det_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./ppocr_det_v3_serving/ \
--serving_client ./ppocr_det_v3_client/
If the error "ModuleNotFoundError: No module named 'paddle'" appears during conversion, run the following command and then run the conversion again
python3.7 -m pip install -i https://mirror.baidu.com/pypi/simple paddlepaddle
# Convert the recognition model
python3.7 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv3_rec_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./ppocr_rec_v3_serving/ \
--serving_client ./ppocr_rec_v3_client/
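The two conversion commands differ only in their directory names, so they can be generated from a small table. The sketch below only builds the argument lists, using the same flags and paths as the commands above; the names `build_convert_cmd`, `MODELS`, and `convert_commands` are hypothetical helpers introduced for illustration.

```python
def build_convert_cmd(infer_dir, server_dir, client_dir, python="python3.7"):
    """Build the paddle_serving_client.convert command line as an argument list."""
    return [
        python, "-m", "paddle_serving_client.convert",
        "--dirname", infer_dir,
        "--model_filename", "inference.pdmodel",
        "--params_filename", "inference.pdiparams",
        "--serving_server", server_dir,
        "--serving_client", client_dir,
    ]


# (inference dir, serving server dir, serving client dir), as in the commands above.
MODELS = {
    "det": ("./ch_PP-OCRv3_det_infer/", "./ppocr_det_v3_serving/", "./ppocr_det_v3_client/"),
    "rec": ("./ch_PP-OCRv3_rec_infer/", "./ppocr_rec_v3_serving/", "./ppocr_rec_v3_client/"),
}


def convert_commands():
    """Return one convert command per model, detection first, then recognition."""
    return [build_convert_cmd(*dirs) for dirs in MODELS.values()]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the `/home/PaddleOCR/deploy/pdserving` directory to perform the actual conversion.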
4. Starting the PaddleServing service and testing image text recognition
# Start the service; the runtime log is saved to log.txt
python3.7 -m paddle_serving_server.serve --model ppocr_det_v3_serving ppocr_rec_v3_serving --op GeneralDetectionOp GeneralInferOp --port 8181 &>log.txt &
Once the service has started successfully, its startup messages appear in log.txt.
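Because the service is started in the background, the client should not be run until the port is actually listening. Here is a minimal sketch of a readiness check that polls port 8181 over TCP; the function name `wait_for_port` and its timeout values are illustrative, and an open port only shows that the server process is accepting connections, not that both models have finished loading.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=8181, timeout_s=60.0, interval_s=1.0):
    """Poll until a TCP connection to host:port succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

If `wait_for_port()` returns False, log.txt is the first place to look for the reason the service did not come up.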
# In ppocr_det_v3_client/serving_client_conf.prototxt, change the feed_type and shape fields to the following:
feed_var {
  name: "x"
  alias_name: "x"
  is_lod_tensor: false
  feed_type: 20
  shape: 1
}
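The manual edit above can also be scripted so that it is repeatable after each model conversion. The sketch below rewrites every `feed_var` block in the text of the prototxt file, setting `feed_type` to 20 and replacing all `shape` lines with a single `shape: 1`. It assumes one field per line and no nested braces inside `feed_var`; the function name `patch_feed_var` is hypothetical.

```python
import re


def patch_feed_var(text, feed_type=20, shape=1):
    """Set feed_type and collapse all shape lines to one in each feed_var block."""

    def patch_block(match):
        kept = []
        for line in match.group(0).splitlines():
            stripped = line.strip()
            if stripped.startswith("feed_type:") or stripped.startswith("shape:"):
                continue  # dropped here, re-added just before the closing brace
            if stripped == "}":
                kept.append(f"  feed_type: {feed_type}")
                kept.append(f"  shape: {shape}")
            kept.append(line)
        return "\n".join(kept)

    # Assumes feed_var blocks hold no nested braces, so a non-greedy match suffices.
    return re.sub(r"feed_var\s*\{.*?\}", patch_block, text, flags=re.DOTALL)
```

To apply it, read `ppocr_det_v3_client/serving_client_conf.prototxt`, pass its contents through `patch_feed_var`, and write the result back. `fetch_var` blocks are left untouched, and running the function twice gives the same result as running it once.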
# Test image text recognition
python3.7 ocr_cpp_client.py ppocr_det_v3_client ppocr_rec_v3_client
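The script prints the text recognized in the image. As the ocr_cpp_client.py script shows, the image being recognized is /home/PaddleOCR/doc/imgs/1.jpg. This completes the deployment walkthrough.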
III. References
The main official documents consulted are:
- https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/deploy/pdserving/README_CN.md
- https://github.com/PaddlePaddle/Serving/blob/develop/doc/Install_CN.md
- https://github.com/PaddlePaddle/Serving/blob/v0.9.0/doc/Compile_CN.md