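A few days ago at Google I/O, Google announced ML Kit, which provides a complete solution for deploying deep learning models on-device, with support for both local and cloud inference. The local deployment path is built on TensorFlow Lite.
TensorFlow Lite was announced at last year's Google I/O and is still being actively refined and iterated on.
The official site has fairly detailed guides and matching demos for deploying TensorFlow Lite on Android and iOS. For deployment and testing on ARM boards, however, there is comparatively little material, either on the official site or elsewhere online. This article describes how to cross-compile TensorFlow Lite for an ARM board and run the corresponding demo.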
0. Preparation: set up the ARM cross-compilation environment on Ubuntu
The toolchain can be installed with apt-get install:
sudo apt-get install g++-arm-linux-gnueabihf
sudo apt-get install -y gcc-arm-linux-gnueabihf
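Before going further, it can help to confirm that the cross compilers are actually on PATH. The following is only a minimal sketch and not part of the original steps: check_toolchain is a hypothetical helper name, and it assumes the arm-linux-gnueabihf- prefix provided by the two packages above.

```shell
# check_toolchain PREFIX
# Hypothetical helper: reports whether PREFIXgcc and PREFIXg++ can be found
# on PATH. Returns non-zero if either of them is missing.
check_toolchain() {
  prefix="$1"
  status=0
  for tool in gcc g++; do
    if command -v "${prefix}${tool}" >/dev/null 2>&1; then
      echo "found:   ${prefix}${tool}"
    else
      echo "missing: ${prefix}${tool}"
      status=1
    fi
  done
  return "$status"
}

# Usage (the prefix matches the CC_PREFIX used later in build_rpi_lib.sh):
#   check_toolchain arm-linux-gnueabihf-
```

If either tool is reported as missing, the build in step 3 will most likely fail at the first compile command.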
1. Download the TensorFlow source
git clone https://github.com/tensorflow/tensorflow
2. Download the TensorFlow dependencies
First cd to the root directory of the TensorFlow project, then run the following:
./tensorflow/contrib/lite/download_dependencies.sh
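One way to confirm that the download step worked is to look for the dependency directories that the Makefile in step 4 refers to. This is a rough sketch: check_downloads is a hypothetical helper, and the directory list is taken from that Makefile's include paths and source list, so it may differ for other TensorFlow versions.

```shell
# check_downloads DIR
# Hypothetical helper: prints each expected dependency directory that is
# missing under DIR and returns non-zero if any is missing.
check_downloads() {
  dir="$1"
  missing=0
  for dep in eigen gemmlowp neon_2_sse farmhash flatbuffers fft2d; do
    if [ ! -d "${dir}/${dep}" ]; then
      echo "missing: ${dep}"
      missing=1
    fi
  done
  return "$missing"
}

# Usage from the TensorFlow root directory:
#   check_downloads tensorflow/contrib/lite/downloads
```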
3. Build TensorFlow Lite
Edit the target platform in ./tensorflow/contrib/lite/build_rpi_lib.sh to match your ARM platform. For example, my target platform is ARMv8, so TARGET_ARCH on the last line needs to be set to armv8:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR/../../.."
#change TARGET_ARCH according to your device
CC_PREFIX=arm-linux-gnueabihf- make -j 3 -f tensorflow/contrib/lite/Makefile TARGET=RPI TARGET_ARCH=armv8
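If you would rather not edit the script by hand, the TARGET_ARCH value can be rewritten with sed. This is a minimal sketch, assuming the script contains a single TARGET_ARCH=... assignment as in the listing above; set_target_arch is a hypothetical helper name.

```shell
# set_target_arch SCRIPT ARCH
# Hypothetical helper: replaces the value after TARGET_ARCH= in SCRIPT.
# The result is written back with cat so that the script keeps its
# original permissions (including the executable bit).
set_target_arch() {
  script="$1"
  arch="$2"
  tmp_copy="${script}.tmp"
  sed "s/TARGET_ARCH=[^[:space:]]*/TARGET_ARCH=${arch}/" "$script" > "$tmp_copy" &&
    cat "$tmp_copy" > "$script" &&
    rm -f "$tmp_copy"
}

# Usage from the TensorFlow root directory:
#   set_target_arch tensorflow/contrib/lite/build_rpi_lib.sh armv8
```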
After editing, run the script from the root directory:
./tensorflow/contrib/lite/build_rpi_lib.sh
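When the build finishes, libtensorflow-lite.a is generated under tensorflow/contrib/lite/gen/lib/rpi_armv8 (the directory name follows TARGET_ARCH, so it would be rpi_armv7 if TARGET_ARCH were set to armv7).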
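Because the output directory depends on TARGET_ARCH, searching for the archive can be easier than guessing the path. A small sketch; find_lite_lib is a hypothetical helper name.

```shell
# find_lite_lib DIR
# Hypothetical helper: prints the path of every libtensorflow-lite.a
# found below DIR.
find_lite_lib() {
  find "$1" -type f -name 'libtensorflow-lite.a'
}

# Usage from the TensorFlow root directory:
#   find_lite_lib tensorflow/contrib/lite/gen/lib
```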
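4. Build label_image
The build_rpi_lib.sh script from step 3 actually invokes ./tensorflow/contrib/lite/Makefile to compile the TensorFlow Lite sources, but that Makefile does not build the demo under tensorflow/contrib/lite/examples/label_image. The Makefile therefore has to be modified to add the label_image sources; the existing configuration for the MINIMAL demo in the Makefile can be used as a reference. If you would rather not make the changes yourself, here is an already modified version.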
# Find where we're running from, so we can store generated files here.
ifeq ($(origin MAKEFILE_DIR), undefined)
MAKEFILE_DIR := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))
endif
# Try to figure out the host system
HOST_OS :=
ifeq ($(OS),Windows_NT)
HOST_OS = WINDOWS
else
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
HOST_OS := LINUX
endif
ifeq ($(UNAME_S),Darwin)
HOST_OS := OSX
endif
endif
ARCH := $(shell if [[ $(shell uname -m) =~ i[345678]86 ]]; then echo x86_32; else echo $(shell uname -m); fi)
# Where compiled objects are stored.
OBJDIR := $(MAKEFILE_DIR)/gen/obj/
BINDIR := $(MAKEFILE_DIR)/gen/bin/
LIBDIR := $(MAKEFILE_DIR)/gen/lib/
GENDIR := $(MAKEFILE_DIR)/gen/obj/
# Settings for the host compiler.
CXX := $(CC_PREFIX)gcc
CXXFLAGS := --std=c++11 -O3 -DNDEBUG
CC := $(CC_PREFIX)gcc
CFLAGS := -O3 -DNDEBUG
LDOPTS :=
LDOPTS += -L/usr/local/lib
ARFLAGS := -r
INCLUDES := \
-I. \
-I$(MAKEFILE_DIR)/../../../ \
-I$(MAKEFILE_DIR)/downloads/ \
-I$(MAKEFILE_DIR)/downloads/eigen \
-I$(MAKEFILE_DIR)/downloads/gemmlowp \
-I$(MAKEFILE_DIR)/downloads/neon_2_sse \
-I$(MAKEFILE_DIR)/downloads/farmhash/src \
-I$(MAKEFILE_DIR)/downloads/flatbuffers/include \
-I$(GENDIR)
# This is at the end so any globally-installed frameworks like protobuf don't
# override local versions in the source tree.
INCLUDES += -I/usr/local/include
LIBS := \
-lstdc++ \
-lpthread \
-lm \
-lz
# If we're on Linux, also link in the dl library.
ifeq ($(HOST_OS),LINUX)
LIBS += -ldl
endif
include $(MAKEFILE_DIR)/ios_makefile.inc
include $(MAKEFILE_DIR)/rpi_makefile.inc
# This library is the main target for this makefile. It will contain a minimal
# runtime that can be linked in to other programs.
LIB_NAME := libtensorflow-lite.a
LIB_PATH := $(LIBDIR)$(LIB_NAME)
# A small example program that shows how to link against the library.
MINIMAL_PATH := $(BINDIR)minimal
LABEL_IMAGE_PATH :=$(BINDIR)label_image
MINIMAL_SRCS := \
tensorflow/contrib/lite/examples/minimal/minimal.cc
MINIMAL_OBJS := $(addprefix $(OBJDIR), \
$(patsubst %.cc,%.o,$(patsubst %.c,%.o,$(MINIMAL_SRCS))))
LABEL_IMAGE_SRCS := \
tensorflow/contrib/lite/examples/label_image/label_image.cc \
tensorflow/contrib/lite/examples/label_image/bitmap_helpers.cc
LABEL_IMAGE_OBJS := $(addprefix $(OBJDIR), \
$(patsubst %.cc,%.o,$(patsubst %.c,%.o,$(LABEL_IMAGE_SRCS))))
# What sources we want to compile, must be kept in sync with the main Bazel
# build files.
CORE_CC_ALL_SRCS := \
$(wildcard tensorflow/contrib/lite/*.cc) \
$(wildcard tensorflow/contrib/lite/kernels/*.cc) \
$(wildcard tensorflow/contrib/lite/kernels/internal/*.cc) \
$(wildcard tensorflow/contrib/lite/kernels/internal/optimized/*.cc) \
$(wildcard tensorflow/contrib/lite/kernels/internal/reference/*.cc) \
$(wildcard tensorflow/contrib/lite/*.c) \
$(wildcard tensorflow/contrib/lite/kernels/*.c) \
$(wildcard tensorflow/contrib/lite/kernels/internal/*.c) \
$(wildcard tensorflow/contrib/lite/kernels/internal/optimized/*.c) \
$(wildcard tensorflow/contrib/lite/kernels/internal/reference/*.c) \
$(wildcard tensorflow/contrib/lite/downloads/farmhash/src/farmhash.cc) \
$(wildcard tensorflow/contrib/lite/downloads/fft2d/fftsg.c)
# Remove any duplicates.
CORE_CC_ALL_SRCS := $(sort $(CORE_CC_ALL_SRCS))
CORE_CC_EXCLUDE_SRCS := \
$(wildcard tensorflow/contrib/lite/*test.cc) \
$(wildcard tensorflow/contrib/lite/*/*test.cc) \
$(wildcard tensorflow/contrib/lite/*/*/*test.cc) \
$(wildcard tensorflow/contrib/lite/*/*/*/*test.cc) \
$(wildcard tensorflow/contrib/lite/kernels/test_util.cc) \
$(MINIMAL_SRCS) \
$(LABEL_IMAGE_SRCS)
# Filter out all the excluded files.
TF_LITE_CC_SRCS := $(filter-out $(CORE_CC_EXCLUDE_SRCS), $(CORE_CC_ALL_SRCS))
# File names of the intermediate files target compilation generates.
TF_LITE_CC_OBJS := $(addprefix $(OBJDIR), \
$(patsubst %.cc,%.o,$(patsubst %.c,%.o,$(TF_LITE_CC_SRCS))))
LIB_OBJS := $(TF_LITE_CC_OBJS)
# For normal manually-created TensorFlow C++ source files.
$(OBJDIR)%.o: %.cc
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(INCLUDES) -c $< -o $@
# For normal manually-created TensorFlow C source files.
$(OBJDIR)%.o: %.c
	@mkdir -p $(dir $@)
	$(CC) $(CCFLAGS) $(INCLUDES) -c $< -o $@
# The target that's compiled if there's no command-line arguments.
all: $(LIB_PATH) $(MINIMAL_PATH) $(LABEL_IMAGE_PATH)
# Gathers together all the objects we've compiled into a single '.a' archive.
$(LIB_PATH): $(LIB_OBJS)
	@mkdir -p $(dir $@)
	$(AR) $(ARFLAGS) $(LIB_PATH) $(LIB_OBJS)
$(MINIMAL_PATH): $(MINIMAL_OBJS) $(LIB_PATH)
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(INCLUDES) \
	-o $(MINIMAL_PATH) $(MINIMAL_OBJS) \
	$(LIBFLAGS) $(LIB_PATH) $(LDFLAGS) $(LIBS)
$(LABEL_IMAGE_PATH): $(LABEL_IMAGE_OBJS) $(LIB_PATH)
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(INCLUDES) \
	-o $(LABEL_IMAGE_PATH) $(LABEL_IMAGE_OBJS) \
	$(LIBFLAGS) $(LIB_PATH) $(LDFLAGS) $(LIBS)
# Gets rid of all generated files.
clean:
	rm -rf $(MAKEFILE_DIR)/gen
# Gets rid of target files only, leaving the host alone. Also leaves the lib
# directory untouched deliberately, so we can persist multiple architectures
# across builds for iOS and Android.
cleantarget:
	rm -rf $(OBJDIR)
	rm -rf $(BINDIR)
$(DEPDIR)/%.d: ;
.PRECIOUS: $(DEPDIR)/%.d
-include $(patsubst %,$(DEPDIR)/%.d,$(basename $(TF_CC_SRCS)))
After making the changes, run ./tensorflow/contrib/lite/build_rpi_lib.sh again. The label_image binary is then generated under ./tensorflow/contrib/lite/gen/bin/rpi_armv8.
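Before copying the binary to the board, it is worth checking that it was really built for ARM and not for the host. The sketch below reads the e_machine field of the ELF header with od; elf_machine is a hypothetical helper, and it assumes a little-endian ELF file (the usual case for these targets). Where the file utility is installed, it reports the same information.

```shell
# elf_machine FILE
# Hypothetical helper: prints the target architecture recorded in the ELF
# header of FILE. Byte 18 holds the low byte of the little-endian
# e_machine field (40 = ARM, 183 = AArch64, 62 = x86-64).
elf_machine() {
  code=$(od -An -tu1 -j18 -N1 "$1" | tr -d ' \n')
  case "$code" in
    40)  echo "ARM" ;;
    183) echo "AArch64" ;;
    62)  echo "x86-64" ;;
    *)   echo "unknown (${code})" ;;
  esac
}

# Usage from the TensorFlow root directory:
#   elf_machine tensorflow/contrib/lite/gen/bin/rpi_armv8/label_image
```

With the arm-linux-gnueabihf- toolchain used here, the expected answer is ARM (a 32-bit binary), even when the board's CPU is ARMv8.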
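5. Copy the program to the board
Prepare the test image tensorflow/contrib/lite/examples/label_image/testdata/grace_hopper.bmp; any other image works as well. In addition, download the TF Lite model you want to test from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md. You also need the ImageNet labels file. In the end, the following files have to be transferred to the board: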
grace_hopper.bmp
label_image
labels.txt
mobilenet_quant_v1_224.tflite
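It is easy to forget one of these files, so a quick check before copying can save a round trip. A minimal sketch, assuming all four files have been collected into one local directory; missing_files is a hypothetical helper, and the board address in the usage line is a placeholder.

```shell
# missing_files DIR
# Hypothetical helper: prints the name of each required file that is not
# present in DIR. Prints nothing when everything is there.
missing_files() {
  dir="$1"
  for f in grace_hopper.bmp label_image labels.txt mobilenet_quant_v1_224.tflite; do
    [ -e "${dir}/${f}" ] || echo "$f"
  done
}

# Usage, with a placeholder directory and board address:
#   [ -z "$(missing_files ./deploy)" ] && scp ./deploy/* user@board:/home/user/
```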
6. Run TensorFlow Lite on the ARM board
That completes the preparation. You can now test TensorFlow Lite's performance on the board. Usage is as follows:
./label_image -v 1 -m ./mobilenet_quant_v1_224.tflite -i ./grace_hopper.bmp -l ./labels.txt
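The output is shown below. Judging by the timing, TensorFlow Lite's performance does not yet match NCNN. The test was run on an ARM A72. I look forward to further improvements in TensorFlow Lite.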
average time: 1036.77 ms
0.364706: 907 n04591713 wine bottle
0.364706: 653 n03764736 milk can
0.0431373: 668 n03788195 mosque
0.0352941: 458 n02892201 brass, memorial tablet, plaque
0.027451: 543 n03255030 dumbbell
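To compare timings across models or runs, the average latency can be pulled out of this output with awk. A small sketch, assuming the "average time: ... ms" line format shown above; avg_ms is a hypothetical helper name. The 2>&1 in the usage line is there in case label_image writes its log to stderr rather than stdout.

```shell
# avg_ms
# Hypothetical helper: reads label_image output on stdin and prints the
# number from the "average time: <value> ms" line.
avg_ms() {
  awk '/average time:/ { print $3 }'
}

# Usage on the board:
#   ./label_image -m ./mobilenet_quant_v1_224.tflite -i ./grace_hopper.bmp -l ./labels.txt 2>&1 | avg_ms
```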
Reference: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/rpi.md