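Following the article "Keras on ImageNet Explained: A Detailed Look at 5 Major Image Recognition Models" (解读 Keras 在 ImageNet 中的应用:详解 5 种主要的图像识别模型), I tried classifying local images on my own machine.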

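Step 1

I followed the method described at the URL below and tried it hands-on.

I started in an interactive Python session in the terminal and found that the cv2 module was missing. The error was: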

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named cv2

After a quick search, the following command fixed the problem:

pip2 install opencv-python

After the install finished, I restarted Python and cv2 imported normally.
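To catch missing modules like this before running the full script, a small check along the following lines can help. This is only a sketch: the helper name module_available is my own, and the list of modules checked is an assumption based on the imports used later in this post.

```python
import importlib


def module_available(name):
    """Return True if the module `name` can be imported by this interpreter."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    # Modules the classification script relies on (assumed list).
    for name in ("cv2", "numpy"):
        status = "ok" if module_available(name) else "missing"
        print("{}: {}".format(name, status))
```

If cv2 is reported as missing, the pip command above is the fix that worked for me.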

Step 2

When the command-line argument parsing code runs, the session exits immediately:

args = vars(ap.parse_args())
usage: [-h] -i IMAGE [-model MODEL]
: error: argument -i/--image is required

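After a quick search, I suspected that args = vars(ap.parse_args()) only works in a script and not in an interactive session: no command-line arguments are supplied there, so the required -i/--image argument is missing and argparse exits. From this point on I ran everything as a script. The complete script is as follows: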

import keras
print(keras.__version__)

# import the necessary packages
from keras.applications import ResNet50
from keras.applications import InceptionV3
from keras.applications import Xception # TensorFlow ONLY
from keras.applications import VGG16
from keras.applications import VGG19
from keras.applications import imagenet_utils
from keras.applications.inception_v3 import preprocess_input
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import load_img
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,help="path to the input image")
ap.add_argument("-model", "--model", type=str, default="vgg16",help="name of pre-trained network to use")
args = vars(ap.parse_args()) 

# define a dictionary that maps model names to their classes
# inside Keras
MODELS = {
	"vgg16": VGG16,
	"vgg19": VGG19,
	"inception": InceptionV3,
	"xception": Xception, # TensorFlow ONLY
	"resnet": ResNet50
	}
 
# ensure a valid model name was supplied via command line argument
if args["model"] not in MODELS.keys():
	raise AssertionError("The --model command line argument should be a key in the `MODELS` dictionary")

# initialize the input image shape (224x224 pixels) along with
# the pre-processing function (this might need to be changed
# based on which model we use to classify our image)
inputShape = (224, 224)
preprocess = imagenet_utils.preprocess_input
 
# if we are using the InceptionV3 or Xception networks, then we
# need to set the input shape to (299x299) [rather than (224x224)]
# and use a different image processing function
if args["model"] in ("inception", "xception"):
	inputShape = (299, 299)
	preprocess = preprocess_input

# load the network weights from disk (NOTE: if this is the
# first time you are running this script for a given network, the
# weights will need to be downloaded first -- depending on which
# network you are using, the weights can be 90-575MB, so be
# patient; the weights will be cached and subsequent runs of this
# script will be *much* faster)
print("[INFO] loading {}...".format(args["model"]))
Network = MODELS[args["model"]]
model = Network(weights="imagenet")

# load the input image using the Keras helper utility while ensuring
# the image is resized to `inputShape`, the required input dimensions
# for the ImageNet pre-trained network
print("[INFO] loading and pre-processing image...")
image = load_img(args["image"], target_size=inputShape)
image = img_to_array(image)
 
# our input image is now represented as a NumPy array of shape
# (inputShape[0], inputShape[1], 3) however we need to expand the
# dimension by making the shape (1, inputShape[0], inputShape[1], 3)
# so we can pass it through the network
image = np.expand_dims(image, axis=0)
 
# pre-process the image using the appropriate function based on the
# model that has been loaded (i.e., mean subtraction, scaling, etc.)
image = preprocess(image)

# classify the image
print("[INFO] classifying image with '{}'...".format(args["model"]))
preds = model.predict(image)
P = imagenet_utils.decode_predictions(preds)
 
# loop over the predictions and display the rank-5 predictions +
# probabilities to our terminal
for (i, (imagenetID, label, prob)) in enumerate(P[0]):
	print("{}. {}: {:.2f}%".format(i + 1, label, prob * 100))

# load the image via OpenCV, draw the top prediction on the image,
# and display the image to our screen
orig = cv2.imread(args["image"])
(imagenetID, label, prob) = P[0][0]
cv2.putText(orig, "Label: {}, {:.2f}%".format(label, prob * 100),
(10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
cv2.imshow("Classification", orig)
cv2.waitKey(0)
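For the interactive-session problem from Step 2, one possible workaround is to pass the arguments to parse_args() explicitly instead of letting it read them from the command line. The sketch below reuses the same two arguments as the script; the function name build_parser is my own, and the image path is only a placeholder.

```python
import argparse


def build_parser():
    """Build the same argument parser the classification script uses."""
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", required=True,
                    help="path to the input image")
    ap.add_argument("-model", "--model", type=str, default="vgg16",
                    help="name of pre-trained network to use")
    return ap


# In a script, parse_args() with no arguments reads sys.argv[1:].
# In an interactive session, pass the argument list explicitly instead,
# so the required --image argument is not reported as missing.
args = vars(build_parser().parse_args(["--image", "images/soccer_ball.jpg"]))
print(args["image"], args["model"])
```

With this, the rest of the script can be pasted into an interactive session unchanged, since args is the same kind of dictionary either way.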

Step 3

I then ran the following command:

python classify_image.py --image images/soccer_ball.jpg --model vgg16

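This automatically downloads the VGG16/VGG19 model weights, which are over 500 MB. The source code downloads them directly from GitHub, which is very slow. I later found a copy on a cloud drive, but it turned out to be an .npy file, whereas every offline weight file I could find on closer searching was an .h5 file. I could not work out how the two relate, so I decided to download directly from GitHub after all.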

Step 4

The automatic download of vgg16_weights_tf_dim_ordering_tf_kernels.h5 completed. The models actually available in Keras include:

MODELS = {
"vgg16": VGG16,
"vgg19": VGG19,
"inception": InceptionV3,
"xception": Xception, # TensorFlow ONLY
"resnet": ResNet50
}

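All of these models support the same set of classes: each was pre-trained on the ImageNet database of 1000 categories. Chinese labels for these categories are available at the following URL.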
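Since every model outputs one probability per class, the rank-5 listing printed by the script amounts to sorting those probabilities and keeping the largest few. The sketch below shows the idea in plain Python; it is not the Keras implementation of decode_predictions, and the four-class label list and probabilities are made-up values for illustration only.

```python
def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical 4-class example; the real models output 1000 probabilities.
labels = ["soccer_ball", "rugby_ball", "volleyball", "golf_ball"]
probabilities = [0.93, 0.04, 0.02, 0.01]

for rank, (label, prob) in enumerate(top_k(probabilities, labels, k=3), start=1):
    print("{}. {}: {:.2f}%".format(rank, label, prob * 100))
```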

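Step 5

Next I tried classifying with the downloaded pre-trained model. At first the image path was invalid and I got an error; after correcting the path, I ran: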

python classify_image.py --image imagenet_vgg16_soccer_ball.jpg --model vgg16

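Image classification then ran successfully. One small detail: because of the final OpenCV statement, cv2.waitKey(0), you have to press a key (Enter, for example) while the image window has focus; otherwise the command line waits forever. Even closing the image window does not make it exit, and the only way out is to close the terminal manually. The result looks like this: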

[Two screenshots of the classification result]
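The hang described above might be avoided by polling instead of blocking forever: wait for a key in short intervals and also stop once the window is no longer visible. The following is a sketch under assumptions, not something from the original article. The function name wait_for_key_or_close is my own, the cv2 module is passed in as a parameter so the helper can be tried without a display, and how reliably WND_PROP_VISIBLE reports a closed window depends on the OpenCV version and GUI backend.

```python
def wait_for_key_or_close(cv2_module, window_name, poll_ms=100):
    """Block until a key is pressed or the window is closed.

    Returns the key code (low 8 bits), or None if the window was closed.
    """
    while True:
        # waitKey returns -1 when no key was pressed within poll_ms.
        key = cv2_module.waitKey(poll_ms)
        if key != -1:
            return key & 0xFF
        # A visibility value below 1 is treated as "window closed"
        # (assumption: the GUI backend reports this property).
        visible = cv2_module.getWindowProperty(
            window_name, cv2_module.WND_PROP_VISIBLE)
        if visible < 1:
            return None


# Intended usage in the script, replacing cv2.waitKey(0):
#   cv2.imshow("Classification", orig)
#   wait_for_key_or_close(cv2, "Classification")
#   cv2.destroyAllWindows()
```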

Step 6

To try other models, I replaced the model name in the earlier command, as follows:

python classify_image.py --image imagenet_vgg16_soccer_ball.jpg --model vgg19

However, each download easily takes around 2 hours. While waiting, I opened another window and used vgg16 to classify other images:

python classify_image.py --image test/a1.jpg --model vgg16

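Step 7

Because the downloads took too long, I switched to downloading the model weights in a browser. Once downloaded, the files go in the local directory ~/.keras/models.
The download links are given below, but because the site is hosted overseas the speed is unstable: sometimes it reaches 4 Mb/s, sometimes only a few dozen k.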
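After copying files into the cache by hand, it can be worth checking what is actually there before starting a run, since a truncated download would show up as an unexpectedly small file. A minimal sketch follows; the function name list_cached_weights is my own, and it assumes the weights use the .h5 extension as the VGG16 file above does.

```python
import os


def list_cached_weights(cache_dir="~/.keras/models"):
    """Return (filename, size in MB) for each .h5 file in the cache directory."""
    cache_dir = os.path.expanduser(cache_dir)
    if not os.path.isdir(cache_dir):
        return []
    result = []
    for name in sorted(os.listdir(cache_dir)):
        if name.endswith(".h5"):
            size_bytes = os.path.getsize(os.path.join(cache_dir, name))
            result.append((name, size_bytes / (1024.0 * 1024.0)))
    return result


for name, size_mb in list_cached_weights():
    print("{:8.1f} MB  {}".format(size_mb, name))
```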

After downloading the other models, I tried again:
python classify_image.py --image a1.jpg --model vgg19
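Recognition of the Pirates of the Caribbean image was still poor. The models I tried all performed about the same, with xception doing best.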