I have read the CNN Tutorial on TensorFlow and I am trying to use the same model for my project.
The problem now is data reading. I have around 25,000 images for training and around 5,000 for testing and validation. The files are in png format, and I can read them and convert them into a numpy.ndarray.
The CNN example in the tutorial uses a queue to fetch records from the list of files provided. I tried to create my own binary file by reshaping my images into a 1-D array and prepending a label value to the front of it. So my data looks like this
[[1,12,34,24,53,...,105,234,102],
[12,112,43,24,52,...,115,244,98],
....
]
A single row of the above array is 22501 elements long, where the first element is the label.
I dumped the array to a file using pickle and tried to read from that file with tf.FixedLengthRecordReader, as demonstrated in the example.
I am doing the same things as given in cifar10_input.py to read the binary file and put the records into a record object.
Now the labels and image values I read back from the file are different. I can see this is because pickle also dumps extra bookkeeping information into the binary file, which changes the fixed record length.
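The size mismatch is easy to observe directly: a pickled array is larger than its raw bytes, so any fixed-length reads against it will be misaligned. A minimal sketch (the tiny shapes here are illustrative, not the real 22501-byte records):

```python
import pickle
import numpy as np

# Two fake records: 1 label byte + 4 image bytes each (illustrative sizes).
records = np.array([[1, 12, 34, 24, 53],
                    [2, 112, 43, 24, 52]], dtype=np.uint8)

raw_size = records.nbytes                  # exactly 2 * 5 = 10 bytes
pickled_size = len(pickle.dumps(records))  # adds Python/numpy metadata

assert pickled_size > raw_size  # the extra header bytes break fixed-length reads
```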
The example above takes filenames, passes them to a queue to fetch the files, and then reads single records from the queue.
I want to know whether I can pass the numpy array defined above, instead of filenames, to some reader that can fetch records one by one from that array rather than from a file.
Solution:
Probably the easiest way to make your data work with the CNN example code is to use a modified version of read_cifar10() as follows:
> Write out a binary file containing the contents of your numpy array.
import numpy as np

images_and_labels_array = np.array([[...], ...],  # [[1, 12, 34, 24, 53, ..., 102],
                                                  #  [12, 112, 43, 24, 52, ..., 98],
                                                  #  ...]
                                   dtype=np.uint8)

images_and_labels_array.tofile("/tmp/images.bin")
This file resembles the format used in the CIFAR10 data files. You might want to generate multiple files in order to get read parallelism. Note that ndarray.tofile() writes binary data in row-major order with no other metadata; pickling the array would add Python-specific metadata that TensorFlow's parsing routines do not understand.
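As a quick sanity check (again with small illustrative dimensions), a file written by tofile() can be read back with plain np.fromfile() and contains exactly one fixed-length record per row, with no header or footer bytes:

```python
import os
import tempfile
import numpy as np

# Each row: 1 label byte followed by 4 image bytes (illustrative record size 5).
records = np.array([[1, 12, 34, 24, 53],
                    [2, 112, 43, 24, 52]], dtype=np.uint8)

path = os.path.join(tempfile.mkdtemp(), "images.bin")
records.tofile(path)  # raw row-major bytes, nothing else

# uint8 means one byte per element, so file size == number of elements.
assert os.path.getsize(path) == records.size

# Read back and confirm the fixed-length layout survived intact.
restored = np.fromfile(path, dtype=np.uint8).reshape(-1, 5)
assert (restored == records).all()
```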
> Write a modified version of read_cifar10() that handles your record format.
def read_my_data(filename_queue):
  class ImageRecord(object):
    pass
  result = ImageRecord()

  # Dimensions of the images in the dataset.
  label_bytes = 1
  # Set the following constants as appropriate.
  result.height = IMAGE_HEIGHT
  result.width = IMAGE_WIDTH
  result.depth = IMAGE_DEPTH
  image_bytes = result.height * result.width * result.depth

  # Every record consists of a label followed by the image, with a
  # fixed number of bytes for each.
  record_bytes = label_bytes + image_bytes
  assert record_bytes == 22501  # Based on your question.

  # Read a record, getting filenames from the filename_queue.  No
  # header or footer in the binary, so we leave header_bytes
  # and footer_bytes at their default of 0.
  reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
  result.key, value = reader.read(filename_queue)

  # Convert from a string to a vector of uint8 that is record_bytes long.
  record_bytes = tf.decode_raw(value, tf.uint8)

  # The first bytes represent the label, which we convert from uint8->int32.
  result.label = tf.cast(
      tf.slice(record_bytes, [0], [label_bytes]), tf.int32)

  # The remaining bytes after the label represent the image, which we reshape
  # from [depth * height * width] to [depth, height, width].
  depth_major = tf.reshape(tf.slice(record_bytes, [label_bytes], [image_bytes]),
                           [result.depth, result.height, result.width])

  # Convert from [depth, height, width] to [height, width, depth].
  result.uint8image = tf.transpose(depth_major, [1, 2, 0])

  return result
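The slicing, reshaping, and transposition above can be checked with plain numpy, independently of TensorFlow (tiny illustrative dimensions stand in for IMAGE_HEIGHT, IMAGE_WIDTH, and IMAGE_DEPTH):

```python
import numpy as np

height, width, depth = 2, 2, 3
label_bytes = 1
image_bytes = height * width * depth

# One fake record: a label byte followed by depth-major image bytes.
record = np.arange(label_bytes + image_bytes, dtype=np.uint8)

label = record[:label_bytes].astype(np.int32)                     # tf.slice + tf.cast
depth_major = record[label_bytes:].reshape(depth, height, width)  # tf.reshape
image = depth_major.transpose(1, 2, 0)                            # tf.transpose([1, 2, 0])

assert image.shape == (height, width, depth)
# Pixel (0, 0) collects one byte from each of the three depth planes.
assert list(image[0, 0]) == [1, 5, 9]
```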
def distorted_inputs(data_dir, batch_size):
  """[...]"""
  filenames = ["/tmp/images.bin"]  # Or a list of filenames if you
                                   # generated multiple files in step 1.
  for f in filenames:
    if not gfile.Exists(f):
      raise ValueError('Failed to find file: ' + f)

  # Create a queue that produces the filenames to read.
  filename_queue = tf.train.string_input_producer(filenames)

  # Read examples from files in the filename queue.
  read_input = read_my_data(filename_queue)
  reshaped_image = tf.cast(read_input.uint8image, tf.float32)

  # [...] (Maybe modify other parameters in here depending on your problem.)
Given your starting point, this is a minimal change. It may be more efficient to decode the PNGs using TensorFlow ops, but that would be a bigger change.