Adding an image random_crop operation to Keras fit_generator training

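When building a network with Keras as the front end, the training images were large enough that I needed an image cropping step similar to tf.random_crop.

To that end, I looked into the APIs that Keras already provides.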

Data Augmentation

Data augmentation means increasing the amount of input data using the methods below, among others. We assume image data here.

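Rotation/reflection: rotate the image by a random angle, changing the orientation of the image content.

Flip: flip the image horizontally or vertically.

Zoom: enlarge or shrink the image by a given ratio.

Shift: translate the image within the image plane. The shift range and step can be chosen randomly or set by hand, and the shift can be horizontal or vertical. This changes the position of the image content.

Scale: enlarge or shrink the image by a specified scale factor; or, following the idea behind SIFT feature extraction, filter the image with a specified scale factor to build a scale space. This changes the size or the degree of blur of the image content.

Contrast: in the HSV color space, change the saturation (S) and value/brightness (V) components while keeping the hue (H) unchanged. A power operation is applied to the S and V components of each pixel (exponent between 0.25 and 4) to add illumination variation.

Noise: randomly perturb the RGB values of each pixel. Common noise models are salt-and-pepper noise and Gaussian noise.

Data augmentation has many benefits. For example, when data is scarce, it increases the amount of training data and helps prevent overfitting.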
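As a rough illustration of a few of the transforms listed above, here is a minimal NumPy sketch. The function names (random_flip, random_shift, add_gaussian_noise) and the default parameters are hypothetical, chosen only for this example, and images are assumed to be arrays of shape (height, width, channels) with values in [0, 255].

```python
import numpy as np

def random_flip(image, rng):
    """Flip the image horizontally with probability 0.5."""
    if rng.random() < 0.5:
        return image[:, ::-1, :]
    return image

def random_shift(image, rng, max_shift=4):
    """Shift by up to max_shift pixels along each axis, zero-filling the exposed border."""
    h, w = image.shape[:2]
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    shifted = np.zeros_like(image)
    # Source window in the input and destination window in the output
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted

def add_gaussian_noise(image, rng, sigma=5.0):
    """Add zero-mean Gaussian noise and clip back to the valid pixel range."""
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0, 255).astype(image.dtype)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3)).astype(np.uint8)
augmented = add_gaussian_noise(random_shift(random_flip(image, rng), rng), rng)
print(augmented.shape, augmented.dtype)
```

Each function returns an array of the same shape as its input, so the transforms can be chained in any order.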

ImageDataGenerator

In Keras, ImageDataGenerator is the class dedicated to data augmentation.

from keras.preprocessing.image import ImageDataGenerator

Note: the import prints "Using TensorFlow backend."

The official usage is as follows:

    from keras.datasets import cifar10

    # model and epochs are assumed to be defined elsewhere
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()

    datagen = ImageDataGenerator(
        featurewise_center=True,
        ...
        horizontal_flip=True)

    # compute quantities required for featurewise normalization
    datagen.fit(x_train)

    # "Automatic" training with fit_generator:
    # fits the model on batches with real-time data augmentation
    model.fit_generator(datagen.flow(x_train, y_train, batch_size=32),
                        steps_per_epoch=len(x_train) // 32, epochs=epochs)

    # "Manual" training with a hand-written loop over epochs
    for e in range(epochs):
        print('Epoch', e)
        batches = 0
        for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=32):
            loss = model.train_on_batch(x_batch, y_batch)
            batches += 1
            if batches >= len(x_train) / 32:
                # we need to break the loop by hand because
                # the generator loops indefinitely
                break

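See the official documentation for the ImageDataGenerator parameters.

We will not discuss the differences between the two training methods. The point is that the officially provided training batch generator is the flow method (or flow_from_directory) of the ImageDataGenerator object, which returns an iterator that behaves much like an ordinary Python generator. The configured transformations are applied to each image as flow produces batches; the fit method called beforehand only computes the dataset statistics needed for feature-wise normalization.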
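To make the "generator that loops indefinitely" behaviour concrete, here is a small pure-Python sketch. endless_batches is a hypothetical stand-in for datagen.flow, not Keras code; it only shows why the consumer has to decide how many batches make up one epoch.

```python
import itertools
import math

def endless_batches(samples, batch_size):
    """Yield consecutive batches forever, wrapping around at the end of the data."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]

samples = list(range(10))
batch_size = 4
# Number of batches needed to see every sample once
steps_per_epoch = math.ceil(len(samples) / batch_size)

gen = endless_batches(samples, batch_size)
for epoch in range(2):
    # Take exactly steps_per_epoch batches; the generator keeps its position between epochs
    epoch_batches = list(itertools.islice(gen, steps_per_epoch))
    print(epoch, epoch_batches)
```

The steps_per_epoch argument of fit_generator plays the role of the islice count here: it tells the training loop when to stop drawing batches and start the next epoch.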

random_crop is not built into ImageDataGenerator, but its parameters include preprocessing_function, which we can use to plug in a custom my_random_crop function, like this:

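    import numpy

    # img_sz (input image size) and crop_sz (crop size) describe square
    # images and are assumed to be defined elsewhere
    def my_random_crop(image):
        # pick a random top-left corner so that the crop stays inside the image
        random_arr = numpy.random.randint(img_sz - crop_sz + 1, size=2)
        y = int(random_arr[0])
        x = int(random_arr[1])
        h = crop_sz
        w = crop_sz
        image_crop = image[y:y+h, x:x+w, :]
        return image_crop

    datagen = ImageDataGenerator(
        featurewise_center=False,
        ...
        preprocessing_function=my_random_crop)

    datagen.fit(x_train)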

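When batches are generated, the configured transformations, including the crop, are applied to each image of x_train in turn. Because images are processed one at a time, each image gets its own random crop position. Note that the Keras documentation says preprocessing_function should return an array with the same shape as its input, so a size-changing crop like this one is outside what the documentation guarantees.

For training data of the form (x=image, y=class_label), this is sufficient.

For (x=image, y=image_mask), however, this approach breaks down. Because images are processed one at a time, the image and its mask are cropped independently, so their crop positions cannot be kept consistent.

The official documentation does show how to transform an image and its mask together, but that approach only guarantees that the built-in transformations stay in sync; the random variables inside a custom function are still drawn independently.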
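The requirement for (image, image_mask) pairs can be stated in a few lines of NumPy: draw the offset once and slice both arrays with it. The sketch below is only an illustration; random_crop_pair and its arguments are hypothetical names, and a square crop on (height, width, channels) arrays is assumed.

```python
import numpy as np

def random_crop_pair(image, mask, crop_size, rng=None):
    """Crop image and mask with one shared random offset so they stay aligned."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    if mask.shape[:2] != (h, w):
        raise ValueError("image and mask must have the same height and width")
    if crop_size > min(h, w):
        raise ValueError("crop_size must not exceed the image size")
    # A single draw of (top, left) is reused for both arrays
    top = int(rng.integers(0, h - crop_size + 1))
    left = int(rng.integers(0, w - crop_size + 1))
    window = (slice(top, top + crop_size), slice(left, left + crop_size))
    return image[window], mask[window]

image = np.arange(6 * 6 * 3).reshape(6, 6, 3)
mask = image[:, :, :1].copy()  # toy mask derived from the image so alignment can be checked
image_crop, mask_crop = random_crop_pair(image, mask, crop_size=4)
print(image_crop.shape, mask_crop.shape)
print(np.array_equal(image_crop[:, :, :1], mask_crop))
```

Calling a function like my_random_crop separately on the image and on the mask would instead draw two unrelated offsets, which is exactly the mismatch described above.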

fit_generator

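Since ImageDataGenerator and its flow method cannot meet our random_crop preprocessing requirement, we look for a way to change things at the fit_generator call instead.

First, look at its definition (shown here with the Keras 1.x signature; Keras 2 renamed samples_per_epoch to steps_per_epoch and nb_epoch to epochs):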

    def fit_generator(self, generator, samples_per_epoch, nb_epoch,
                      verbose=1, callbacks=[],
                      validation_data=None, nb_val_samples=None,
                      class_weight=None, max_q_size=10, **kwargs):

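The first argument, generator, is a Python generator that yields batches of training data. The datagen.flow() call used earlier is the batch generator that Keras provides.

Clearly, we can write our own.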

    import numpy

    # img_sz and crop_sz are assumed to be defined elsewhere
    def generate_batch_data_random(x, y, batch_size):
        """Yield random batches, cropping x and y at the same random position."""
        total_num = len(x)
        batches = total_num // batch_size
        while True:
            # pick a random batch index in [0, batches)
            i = numpy.random.randint(0, batches)
            x_batch = x[i*batch_size:(i+1)*batch_size]
            y_batch = y[i*batch_size:(i+1)*batch_size]
            # one crop position per batch, shared by the images and their masks
            random_arr = numpy.random.randint(img_sz - crop_sz + 1, size=2)
            y_pos = int(random_arr[0])
            x_pos = int(random_arr[1])
            x_crop = x_batch[:, y_pos:y_pos+crop_sz, x_pos:x_pos+crop_sz, :]
            y_crop = y_batch[:, y_pos:y_pos+crop_sz, x_pos:x_pos+crop_sz, :]
            yield (x_crop, y_crop)

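Written this way, the generator satisfies our requirement that an image and its mask are randomly cropped at the same position.

Notes:

Because the built-in transformations of ImageDataGenerator are not used, any data augmentation must also be implemented by hand; and because flow(..., shuffle=True) is not used, shuffling the data for each epoch must be implemented by hand as well.

A custom generator must be written as an infinite loop, because fit_generator keeps drawing batches from the same generator object across epochs; generate_batch_data_random is not called again for each epoch.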
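The two notes can be combined into one generator. The following is a sketch under stated assumptions rather than a definitive implementation: paired_crop_generator and its parameters are hypothetical names, x and y are assumed to be NumPy arrays of shape (n, height, width, channels) with matching height and width, and, unlike the generator above, each image/mask pair gets its own crop position.

```python
import numpy as np

def paired_crop_generator(x, y, batch_size, crop_size, seed=None):
    """Yield (x_batch, y_batch) forever, reshuffling at the start of every pass over the data."""
    rng = np.random.default_rng(seed)
    n, h, w = x.shape[:3]
    while True:  # infinite loop: the same generator object serves every epoch
        order = rng.permutation(n)  # manual replacement for flow(..., shuffle=True)
        # The last incomplete batch is dropped, as with total_num // batch_size above
        for start in range(0, n - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            x_out = np.empty((batch_size, crop_size, crop_size) + x.shape[3:], dtype=x.dtype)
            y_out = np.empty((batch_size, crop_size, crop_size) + y.shape[3:], dtype=y.dtype)
            for k, i in enumerate(idx):
                # One offset per sample, shared by the image and its mask
                top = int(rng.integers(0, h - crop_size + 1))
                left = int(rng.integers(0, w - crop_size + 1))
                x_out[k] = x[i, top:top + crop_size, left:left + crop_size]
                y_out[k] = y[i, top:top + crop_size, left:left + crop_size]
            yield x_out, y_out

# Toy data: the masks are copies of the images, so aligned crops must be identical
x = np.arange(6 * 8 * 8, dtype=np.float32).reshape(6, 8, 8, 1)
y = x.copy()
gen = paired_crop_generator(x, y, batch_size=2, crop_size=4, seed=0)
x_batch, y_batch = next(gen)
print(x_batch.shape, y_batch.shape, np.array_equal(x_batch, y_batch))
```

With the Keras 2 argument names, such a generator would be passed as model.fit_generator(gen, steps_per_epoch=len(x) // batch_size, epochs=epochs). Further augmentation, such as flips, would be applied to x_out[k] and y_out[k] together inside the inner loop.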

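Supplement: the random cropping function random_crop in TensorFlow

tf.random_crop is TensorFlow's random cropping function and can be used to crop images. Here I randomly crop an image to half of its original height and width.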


The experiment code is as follows:

    import tensorflow as tf
    import matplotlib.image as img
    import matplotlib.pyplot as plt

    # TensorFlow 1.x style: an interactive session makes .eval() usable
    sess = tf.InteractiveSession()

    image = img.imread('D:/Documents/Pictures/logo3.jpg')
    reshaped_image = tf.cast(image, tf.float32)

    # crop to half of the original height and width
    size = tf.cast(tf.shape(reshaped_image).eval(), tf.int32)
    height = sess.run(size[0]//2)
    width = sess.run(size[1]//2)
    distorted_image = tf.random_crop(reshaped_image, [height, width, 3])

    print(tf.shape(reshaped_image).eval())
    print(tf.shape(distorted_image).eval())

    # show the original and the cropped image in separate figures
    fig = plt.figure()
    fig1 = plt.figure()
    ax = fig.add_subplot(111)
    ax1 = fig1.add_subplot(111)
    ax.imshow(sess.run(tf.cast(reshaped_image, tf.uint8)))
    ax1.imshow(sess.run(tf.cast(distorted_image, tf.uint8)))
    plt.show()
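For readers without a TensorFlow 1.x environment, the behaviour of tf.random_crop can be approximated in NumPy: pick a uniformly random offset along every dimension and slice out a block of the requested size. This is a sketch of the idea only, not TensorFlow's implementation; random_crop_nd is a hypothetical name, and the sample array stands in for the image file used above.

```python
import numpy as np

def random_crop_nd(value, size, rng=None):
    """Slice a block of shape `size` out of `value` at a uniformly chosen offset."""
    rng = np.random.default_rng() if rng is None else rng
    value = np.asarray(value)
    if len(size) != value.ndim:
        raise ValueError("size must have one entry per dimension of value")
    if any(s > d for s, d in zip(size, value.shape)):
        raise ValueError("size must not exceed value.shape in any dimension")
    # A dimension passed at full size gets offset 0, i.e. it is not cropped
    offsets = [int(rng.integers(0, d - s + 1)) for s, d in zip(size, value.shape)]
    window = tuple(slice(o, o + s) for o, s in zip(offsets, size))
    return value[window]

image = np.zeros((100, 80, 3), dtype=np.float32)  # stand-in for a loaded image
height, width = image.shape[0] // 2, image.shape[1] // 2
cropped = random_crop_nd(image, [height, width, 3])
print(image.shape, cropped.shape)
```

As in the TensorFlow example, the channel dimension is passed at its full size of 3 so that only height and width are cropped.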