Fixing trainable=False Having No Effect When Mixing Keras and TensorFlow (Python)

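I ran into this problem recently. First, a description of it:

I have a pre-trained model (for example VGG16) and want to modify it, say by adding a fully connected layer. For various reasons I can only use TensorFlow to optimize the model. By default, a tf optimizer updates the weights of every variable in tf.trainable_variables(), and that is where the problem lies: even though the VGG16 model has been set to trainable=False, the tf optimizer still updates the VGG16 weights.

That is the problem. After a good deal of searching on Google and Baidu, I finally found a solution. Below we reproduce the whole problem step by step.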

trainable=False has no effect

First, we load the pre-trained VGG16 model and set it to trainable=False.

    from keras.applications import VGG16
    import tensorflow as tf
    from keras import layers

    # Load the pre-trained model
    base_mode = VGG16(include_top=False)

    # Inspect the trainable variables
    tf.trainable_variables()

    [<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
     <tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
     <tf.Variable 'block1_conv2/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
     <tf.Variable 'block1_conv2/bias:0' shape=(64,) dtype=float32_ref>,
     <tf.Variable 'block2_conv1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
     <tf.Variable 'block2_conv1/bias:0' shape=(128,) dtype=float32_ref>,
     <tf.Variable 'block2_conv2/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
     <tf.Variable 'block2_conv2/bias:0' shape=(128,) dtype=float32_ref>,
     <tf.Variable 'block3_conv1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv1/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block3_conv2/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv2/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block3_conv3/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv3/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block4_conv1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block4_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv2/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block4_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv3/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv2/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv3/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block1_conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
     <tf.Variable 'block1_conv1_1/bias:0' shape=(64,) dtype=float32_ref>,
     <tf.Variable 'block1_conv2_1/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
     <tf.Variable 'block1_conv2_1/bias:0' shape=(64,) dtype=float32_ref>,
     <tf.Variable 'block2_conv1_1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
     <tf.Variable 'block2_conv1_1/bias:0' shape=(128,) dtype=float32_ref>,
     <tf.Variable 'block2_conv2_1/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
     <tf.Variable 'block2_conv2_1/bias:0' shape=(128,) dtype=float32_ref>,
     <tf.Variable 'block3_conv1_1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv1_1/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block3_conv2_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv2_1/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block3_conv3_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv3_1/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block4_conv1_1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block4_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block4_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv3_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv1_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]

    # Set trainable=False on every layer
    # base_mode.trainable = False also seems to work
    for layer in base_mode.layers:
        layer.trainable = False

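After setting trainable=False, inspect the trainable variables again: nothing has changed, which means the setting had no effect as far as TensorFlow is concerned. In TensorFlow 1.x graph mode, tf.trainable_variables() returns the contents of the graph's TRAINABLE_VARIABLES collection, and a variable is added to that collection when it is created. The Keras trainable attribute is a flag on the layer that Keras reads for its own training routines; setting it does not remove anything from the collection.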

    # Inspect the trainable variables again
    tf.trainable_variables()

    [<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
     <tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
     <tf.Variable 'block1_conv2/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
     <tf.Variable 'block1_conv2/bias:0' shape=(64,) dtype=float32_ref>,
     <tf.Variable 'block2_conv1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
     <tf.Variable 'block2_conv1/bias:0' shape=(128,) dtype=float32_ref>,
     <tf.Variable 'block2_conv2/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
     <tf.Variable 'block2_conv2/bias:0' shape=(128,) dtype=float32_ref>,
     <tf.Variable 'block3_conv1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv1/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block3_conv2/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv2/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block3_conv3/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv3/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block4_conv1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block4_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv2/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block4_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv3/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv2/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv3/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block1_conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
     <tf.Variable 'block1_conv1_1/bias:0' shape=(64,) dtype=float32_ref>,
     <tf.Variable 'block1_conv2_1/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
     <tf.Variable 'block1_conv2_1/bias:0' shape=(64,) dtype=float32_ref>,
     <tf.Variable 'block2_conv1_1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
     <tf.Variable 'block2_conv1_1/bias:0' shape=(128,) dtype=float32_ref>,
     <tf.Variable 'block2_conv2_1/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
     <tf.Variable 'block2_conv2_1/bias:0' shape=(128,) dtype=float32_ref>,
     <tf.Variable 'block3_conv1_1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv1_1/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block3_conv2_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv2_1/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block3_conv3_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
     <tf.Variable 'block3_conv3_1/bias:0' shape=(256,) dtype=float32_ref>,
     <tf.Variable 'block4_conv1_1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block4_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block4_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block4_conv3_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv1_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
     <tf.Variable 'block5_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
     <tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]
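A toy model may help picture why the list does not change. The sketch below uses no TensorFlow at all: `Graph`, `Variable` and `Layer` are hypothetical stand-ins, not real TF or Keras classes. It assumes the TF 1.x behaviour that a variable is registered in a graph-level collection once, at creation, while the layer's `trainable` flag is only read on the layer side.

```python
# Toy model (no TensorFlow): a graph-level collection vs. a per-layer flag.
# All class names here are hypothetical stand-ins, not real TF/Keras classes.

class Graph:
    """Holds a global list of trainable variables, like a TF 1.x collection."""

    def __init__(self):
        self.trainable_collection = []


class Variable:
    def __init__(self, graph, name):
        self.name = name
        # Registration happens once, at creation time.
        graph.trainable_collection.append(self)


class Layer:
    def __init__(self, graph, name):
        self.trainable = True
        self.weights = [Variable(graph, name + "/kernel:0"),
                        Variable(graph, name + "/bias:0")]

    @property
    def trainable_weights(self):
        # The flag is consulted only here, on the layer side.
        return self.weights if self.trainable else []


graph = Graph()
toy_layers = [Layer(graph, "block1_conv1"), Layer(graph, "block1_conv2")]

# Freeze every layer, as in the article.
for layer in toy_layers:
    layer.trainable = False

# The layer-side view respects the flag ...
layer_view = [v.name for l in toy_layers for v in l.trainable_weights]
# ... but the graph-side collection was filled at creation and is untouched.
graph_view = [v.name for v in graph.trainable_collection]

print(layer_view)
print(graph_view)
```

In this picture, an optimizer that reads the graph-side collection by default still sees every variable, which matches the behaviour observed above.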

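Solution

The solution is to create a variable_scope when loading the model, put the variables that need training in a separate variable_scope, fetch those variables with tf.get_collection, and finally pass them to the tf optimizer through its var_list argument.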

    from keras import models

    # Build the pre-trained model inside its own variable scope
    with tf.variable_scope('base_model'):
        base_model = VGG16(include_top=False, input_shape=(224,224,3))

    # Build the layers to be trained inside a separate variable scope
    with tf.variable_scope('xxx'):
        model = models.Sequential()
        model.add(base_model)
        model.add(layers.Flatten())
        model.add(layers.Dense(10))

    # Collect the variables that need training
    trainable_var = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'xxx')
    trainable_var

    [<tf.Variable 'xxx_2/dense_1/kernel:0' shape=(25088, 10) dtype=float32_ref>,
     <tf.Variable 'xxx_2/dense_1/bias:0' shape=(10,) dtype=float32_ref>]
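Note that the scope passed in was 'xxx' while the returned names start with 'xxx_2'. As far as I know, tf.get_collection filters items with re.match on their names, so a plain scope string acts as a prefix filter. The sketch below imitates that filtering in pure Python. `filter_by_scope` is a hypothetical helper, and the 'base_model/...' names are illustrative assumptions rather than output from the article.

```python
import re


def filter_by_scope(variable_names, scope):
    """Keep names that match `scope` at the start (re.match semantics)."""
    return [name for name in variable_names if re.match(scope, name)]


# The two 'xxx_2/...' names come from the output above; the 'base_model/...'
# names are assumed for illustration only.
names = [
    "base_model/block1_conv1/kernel:0",
    "base_model/block1_conv1/bias:0",
    "xxx_2/dense_1/kernel:0",
    "xxx_2/dense_1/bias:0",
]

selected = filter_by_scope(names, "xxx")
print(selected)
```

Because this is prefix matching, any other scope whose name begins with 'xxx' would be picked up as well, so a distinctive scope name is the safer choice.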

    # Define a tf optimizer for training; a loss is assumed here
    loss = model.output / 2  # arbitrary definition, for demonstration only
    train_step = tf.train.AdamOptimizer().minimize(loss, var_list=trainable_var)
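To make the effect of var_list concrete, here is a minimal pure-Python sketch of an update step that only touches the listed variables. It swaps in plain gradient descent for Adam to keep the arithmetic simple; `sgd_step`, the parameter names and the numeric values are all hypothetical.

```python
def sgd_step(params, grads, var_list, lr=0.5):
    """Return updated params; only names listed in var_list are changed."""
    allowed = set(var_list)
    return {
        name: (value - lr * grads[name] if name in allowed else value)
        for name, value in params.items()
    }


# Hypothetical scalar "weights" and gradients, for illustration only.
params = {"base_model/kernel": 1.0, "xxx/dense/kernel": 1.0}
grads = {"base_model/kernel": 0.5, "xxx/dense/kernel": 0.5}

# Both variables have a gradient, but only the one in var_list is updated.
updated = sgd_step(params, grads, var_list=["xxx/dense/kernel"])
print(updated)
```

The frozen entry keeps its value even though a gradient exists for it, which is the behaviour the var_list argument is used for in the code above.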