[TensorFlow Study Notes] 5: variable_scope and name

Some hands-on practice from working through the book 《深度学习之TensorFlow》 (Deep Learning with TensorFlow).

variable_scope

Ordinary nesting

The previous note showed how a variable created inside nested scopes is normally named:

import tensorflow as tf

# with tf.variable_scope("scopeA") as spA:
#     var1 = tf.get_variable("v1", [1])

with tf.variable_scope("scopeB"):
    with tf.variable_scope("scopeA"):
        var3 = tf.get_variable("v3", [1])

print(var3.name)

scopeB/scopeA/v3:0

Bypassing the outer scope

If the inner scope is defined beforehand and the captured scope object is passed to tf.variable_scope, that is:

import tensorflow as tf

with tf.variable_scope("scopeA") as spA:
    var1 = tf.get_variable("v1", [1])

with tf.variable_scope("scopeB"):
    with tf.variable_scope(spA):
        var3 = tf.get_variable("v3", [1])

print(var3.name)

scopeA/v3:0

The variable name is then unaffected by the outer variable_scope.
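
The naming rule seen in these two runs can be mimicked with a toy model in plain Python. This is only a sketch of the observed behavior, not how TensorFlow implements scopes: `ToyScopes`, `CapturedScope`, and their methods are hypothetical names invented for illustration. The assumption is that a string argument is appended to the enclosing prefix, while a captured scope object replaces the prefix with the one it recorded.

```python
from contextlib import contextmanager


class CapturedScope:
    """Hypothetical stand-in for the object bound by `as spA`."""

    def __init__(self, name):
        self.name = name  # full prefix recorded when the scope was opened


class ToyScopes:
    """Toy model of variable-scope naming; not TensorFlow's implementation."""

    def __init__(self):
        self.prefix = ""  # current variable-scope prefix

    @contextmanager
    def variable_scope(self, scope):
        saved = self.prefix
        if isinstance(scope, CapturedScope):
            # A captured scope brings its own full prefix and ignores the outer one.
            self.prefix = scope.name
        else:
            # A plain string is appended to whatever prefix is already active.
            self.prefix = saved + "/" + scope if saved else scope
        try:
            yield CapturedScope(self.prefix)
        finally:
            self.prefix = saved  # leaving the block restores the outer prefix

    def get_variable(self, name):
        full = self.prefix + "/" + name if self.prefix else name
        return full + ":0"  # ":0" mimics the output-index suffix of a name


scopes = ToyScopes()
with scopes.variable_scope("scopeA") as spA:
    var1_name = scopes.get_variable("v1")

with scopes.variable_scope("scopeB"):
    with scopes.variable_scope("scopeA"):
        nested_name = scopes.get_variable("v3")
    with scopes.variable_scope(spA):
        reentered_name = scopes.get_variable("v3")

print(var1_name)       # scopeA/v1:0
print(nested_name)     # scopeB/scopeA/v3:0
print(reentered_name)  # scopeA/v3:0
```

The string "scopeA" and the object spA refer to the same scope name, yet only the object skips the "scopeB" prefix, which matches the two outputs shown in this section.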

name_scope

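Op nodes are constrained by both name_scope and variable_scope, whereas a Variable created with tf.get_variable is constrained only by variable_scope.

Note that although a variable and the result of an op can both be used as tensors, they are instances of different classes: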

<class 'tensorflow.python.ops.variables.Variable'>
<class 'tensorflow.python.framework.ops.Tensor'>

Printing their type shows this, which is why they behave differently under the different scopes.

Typical usage

import tensorflow as tf

with tf.variable_scope("v"):
    with tf.name_scope("n1"):
        a = tf.get_variable("a", [1])  # Variable
        x = 1.0 + a  # Op
        with tf.name_scope("n2"):
            y = 1.0 + a  # Op
print(a.name, x.op.name, y.op.name, sep='\n')

v/a:0
v/n1/add
v/n1/n2/add

Returning to the top level

Passing an empty string to name_scope is a special case: it resets the name scope to the top level.

import tensorflow as tf

with tf.variable_scope("v"):
    with tf.name_scope("n1"):
        a = tf.get_variable("a", [1])  # Variable
        x = 1.0 + a  # Op
        with tf.name_scope(""):
            y = 1.0 + a  # Op
            b = tf.get_variable("b", [1])  # Variable: constrained only by variable_scope
print(a.name, x.op.name, y.op.name, b.name, sep='\n')

v/a:0
v/n1/add
add
v/b:0

Because b is a Variable, it is constrained only by variable_scope, so this "return to the top level" has no effect on it.
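
One way to reason about these outputs is to picture two independent prefixes: variables read the variable-scope prefix, and ops read the name-scope prefix. The plain-Python sketch below is a toy model under that assumption, not TensorFlow code; `ToyNaming`, `_join`, and the method names are hypothetical. It further assumes, as the outputs above suggest, that opening a variable_scope also extends the op prefix, and that an empty name_scope clears only the op prefix.

```python
from contextlib import contextmanager


def _join(prefix, name):
    """Join a scope prefix and a name with '/', skipping an empty prefix."""
    return prefix + "/" + name if prefix else name


class ToyNaming:
    """Toy model with separate prefixes for variables and ops."""

    def __init__(self):
        self.var_prefix = ""   # consulted by get_variable
        self.name_prefix = ""  # consulted when an op is created

    @contextmanager
    def variable_scope(self, name):
        saved = (self.var_prefix, self.name_prefix)
        # Assumption: a variable scope extends both prefixes.
        self.var_prefix = _join(self.var_prefix, name)
        self.name_prefix = _join(self.name_prefix, name)
        try:
            yield
        finally:
            self.var_prefix, self.name_prefix = saved

    @contextmanager
    def name_scope(self, name):
        saved = self.name_prefix
        # An empty string resets the op prefix to the top level.
        self.name_prefix = _join(saved, name) if name else ""
        try:
            yield
        finally:
            self.name_prefix = saved

    def get_variable(self, name):
        return _join(self.var_prefix, name) + ":0"

    def op_name(self, op_type):
        return _join(self.name_prefix, op_type)


toy = ToyNaming()
with toy.variable_scope("v"):
    with toy.name_scope("n1"):
        a = toy.get_variable("a")
        x = toy.op_name("add")
        with toy.name_scope(""):
            y = toy.op_name("add")
            b = toy.get_variable("b")

print(a, x, y, b, sep="\n")
# v/a:0
# v/n1/add
# add
# v/b:0
```

The toy model leaves out name uniquification, so it only reproduces the prefixes, not the numeric suffixes TensorFlow adds when an op name is reused.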

Basic graph operations

A graph represents a computation task; every TF program comes with a default graph.

Creating graphs

import tensorflow as tf

# A constant Tensor created on TF's default graph
c = tf.constant(0.0)
print(c.graph)

<tensorflow.python.framework.ops.Graph object at 0x00000000032B99B0>

# Create graph g and create a constant Tensor on it
g = tf.Graph()
with g.as_default():
    c1 = tf.constant(0.0)
print(c1.graph)  # a Tensor's graph attribute gives the graph it belongs to
print(g)

<tensorflow.python.framework.ops.Graph object at 0x000000000A611588>
<tensorflow.python.framework.ops.Graph object at 0x000000000A611588>

# Get the default graph to see which one it is
g2 = tf.get_default_graph()
print(g2)

<tensorflow.python.framework.ops.Graph object at 0x00000000032B99B0>

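# Reset the default graph, which effectively creates a new one
tf.reset_default_graph()  # before calling this, make sure the current graph's resources have all been released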
g3 = tf.get_default_graph()
print(g3)

<tensorflow.python.framework.ops.Graph object at 0x000000000A611550>

Getting a Tensor from a graph

Here we fetch a constant Tensor from the graph; its name is all that is needed to retrieve it.

import tensorflow as tf

g = tf.Graph()
with g.as_default():
    c = tf.constant(0.0)

print(c.name)
# Look up an element by name: get c from the graph via its Tensor name
t = g.get_tensor_by_name(name="Const:0")
print(c is t)

Const:0
True

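That said, I don't yet fully see the point of doing this. Presumably it helps when the graph is accessible but there is no direct Python reference to the tensors inside it, for example after a graph has been loaded from a saved file.

Getting an op from a graph

Note that op is an attribute of a Tensor (the Tensor class in the ops module), not the Tensor itself!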

import tensorflow as tf

# Two constant Tensors
a = tf.constant([[1.0, 2.0]])
b = tf.constant([[1.0], [3.0]])
# Define their matrix multiplication
mymul = tf.matmul(a, b, name='mymul')
print(mymul.op.name)  # note: this is .op.name
# The op lives in the default graph, so get the default graph first
dft_g = tf.get_default_graph()
# Then fetch it from the default graph
mymul_op = dft_g.get_operation_by_name(name="mymul")  # note: no ':0' here
mymul_tensor = dft_g.get_tensor_by_name(name="mymul:0")
print(mymul is mymul_op)
print(mymul_op is mymul_tensor)
print(mymul is mymul_tensor)

mymul
False
False
True

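This example shows that mymul, which looks like an operation, is actually a Tensor rather than an op; the op is only reached through its op attribute. get_operation_by_name returns the op, whereas get_tensor_by_name returns the Tensor. The two are different and easy to confuse.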
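
The naming convention behind the two lookups can be stated compactly: a Tensor name is the name of the op that produces it, followed by a colon and the index of that output, so "mymul:0" is output 0 of the op "mymul". The helper below is a plain-Python sketch of that convention; `split_tensor_name` is a hypothetical function written for this note, not part of TensorFlow.

```python
def split_tensor_name(tensor_name):
    """Split '<op_name>:<output_index>' into (op_name, output_index)."""
    op_name, sep, index = tensor_name.rpartition(":")
    if not sep or not op_name or not index.isdigit():
        raise ValueError(
            "expected '<op_name>:<output_index>', got %r" % (tensor_name,)
        )
    return op_name, int(index)


print(split_tensor_name("mymul:0"))     # ('mymul', 0)
print(split_tensor_name("v/n1/add:0"))  # ('v/n1/add', 0)

try:
    split_tensor_name("mymul")  # an op name, not a Tensor name
except ValueError as err:
    print("rejected:", err)
```

Read this way, the string passed to get_operation_by_name is the first half of the pair, and the string passed to get_tensor_by_name is the whole pair.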

Getting the list of elements

import tensorflow as tf

g = tf.Graph()
with g.as_default():
    c = tf.constant(0.0)
    d = tf.constant(1.1)

ops = g.get_operations()
print(ops)

[<tf.Operation 'Const' type=Const>, <tf.Operation 'Const_1' type=Const>]

The result is the list of all operations in graph g.
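
The names 'Const' and 'Const_1' in the printed list come from automatic name uniquification: neither constant was given a name, so both default to the op type, and the second one receives a numeric suffix. A rough plain-Python sketch of that behavior follows; `ToyNamer` is a hypothetical class, and it deliberately ignores details of the real rule such as scope prefixes.

```python
class ToyNamer:
    """Toy model of op-name uniquification; not TensorFlow's implementation."""

    def __init__(self):
        self._used = {}  # base name -> number of times it has been requested

    def unique_name(self, base):
        count = self._used.get(base, 0)
        self._used[base] = count + 1
        # The first use keeps the base name; later uses get _1, _2, ...
        return base if count == 0 else "%s_%d" % (base, count)


namer = ToyNamer()
names = [namer.unique_name("Const") for _ in range(3)]
print(names)  # ['Const', 'Const_1', 'Const_2']
```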

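Getting an element from an object

The earlier examples fetched elements by name; here the object itself is passed in, which puzzled me even more. According to the book, though, this function (as_graph_element) validates and converts its argument, and is sometimes needed in multithreaded code.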

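import tensorflow as tf

g = tf.Graph()
with g.as_default():
    c1 = tf.constant(0.0)

# as_graph_element returns the graph element that the given object refers to
c1_copy = g.as_graph_element(c1)
print(c1 is c1_copy)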

True