Getting Started with TensorFlow

netouch, 2024-04-10, posted from Beijing

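I. Steps:

① Prepare the data: collect a large number of feature/label pairs.
② Build the network: define the structure of the neural network.
③ Optimize the parameters: train the network to find the best parameters (backpropagation).
④ Apply the network: save the network as a model, feed it new data, and output a classification or prediction (forward propagation).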

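II. Iris Classification

1. Collect a large number of data pairs (sepal length, sepal width, petal length, petal width, and the corresponding class) to form the dataset: input features plus labels.
Feed the dataset into the neural network you have built; the network optimizes its parameters to produce a model, and the model then reads new input features and outputs a prediction.
2. The input layer has four nodes (the four input features) and the output layer has three nodes (the three classes).
3. Forward propagation: $y = xw + b$, where y is the 1 x 3 output matrix, x is the 1 x 4 input feature matrix, w is the 4 x 3 weight matrix, and b is the 1 x 3 bias matrix (one entry per output node).
4. Loss function (loss): the gap between the prediction y and the ground truth y_. It gives a quantitative measure of how good w and b are; w and b are optimal when the loss reaches its minimum.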
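5. Mean squared error: the mean of the squared differences between y and y_, $\mathrm{MSE}(y, y\_) = \frac{1}{n}\sum (y - y\_)^2$, where the sum runs over all n elements.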
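6. Gradient descent:
Goal: find a set of parameters w and b that minimizes the loss function.
Gradient: the vector of partial derivatives of a function with respect to each of its parameters. Moving against the gradient (the descent direction) is the direction in which the function decreases.
Gradient descent method: search for the minimum of the loss function by moving along the descent direction, which yields the optimal parameters.
Learning rate (lr): if it is set too small, convergence becomes very slow; if it is too large, the parameters may oscillate back and forth around the minimum or fail to converge at all.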
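Backpropagation: working from the output layer backward, compute the partial derivative of the loss with respect to the parameters of each layer, and iteratively update all parameters.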
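
To make the shapes in items 3 to 5 concrete, here is a minimal NumPy sketch of one forward pass followed by the mean squared error. The input row, the constant weights, and the target vector are made-up values chosen for illustration, not part of the original notes.

```python
import numpy as np

# Hypothetical values, chosen only to make the shapes concrete.
x = np.array([[5.1, 3.5, 1.4, 0.2]])   # 1 x 4 input features
w = np.full((4, 3), 0.1)                # 4 x 3 weights
b = np.zeros((1, 3))                    # 1 x 3 biases

y = x @ w + b                           # forward pass, shape 1 x 3
y_true = np.array([[1.0, 0.0, 0.0]])    # ground truth y_, one value per class

loss = np.mean((y - y_true) ** 2)       # mean squared error over the 3 outputs
print(y.shape, loss)
```

A smaller loss means that w and b map this input closer to its target, which is what training tries to achieve.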

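The following code runs gradient descent on the loss (w + 1)^2:

import tensorflow as tf

w = tf.Variable(tf.constant(value=[5], shape=(1, 1), dtype=tf.dtypes.float32)) # weight tensor; tf.Variable marks it as trainable
lr = 0.2 # learning rate
epoch = 50 # number of iterations

for i in range(epoch):
    with tf.GradientTape() as tape: # records operations for automatic differentiation
        loss = tf.square(w + 1) # loss function; tf.square squares element-wise
    grads = tape.gradient(loss, w) # gradient of loss with respect to w
    w.assign_sub(lr * grads) # in-place update, equivalent to w -= lr * grads
    # w and loss have shape (1, 1), so extract Python scalars before formatting;
    # loss is the value computed before this update
    print('After %d epoch, w is %f, loss is %f' % (i + 1, w.numpy().item(), loss.numpy().item()))

# w.numpy() converts a tensor to a NumPy array, as the following code shows.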

a = tf.Variable(tf.constant([[1, 2, 3], [2, 3, 4]], shape=[2, 3], dtype=tf.dtypes.float32))
print(a)
# <tf.Variable 'Variable:0' shape=(2, 3) dtype=float32, numpy=
# array([[1., 2., 3.],
#        [2., 3., 4.]], dtype=float32)>
print(a.numpy())
# [[1. 2. 3.]
#  [2. 3. 4.]]
print(type(a.numpy()))
# <class 'numpy.ndarray'>
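
The remark about the learning rate can be checked on the same loss, (w + 1)^2. Its gradient is 2(w + 1), so each update multiplies (w + 1) by (1 - 2 * lr). The sketch below uses plain Python with the derivative written out by hand; the function name `run_gradient_descent` and the three learning rates are assumptions picked for illustration.

```python
def run_gradient_descent(lr, steps=20, w=5.0):
    """Minimize loss = (w + 1)^2 with plain gradient descent and return the final w."""
    for _ in range(steps):
        grad = 2 * (w + 1)   # d(loss)/dw, derived by hand
        w -= lr * grad       # same update as w.assign_sub(lr * grads)
    return w

for lr in (0.001, 0.2, 1.0):
    print(lr, run_gradient_descent(lr))
```

With lr = 0.001, w has barely moved from 5 after 20 steps; with lr = 0.2 it is already close to the minimum at w = -1; with lr = 1.0 it jumps between 5 and -7 and never settles.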

III. Common Functions

1. Creating tensors

a = tf.Variable(tf.constant(5.0))
b = tf.Variable(tf.constant([5.0]))
c = tf.Variable(tf.constant([[5.0]]))
d = tf.Variable(tf.constant([5.0], shape=(1, 1)))
e = tf.Variable(tf.constant([5.0], shape=(1,)))
# Note the differences in shape
print(a) # <tf.Variable 'Variable:0' shape=() dtype=float32, numpy=5.0>
print(b) # <tf.Variable 'Variable:0' shape=(1,) dtype=float32, numpy=array([5.], dtype=float32)>
print(c) # <tf.Variable 'Variable:0' shape=(1, 1) dtype=float32, numpy=array([[5.]], dtype=float32)>
print(d) # <tf.Variable 'Variable:0' shape=(1, 1) dtype=float32, numpy=array([[5.]], dtype=float32)>
print(e) # <tf.Variable 'Variable:0' shape=(1,) dtype=float32, numpy=array([5.], dtype=float32)>

2. Type casting and min/max

a = tf.constant([1, 2], shape=(1, 2), dtype=tf.dtypes.float32, name='a')

# Cast to another dtype
b = tf.cast(x=a, dtype=tf.dtypes.int32) # tf.cast(tensor, dtype=target_dtype)
print(a) # tf.Tensor([[1. 2.]], shape=(1, 2), dtype=float32)
print(b) # tf.Tensor([[1 2]], shape=(1, 2), dtype=int32)

# Minimum over all elements of the tensor (no axis given)
print(tf.reduce_min(input_tensor=a)) # tf.Tensor(1.0, shape=(), dtype=float32)

# Maximum over all elements of the tensor (no axis given)
print(tf.reduce_max(input_tensor=a)) # tf.Tensor(2.0, shape=(), dtype=float32)

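3. The axis parameter: for a two-dimensional tensor, setting axis to 0 or 1 selects the dimension to reduce over. axis=0 reduces down the rows (one result per column), and axis=1 reduces across the columns (one result per row); if axis is not given, all elements take part in the computation.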

a = tf.constant([[1, 2, 3],
                 [2, 3, 4]], dtype=tf.dtypes.float32, shape=(2, 3), name='a')

# Mean along the given axis
b = tf.reduce_mean(input_tensor=a, axis=0)
c = tf.reduce_mean(input_tensor=a, axis=1)
print(b) # tf.Tensor([1.5 2.5 3.5], shape=(3,), dtype=float32)
print(c) # tf.Tensor([2. 3.], shape=(2,), dtype=float32)

# Sum along the given axis
d = tf.reduce_sum(input_tensor=a, axis=0)
e = tf.reduce_sum(input_tensor=a, axis=1)
print(d) # tf.Tensor([3. 5. 7.], shape=(3,), dtype=float32)
print(e) # tf.Tensor([6. 9.], shape=(2,), dtype=float32)
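
The case where axis is omitted is not covered by the example above. As a small sketch on the same 2 x 3 tensor, the reductions below run over all six elements; the `keepdims=True` call is an extra illustration, not something the original notes use.

```python
import tensorflow as tf

a = tf.constant([[1, 2, 3],
                 [2, 3, 4]], dtype=tf.float32)

total_mean = tf.reduce_mean(a)   # mean of all six elements
total_sum = tf.reduce_sum(a)     # sum of all six elements
col_mean_2d = tf.reduce_mean(a, axis=0, keepdims=True)  # keeps the reduced axis with size 1

print(total_mean.numpy(), total_sum.numpy(), col_mean_2d.shape)
```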

4. tf.Variable() marks a variable as trainable; marked variables have their gradients tracked during backpropagation. In neural network training it is commonly used to mark the parameters to be trained.
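
As a minimal sketch of what being marked trainable means in practice, the snippet below differentiates the same expression with respect to a tf.Variable and a plain tf.constant. The names `w` and `c` and the value 3.0 are chosen for illustration only.

```python
import tensorflow as tf

w = tf.Variable(3.0)    # trainable: watched by the tape automatically
c = tf.constant(3.0)    # plain tensor: not watched unless tape.watch(c) is called

with tf.GradientTape() as tape:
    loss = w * w + c * c

grad_w, grad_c = tape.gradient(loss, [w, c])
print(grad_w)   # a tensor holding d(loss)/dw = 2 * w
print(grad_c)   # None, because c was never watched
```
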
5. Math operations:

a = tf.constant(value=[1, 2, 3], dtype=tf.dtypes.float32, shape=(1, 3), name='a')
b = tf.constant(value=[2, 3, 4], dtype=tf.dtypes.float32, shape=(1, 3), name='b')

# Element-wise arithmetic (same shapes): add, subtract, multiply, divide
print(tf.add(x=a, y=b, name='add')) # tf.Tensor([[3. 5. 7.]], shape=(1, 3), dtype=float32)
print(tf.subtract(x=a, y=b, name='subtract')) # tf.Tensor([[-1. -1. -1.]], shape=(1, 3), dtype=float32)
print(tf.multiply(x=a, y=b, name='multiply')) # tf.Tensor([[ 2.  6. 12.]], shape=(1, 3), dtype=float32)
print(tf.divide(x=a, y=b, name='divide')) # tf.Tensor([[0.5       0.6666667 0.75     ]], shape=(1, 3), dtype=float32)

# Square, n-th power, and square root
c = tf.fill(dims=(1, 2), value=2., name='c') # a 1 x 2 tensor filled with the value 2
print(tf.square(x=c, name='square')) # tf.Tensor([[4. 4.]], shape=(1, 2), dtype=float32)
print(tf.pow(x=c, y=3, name='pow')) # tf.Tensor([[8. 8.]], shape=(1, 2), dtype=float32)
print(tf.sqrt(x=c, name='sqrt')) # tf.Tensor([[1.4142135 1.4142135]], shape=(1, 2), dtype=float32)

# Matrix multiplication
e = tf.constant([[2, 3, 4],
                 [3, 4, 5]], shape=(2, 3), dtype=tf.dtypes.float32, name='e')
f = tf.constant([[2, 3],
                 [3, 4],
                 [4, 5]], shape=(3, 2), dtype=tf.dtypes.float32, name='f')
print(tf.matmul(a=e, b=f, name='matmul'))
# tf.Tensor(
# [[29. 38.]
#  [38. 50.]], shape=(2, 2), dtype=float32)

6. During training, input features and labels are fed to the network in pairs. tf.data.Dataset.from_tensor_slices() slices the tensors passed to it along their first dimension and produces feature/label pairs. Build the dataset with data = tf.data.Dataset.from_tensor_slices((features, labels)); both NumPy arrays and Tensors can be read this way.

features = tf.constant([12, 23, 10, 17], dtype=tf.dtypes.float32, name='features')
labels = tf.constant([0, 1, 1, 0], dtype=tf.dtypes.int32, name='labels')

datasets = tf.data.Dataset.from_tensor_slices((features, labels))
print(datasets) # e.g. <TensorSliceDataset shapes: ((), ()), types: (tf.float32, tf.int32)>; the exact repr depends on the TensorFlow version

for element in datasets:
    print(element)
    pass
# (<tf.Tensor: shape=(), dtype=float32, numpy=12.0>, <tf.Tensor: shape=(), dtype=int32, numpy=0>)
# (<tf.Tensor: shape=(), dtype=float32, numpy=23.0>, <tf.Tensor: shape=(), dtype=int32, numpy=1>)
# (<tf.Tensor: shape=(), dtype=float32, numpy=10.0>, <tf.Tensor: shape=(), dtype=int32, numpy=1>)
# (<tf.Tensor: shape=(), dtype=float32, numpy=17.0>, <tf.Tensor: shape=(), dtype=int32, numpy=0>)
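
The statement that NumPy arrays can be read the same way can be sketched as follows, together with `.batch()`, which groups the pairs into batches. The feature values and the batch size of 3 are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# 4 samples with 2 features each (made-up values) and their labels
features = np.array([[12., 1.], [23., 2.], [10., 3.], [17., 4.]], dtype=np.float32)
labels = np.array([0, 1, 1, 0], dtype=np.int32)

# Slice along the first dimension, then group the pairs into batches of 3
dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(3)

batch_shapes = [(x.shape, y.shape) for x, y in dataset]
print(batch_shapes)
```

The last batch holds only the one remaining sample, because 4 is not a multiple of 3.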

7. Differentiating a function with respect to a given parameter

w = tf.Variable(tf.constant(value=[3.], shape=(1, 1), dtype=tf.dtypes.float32))
with tf.GradientTape() as tape:
    loss = tf.square(w + 1) # operations on w are recorded inside the with block
grad = tape.gradient(target=loss, sources=w) # d(loss)/dw

print(grad) # tf.Tensor([[8.]], shape=(1, 1), dtype=float32)
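
The printed value can be checked by hand. Differentiating the loss and substituting w = 3 gives:

```latex
\frac{\partial\, loss}{\partial w}
  = \frac{\partial (w + 1)^2}{\partial w}
  = 2(w + 1),
\qquad
\left.\frac{\partial\, loss}{\partial w}\right|_{w = 3} = 2(3 + 1) = 8
```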

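8. enumerate(): iterates over a sequence (such as a list, tuple, or string) and yields (index, element) pairs; it is commonly used in for loops.

items = ['one', 'two', 'three'] # named items rather than list, which would shadow the built-in
for index, element in enumerate(items):
    print(index, element)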
# 0 one
# 1 two
# 2 three

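9. One-hot encoding, tf.one_hot(): in classification problems, labels are commonly one-hot encoded to mark the class: 1 means yes, 0 means no.
For iris (label 0 is Iris setosa, 1 is Iris versicolor, 2 is Iris virginica),
a label of 1 means the class is Iris versicolor, which in one-hot form is (0. 1. 0.)

# tf.one_hot(indices=data_to_convert, depth=number_of_classes)

classes = 3 # three classes
labels = tf.constant([1, 0, 2]) # the smallest value is 0 and the largest is 2
output = tf.one_hot(indices=labels, depth=classes)

print(output)
# tf.Tensor(
# [[0. 1. 0.]   <- label 1
#  [1. 0. 0.]   <- label 0
#  [0. 0. 1.]], shape=(3, 3), dtype=float32)   <- label 2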

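10. In a classification problem, the forward pass produces a score for each class. These scores can be compared with one-hot labels only after they form a probability distribution, so tf.nn.softmax(x) is used to make the outputs a probability distribution.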
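
A minimal sketch of this step, with made-up scores for three classes. The second computation writes the softmax formula out by hand, exp(y_i) / sum_j exp(y_j), to show what tf.nn.softmax computes.

```python
import tensorflow as tf

y = tf.constant([1.01, 2.01, -0.66])   # made-up raw scores for three classes
y_prob = tf.nn.softmax(y)

# Same computation written out: exp(y_i) / sum_j exp(y_j)
manual = tf.exp(y) / tf.reduce_sum(tf.exp(y))

print(y_prob.numpy())                  # roughly [0.256 0.696 0.048]
print(tf.reduce_sum(y_prob).numpy())   # approximately 1
```

Every entry lies between 0 and 1 and the entries sum to 1, so the result can be compared directly with a one-hot label.
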
11. tf.argmax() returns the index of the maximum value along the given axis.

import numpy as np
import tensorflow as tf

test = np.array([[1, 2, 3], [2, 3, 4], [5, 4, 3], [8, 7, 2]])
print('test:\n', test)
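print('Index of the max in each column:', tf.argmax(test, axis=0))  # reduce down the rows: one index per column
print('Index of the max in each row:', tf.argmax(test, axis=1))  # reduce across the columns: one index per row
# Index of the max in each column: tf.Tensor([3 3 1], shape=(3,), dtype=int64)
# Index of the max in each row: tf.Tensor([2 2 0 0], shape=(4,), dtype=int64)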

IV. Iris Classification with a Neural Network

1. Loading the iris dataset: load it directly from the datasets module of the sklearn package:

from sklearn import datasets

feature_data = datasets.load_iris().data
label_data = datasets.load_iris().target

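2. Steps:
① Prepare the data: load the dataset, shuffle it, split it into training and test sets, pair features with labels, and read one small batch at a time.
② Build the network: define all trainable parameters of the neural network.
③ Optimize the parameters: iterate in nested loops, compute gradients inside the with block and update the parameters, and display the current loss.
④ Test: compute the accuracy of a forward pass with the current parameters and display the current accuracy.
⑤ Visualize accuracy and loss.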
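
The shuffle-and-split part of step ① can also be written with a single shared permutation index, which keeps features and labels aligned without re-seeding the generator twice. This is an alternative sketch, not the approach the full script uses; the stand-in arrays only mimic the shapes of the iris data.

```python
import numpy as np

# Stand-in arrays with the same shapes as the iris data (150 x 4 features, 150 labels).
features = np.arange(600, dtype=np.float32).reshape(150, 4)
labels = np.arange(150)

rng = np.random.default_rng(110)          # seeded generator for reproducibility
order = rng.permutation(len(features))    # one shuffled index shared by both arrays

features, labels = features[order], labels[order]
x_train, y_train = features[:-30], labels[:-30]   # first 120 rows
x_test, y_test = features[-30:], labels[-30:]     # last 30 rows
print(x_train.shape, x_test.shape)
```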

import tensorflow as tf
from sklearn import datasets # dataset loading
from matplotlib import pyplot as plt # plotting
import numpy as np

# Load the dataset
feature_data = datasets.load_iris().data # input features
label_data = datasets.load_iris().target # labels

# Shuffle the dataset (the raw data is ordered by class; leaving it unshuffled hurts accuracy)
seed = 110 # re-seeding with the same value before each shuffle keeps features and labels aligned
np.random.seed(seed=seed)
np.random.shuffle(feature_data)
np.random.seed(seed=seed)
np.random.shuffle(label_data)
tf.random.set_seed(seed=seed)

# Split the shuffled data: the first 120 rows for training, the last 30 rows for testing
feature_train = feature_data[:-30]
label_train = label_data[:-30]
feature_test = feature_data[-30:]
label_test = label_data[-30:]

# Cast the features to float32 to avoid a dtype error in the matrix multiplication
feature_train = tf.cast(x=feature_train, dtype=tf.dtypes.float32)
feature_test = tf.cast(x=feature_test, dtype=tf.dtypes.float32)

# Pair features with labels and feed one batch at a time
train_db = tf.data.Dataset.from_tensor_slices(tensors=(feature_train, label_train)).batch(32)
test_db = tf.data.Dataset.from_tensor_slices(tensors=(feature_test, label_test)).batch(32)

# Network parameters: 4 input features, so 4 input nodes; 3 classes, so 3 output neurons
# tf.Variable() marks the parameters as trainable
w = tf.Variable(tf.random.truncated_normal(shape=[4, 3], stddev=0.1))
b = tf.Variable(tf.random.truncated_normal(shape=[1, 3], stddev=0.1))

lr = 0.1 # learning rate
train_loss_result = [] # loss of each epoch, used later to plot the loss curve
test_accuracy = [] # accuracy of each epoch, used later to plot the accuracy curve
epoch = 500 # number of training epochs
loss_all = 0 # each epoch has 4 steps (120 samples in batches of 32); loss_all sums the 4 losses

# Training
for i in range(epoch): # dataset-level loop: one pass over the dataset per epoch
    for step, (x_batch, y_batch) in enumerate(train_db): # batch-level loop: one batch per step
        with tf.GradientTape() as tape:
            y = tf.nn.softmax(tf.matmul(x_batch, w) + b) # forward pass; softmax makes y a probability distribution (same scale as the one-hot labels, so the two can be subtracted)
            y_ = tf.one_hot(indices=y_batch, depth=3) # one-hot encode the labels for the loss computation
            loss = tf.reduce_mean(tf.square(y - y_)) # mean squared error loss
        loss_all += loss.numpy() # accumulate the loss of this step
        grads = tape.gradient(target=loss, sources=[w, b])

        # Gradient descent: update the parameters with the backpropagated gradients
        w.assign_sub(lr * grads[0])
        b.assign_sub(lr * grads[1])

    # Print the loss for this epoch
    print('Epoch {}, loss: {}'.format(i + 1, loss_all / 4))
    train_loss_result.append(loss_all / 4)  # record the mean loss over the 4 steps
    loss_all = 0  # reset loss_all for the next epoch

    # Testing
    # total_correct counts correct predictions and total_number counts test samples; both start at 0
    total_correct, total_number = 0, 0
    for x_batch, y_batch in test_db:
        # Predict with the updated parameters
        y = tf.nn.softmax(tf.matmul(x_batch, w) + b) # 30 x 3
        # Index of the largest value in each row, which is the predicted class
        index = tf.argmax(input=y, axis=1)
        index = tf.cast(x=index, dtype=y_batch.dtype) # match the dtype of the labels
        # 1 where the prediction is correct and 0 otherwise (bool cast to int)
        correct = tf.cast(tf.equal(index, y_batch), dtype=tf.dtypes.int32)
        # Number of correct predictions in this batch
        correct = tf.reduce_sum(correct)
        # Accumulate the correct counts over all batches
        total_correct += int(correct)
        # shape[0] is the number of rows (samples) in this batch
        total_number += x_batch.shape[0]
    # Overall accuracy is total_correct / total_number
    accuracy = total_correct / total_number
    test_accuracy.append(accuracy)
    print('Test_accuracy:', accuracy)
    print('--------------------------')

# Plot the loss curve
plt.title('Loss Function Curve')  # figure title
plt.xlabel('Epoch')  # x-axis label
plt.ylabel('Loss')  # y-axis label
plt.plot(train_loss_result, label='$Loss$')  # plot the train_loss_result values point by point and connect them; the legend entry is Loss
plt.legend()  # draw the legend
plt.show()  # display the figure

# Plot the accuracy curve
plt.title('Acc Curve')  # figure title
plt.xlabel('Epoch')  # x-axis label
plt.ylabel('Acc')  # y-axis label
plt.plot(test_accuracy, label='$Accuracy$')  # plot the test_accuracy values point by point and connect them; the legend entry is Accuracy
plt.legend()
plt.show()
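
Step ④ of Section I, applying the network to new data, is not shown in the script above. A minimal sketch could look like the following. The `predict` helper and the hand-written `w_demo` and `b_demo` are hypothetical placeholders, not trained values; in practice the trained w and b from the script would be passed in.

```python
import tensorflow as tf

def predict(x, w, b):
    """Return the predicted class index for each row of x."""
    probs = tf.nn.softmax(tf.matmul(x, w) + b)   # forward pass, one probability per class
    return tf.argmax(probs, axis=1)              # index of the most probable class

# Placeholder parameters, NOT trained values; use the trained w and b in practice.
w_demo = tf.constant([[ 0.1, 0.0, -0.1],
                      [ 0.2, 0.0, -0.2],
                      [-0.3, 0.1,  0.3],
                      [-0.3, 0.1,  0.3]], dtype=tf.float32)
b_demo = tf.zeros([1, 3])

# One new flower: sepal length, sepal width, petal length, petal width
new_sample = tf.constant([[5.1, 3.5, 1.4, 0.2]], dtype=tf.float32)
prediction = predict(new_sample, w_demo, b_demo)
print(prediction.numpy())
```

The returned index uses the same label convention as the dataset, so 0, 1, and 2 correspond to the three iris classes.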