📝 Update Transformer docs and fix the comments in the vocabulary-building section

jiangyangcreate · jiangyangcreate · commit a87fed918d42 · 2025-04-19T12:03:35.000+08:00
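- Added a comment to the vocabulary-building code that walks through the one-hot encoding of the character "我", making the code easier to read and follow.
- Clarified the documentation so readers can more easily follow the implementation details of the Encoder-Decoder structure.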
diff --git a/docs/docs/机器学习/Transformer/Encoder-Decoder结构.md b/docs/docs/机器学习/Transformer/Encoder-Decoder结构.md
@@ -32,9 +32,10 @@ title: 🚧Encoder-Decoder结构
 | \<PAD\>    | 0.00  | 0.00  | 0.00  | 0.00  |
 | \<UNK\>    | 0.12  | -0.51 | 0.32  | 0.89  |
 | 我         | 0.87  | 0.42  | -0.26 | 0.35  |
-| 爱         | 0.65  | 0.71  | 0.38  | -0.15 |
 | 机器       | 0.32  | 0.52  | 0.75  | 0.22  |
 | 学习       | 0.45  | 0.68  | 0.21  | 0.37  |
+| 爱         | 0.65  | 0.71  | 0.38  | -0.15 |
+
 
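 When the sentence "我爱学习机器学习" is given as input, it is tokenized into ["我", "爱", "学习", "机器", "学习"]; each token is then looked up in the embedding matrix, yielding a sequence of vector representations: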
 
@@ -77,7 +78,9 @@ embedding_matrix = np.array([
 def word_to_onehot(word, vocab_size):
     onehot = np.zeros(vocab_size)
     if word in vocab:
-        onehot[vocab[word]] = 1 # the value at the word's index is 1; all others are 0.
+        onehot[vocab[word]] = 1
+        # Example: for the character "我", vocab["我"] is 2, so the value at index 2 is 1 and all others are 0.
+        # onehot[2] = 1, i.e. [0, 0, 1, 0, 0, ..., 0]
     else:
         onehot[vocab["<UNK>"]] = 1 # unknown word: the <UNK> index is 1, all others are 0, i.e. [0, 1, 0, 0, 0, ..., 0]
     return onehot # 1-D array of shape (vocab_size,), e.g. (5000,)
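
Reviewer note: the one-hot comment added here connects directly to the embedding lookup described in the first hunk. The following is a minimal sketch of that connection, not the document's actual code. It assumes a toy six-entry vocabulary whose indices follow the row order of the table after this change (the real `vocab` and `embedding_matrix` are not fully visible in the diff), and the helper `embed` is hypothetical.

```python
import numpy as np

# Assumption: toy vocabulary, indexed in the table's row order after this change.
vocab = {"<PAD>": 0, "<UNK>": 1, "我": 2, "机器": 3, "学习": 4, "爱": 5}

# Embedding values copied from the table rows in the diff (4 dimensions per entry).
embedding_matrix = np.array([
    [0.00, 0.00, 0.00, 0.00],    # <PAD>
    [0.12, -0.51, 0.32, 0.89],   # <UNK>
    [0.87, 0.42, -0.26, 0.35],   # 我
    [0.32, 0.52, 0.75, 0.22],    # 机器
    [0.45, 0.68, 0.21, 0.37],    # 学习
    [0.65, 0.71, 0.38, -0.15],   # 爱
])

def word_to_onehot(word, vocab):
    """Return a 1-D one-hot vector of length len(vocab); unknown words map to <UNK>."""
    onehot = np.zeros(len(vocab))
    onehot[vocab.get(word, vocab["<UNK>"])] = 1
    return onehot

def embed(tokens, vocab, embedding_matrix):
    """Hypothetical helper: stack one-hot rows and multiply by the embedding matrix."""
    onehots = np.stack([word_to_onehot(t, vocab) for t in tokens])
    # Multiplying a one-hot row by the matrix selects the matching embedding row.
    return onehots @ embedding_matrix

tokens = ["我", "爱", "学习", "机器", "学习"]
vectors = embed(tokens, vocab, embedding_matrix)  # one 4-dimensional row per token
```

Because each one-hot row contains a single 1, the matrix product gives the same result as indexing the embedding matrix directly with the token indices, which is why practical implementations usually skip the one-hot step and index the rows instead.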