Commit 41f2d90

fix
1 parent: 6afbdf4

File tree

5 files changed (+707 −712 lines)


_typos.toml

Lines changed: 0 additions & 5 deletions
@@ -55,11 +55,6 @@ instrinsics = "instrinsics"
 interchangable = "interchangable"
 intializers = "intializers"
 intput = "intput"
-lable = "lable"
-learing = "learing"
-legth = "legth"
-lenth = "lenth"
-leran = "leran"
 libary = "libary"
 mantained = "mantained"
 matrics = "matrics"

docs/api/paddle/static/accuracy_cn.rst

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ accuracy

 accuracy layer. See https://en.wikipedia.org/wiki/Precision_and_recall

-Computes accuracy using the input and the label. If the correct label is among the top-k predictions, the count is incremented by 1. Note: the dtype of the output accuracy is determined by the dtype of input; input and lable may have different dtypes.
+Computes accuracy using the input and the label. If the correct label is among the top-k predictions, the count is incremented by 1. Note: the dtype of the output accuracy is determined by the dtype of input; input and label may have different dtypes.

 Parameters
 ::::::::::::
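The top-k rule corrected above (count a sample if its true label appears among the k highest-scoring predictions) can be sketched in plain NumPy; this is an illustrative stand-in, not Paddle's implementation, and the function name is hypothetical.

```python
import numpy as np

def topk_accuracy(pred, label, k=1):
    """Fraction of samples whose true label is among the top-k predictions.

    pred:  (N, C) array of per-class scores.
    label: (N,) array of integer class ids.
    """
    # Indices of the k highest-scoring classes for each sample.
    topk = np.argsort(pred, axis=1)[:, -k:]
    # A sample counts as correct if its label is among those k indices.
    hits = (topk == label[:, None]).any(axis=1)
    return hits.mean()

scores = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1]])
labels = np.array([2, 0])
print(topk_accuracy(scores, labels, k=1))  # 0.5 (only sample 1 is a top-1 hit)
print(topk_accuracy(scores, labels, k=2))  # 1.0 (label 2 is in sample 0's top-2)
```

As the corrected sentence notes, the prediction scores and the integer labels can have different dtypes; only the index comparison matters.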

docs/design/memory/memory_optimization.md

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ In compilers, the front end of the compiler translates programs into an intermediate

 Therefore, the compiler needs to analyze the intermediate-representation program to determine which temporary variables are in use at the same time. We say a variable is "live" if it holds a value that may be needed in the future, so this analysis is called liveness analysis.

-We can leran these techniques from compilers. There are mainly two stages to make live variable analysis:
+We can learn these techniques from compilers. There are mainly two stages to make live variable analysis:

 - construct a control flow graph
 - solve the dataflow equations
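The two stages in the patched text can be sketched as follows, using a hypothetical three-node straight-line program; the standard liveness equations are in[n] = use[n] ∪ (out[n] − def[n]) and out[n] = ∪ in[s] over successors s, iterated to a fixed point.

```python
# Stage 1: control flow graph as successor lists, with per-node use/def sets.
# Node 0 defines a; node 1 uses a and defines b; node 2 uses b.
succ = {0: [1], 1: [2], 2: []}
use  = {0: set(),  1: {"a"}, 2: {"b"}}
defs = {0: {"a"}, 1: {"b"}, 2: set()}

# Stage 2: solve the dataflow equations by fixed-point iteration:
#   in[n]  = use[n] | (out[n] - def[n])
#   out[n] = union of in[s] for each successor s of n
live_in  = {n: set() for n in succ}
live_out = {n: set() for n in succ}
changed = True
while changed:
    changed = False
    for n in succ:
        new_out = set().union(*(live_in[s] for s in succ[n])) if succ[n] else set()
        new_in = use[n] | (new_out - defs[n])
        if new_in != live_in[n] or new_out != live_out[n]:
            live_in[n], live_out[n], changed = new_in, new_out, True

print(live_out[0])  # {'a'}: a is live after node 0 because node 1 still needs it
```

Two variables are then candidates to share memory exactly when their live ranges never overlap, which is the property the memory optimizer exploits.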

docs/practices/gan/cyclegan/cyclegan.ipynb

Lines changed: 704 additions & 704 deletions
Large diffs are not rendered by default.

docs/practices/nlp/transformer_in_English-to-Spanish.ipynb

Lines changed: 1 addition & 1 deletion
@@ -1170,7 +1170,7 @@
 "source": [
 "### 4.2 Encoder\n",
 "The Encoder consists mainly of multi-head attention, layer normalization, and a feed-forward network. The input passes in turn through the multi-head attention module, a residual module formed with a normalization layer, the feed-forward network module, and another residual module formed with a normalization layer.\n",
-"* Multi-head attention (MultiHeadAttention): implemented with [paddle.nn.MultiHeadAttention](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/MultiHeadAttention_cn.html#multiheadattention); note that its mask attn_mask must have shape [batch_szie,num_heads,sequence_legth,sequence_legth].\n",
+"* Multi-head attention (MultiHeadAttention): implemented with [paddle.nn.MultiHeadAttention](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/MultiHeadAttention_cn.html#multiheadattention); note that its mask attn_mask must have shape [batch_szie,num_heads,sequence_length,sequence_length].\n",
 "* Feed-forward network (Feed Forward): after passing through the MultiHeadAttention layer, the input goes through a feed forward layer. The model's feed forward is position-wise: a fully connected layer is applied to the input, followed by a Relu activation, then another fully connected layer.\n",
 "* Residual network: formed by adding the result after normalization (LayerNorm) to the earlier input. LayerNorm computes the mean and variance over each individual sample.\n"
 ]
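The corrected 4-D mask shape [batch, num_heads, sequence_length, sequence_length] can be illustrated with a framework-agnostic NumPy sketch of masked scaled dot-product attention; the sizes are made up for the example, and this is not Paddle's implementation.

```python
import numpy as np

# Hypothetical sizes; the point is only the 4-D attn_mask shape.
batch, num_heads, seq_len, head_dim = 2, 4, 5, 8
rng = np.random.default_rng(0)
q = rng.random((batch, num_heads, seq_len, head_dim))
k = rng.random((batch, num_heads, seq_len, head_dim))
v = rng.random((batch, num_heads, seq_len, head_dim))

# attn_mask with shape [batch, num_heads, seq_len, seq_len]:
# here a causal mask, 0 where attention is allowed, -inf where blocked.
causal = np.triu(np.full((seq_len, seq_len), -np.inf), k=1)
attn_mask = np.broadcast_to(causal, (batch, num_heads, seq_len, seq_len))

# Scaled dot-product attention with the additive mask, then row softmax.
scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(head_dim) + attn_mask
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ v
print(out.shape)  # (2, 4, 5, 8)
```

The mask is applied per head and per query/key position pair, which is why all four dimensions appear in its shape.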

0 commit comments