24 changes: 17 additions & 7 deletions docs/en/guide/recipes/configure-autoscaling.md
@@ -1,20 +1,30 @@
# Configure AutoScaling for AI Workloads

## Enable AutoScaling
## Step 1. Enable AutoScaling

### Simple Configuration with Pod AutoScaling Annotations
### Add Pod AutoScaling Annotations

> To be used in conjunction with the workload annotations described in [Create Workload](/guide/recipes/create-workload#add-pod-annotations)

```yaml
# Enable vertical scaling
autoResources: true
tensor-fusion.ai/auto-resources: 'true'
# Configure the target resource, options: all|tflops|vram; if empty, only recommendations are provided
targetResource: all
tensor-fusion.ai/auto-scale-target-resource: all
# Enable horizontal scaling
autoReplicas: true
tensor-fusion.ai/auto-replicas: 'true'
```
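
For context, a minimal sketch of a Pod carrying these annotations might look like the following; the name, container, and image are placeholders, and the workload annotations from [Create Workload](/guide/recipes/create-workload#add-pod-annotations) still need to be added alongside them:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-demo                              # placeholder name
  annotations:
    # ...workload annotations from Create Workload go here as well...
    tensor-fusion.ai/auto-resources: 'true'
    tensor-fusion.ai/auto-scale-target-resource: 'all'
    tensor-fusion.ai/auto-replicas: 'true'
spec:
  containers:
    - name: app                                     # placeholder container
      image: my-inference-image:latest              # placeholder image
```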

### Detailed Configuration Using the Workload Configuration File

* Vertical Scaling: Uses the community VPA histogram algorithm on historical GPU resource usage data.
The estimates produced by the VPA algorithm consist of Target, LowerBound, and UpperBound, corresponding by default to the P90, P50, and P95 usage levels.
If current resource usage falls outside the range between LowerBound and UpperBound, a recommended value is generated (a numeric illustration follows the configuration example below).

> [!NOTE] If `enable` is not set to `true`, or `targetResource` is empty, only resource recommendations are generated; the recommended values are not actually applied.

* Cron Scaling: Based on standard cron expressions; scaling takes effect when `enable` is `true` and the current time falls within the `start` and `end` window. Outside this window, resources revert to the values specified when the workload was added. [Cron Expression Reference](https://en.wikipedia.org/wiki/Cron)

```yaml
autoScalingConfig:
# Vertical scaling configuration
@@ -60,9 +70,9 @@ autoScalingConfig:
vram: 5Gi
```
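
To make the bounds logic above concrete, here is a purely illustrative sketch with invented numbers; the exact recommended value depends on the VPA estimator, but the description above suggests it is the Target estimate:

```yaml
# Illustration only, not configuration fields:
# histogram of observed VRAM usage  ->  LowerBound (P50): 2Gi
#                                       Target     (P90): 4Gi
#                                       UpperBound (P95): 5Gi
# current usage 6Gi -> above UpperBound -> a recommendation (~4Gi) is generated
# current usage 1Gi -> below LowerBound -> a recommendation is generated as well
# current usage 3Gi -> within the bounds -> no change is recommended
```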

## Monitor Scaling Status
## Step 2. Monitor Scaling Status

### View GPU Resource Recommendations via TensorFusionWorkload Status
> The workload generates a corresponding `TensorFusionWorkload` resource object, and the fields in `Status` reflect the current scaling status in real time.

```yaml
status:
  # … (remaining status fields collapsed in this diff)
```

38 changes: 24 additions & 14 deletions docs/zh/guide/recipes/configure-autoscaling.md
@@ -1,20 +1,30 @@
# Configure AutoScaling for AI Workloads

## Enable AutoScaling
## Step 1. Enable AutoScaling

### Simple Configuration with Pod AutoScaling Annotations
### Add Pod AutoScaling Annotations

> To be used in conjunction with the workload annotations described in [Create Workload](/zh/guide/recipes/create-workload#添加pod注解)

```yaml
# Enable vertical scaling
autoResources: true
tensor-fusion.ai/auto-resources: 'true'
# Configure the target resource, options: all|tflops|vram; if empty, only recommendations are provided and nothing is updated
targetResource: all
tensor-fusion.ai/auto-scale-target-resource: all
# Enable horizontal scaling
autoReplicas: true
tensor-fusion.ai/auto-replicas: 'true'
```
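
For Deployment-managed workloads, the same annotations would go on the Pod template rather than the Deployment itself; the sketch below uses placeholder names and images and still needs the workload annotations from [Create Workload](/zh/guide/recipes/create-workload#添加pod注解):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-demo                  # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inference-demo
  template:
    metadata:
      labels:
        app: inference-demo
      annotations:
        # ...workload annotations from Create Workload go here as well...
        tensor-fusion.ai/auto-resources: 'true'
        tensor-fusion.ai/auto-scale-target-resource: 'all'
        tensor-fusion.ai/auto-replicas: 'true'
    spec:
      containers:
        - name: app                               # placeholder container
          image: my-inference-image:latest        # placeholder image
```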

### Detailed Configuration Using the Workload Configuration File

* Vertical Scaling: Implemented with the community VPA histogram algorithm based on historical GPU resource usage data.
The estimates produced by the VPA algorithm consist of Target, LowerBound, and UpperBound, corresponding by default to the P90, P50, and P95 usage levels.
If current resource usage falls outside the range between LowerBound and UpperBound, a recommended value is generated.

> [!NOTE] If `enable` is not set to `true`, or `targetResource` is empty, only resource recommendations are generated; the recommended values are not actually applied.

* Cron Scaling: Based on standard cron expressions; scaling takes effect when `enable` is `true` and the current time falls within the `start` and `end` window. Outside this window, resources revert to the values specified when the workload was added. [Cron Expression Reference](https://en.wikipedia.org/wiki/Cron)

```yaml
autoScalingConfig:
# Vertical scaling configuration
@@ -23,17 +33,17 @@ autoScalingConfig:
enable: true
# Target resource
targetResource: all
# Percentile number used to compute the TFLOPS target value, default: 0.9
# Percentile used to compute the TFLOPS target value, default: 0.9
targetTflopsPercentile: 0.9
# Percentile number used to compute the TFLOPS lower bound, default: 0.5
# Percentile used to compute the TFLOPS lower bound, default: 0.5
lowerBoundTflopsPercentile: 0.5
# Percentile number used to compute the TFLOPS upper bound, default: 0.95
# Percentile used to compute the TFLOPS upper bound, default: 0.95
upperBoundTflopsPercentile: 0.95
# Percentile number used to compute the VRAM target value, default: 0.9
# Percentile used to compute the VRAM target value, default: 0.9
targetVramPercentile: 0.9
# Percentile number used to compute the VRAM lower bound, default: 0.5
# Percentile used to compute the VRAM lower bound, default: 0.5
lowerBoundVramPercentile: 0.5
# Percentile number used to compute the VRAM upper bound, default: 0.95
# Percentile used to compute the VRAM upper bound, default: 0.95
upperBoundVramPercentile: 0.95
# Margin fraction added to the request estimate, default: 0.15
requestMarginFraction: 0.15
@@ -43,7 +53,7 @@ autoScalingConfig:
# Cron scaling configuration
cronScalingRules:
# Whether this rule is enabled
- enable: True
- enable: true
# Rule name
name: "test"
# Start time at which the rule takes effect
@@ -60,9 +70,9 @@ autoScalingConfig:
vram: 5Gi
```
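
To isolate just the cron scaling fields shown above, here is a minimal sketch of a single rule; it assumes `start` and `end` take standard cron expressions as the description suggests, the rule name and schedule are illustrative, and the collapsed resource-override fields from the excerpt are intentionally omitted rather than guessed:

```yaml
cronScalingRules:
  # Scale during weekday business hours; outside this window the workload
  # reverts to the resources specified when it was added.
  - enable: true
    name: "business-hours"       # illustrative rule name
    start: "0 8 * * 1-5"         # weekdays at 08:00
    end: "0 20 * * 1-5"          # weekdays at 20:00
```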

## Monitor Scaling Status
## Step 2. Monitor Scaling Status

### View GPU Resource Recommendations via TensorFusionWorkload Status
> The workload generates a corresponding `TensorFusionWorkload` resource object; the fields under `Status` reflect the current scaling status in real time.

```yaml
status:
  # … (remaining status fields collapsed in this diff)
```