Commit d5efe25 (parent d0a07aa)
Author: Tonny@Home

Refactor pre-training workflow to decouple base models, introduce a standalone pre-training script, and implement dynamic path injection with feature consistency validation.

File tree: 10 files changed, +1127 -38 lines


docs/01_TRAINING_GUIDE.md

Lines changed: 55 additions & 10 deletions
@@ -9,6 +9,7 @@
 | `prod_train_predict.py` | Full training + predict | ✅ | configs | `latest_train_records.json` |
 | `incremental_train.py` | Incremental training + predict | ✅ | configs | `latest_train_records.json` |
 | `prod_predict_only.py` | Prediction only | ❌ | existing models | `latest_train_records.json` |
+| `pretrain.py` | Base model pre-training | ✅ | configs | `data/pretrained/` (state_dict) |
 
 Each script automatically backs up the history to `data/history/` before modifying `latest_train_records.json`
 
@@ -23,6 +24,7 @@ QuantPits/
 │   │   ├── prod_train_predict.py    # Full training script
 │   │   ├── incremental_train.py     # Incremental training script
 │   │   ├── prod_predict_only.py     # Prediction-only script (no training)
+│   │   ├── pretrain.py              # 🧠 Base model pre-training script
 │   │   ├── check_workflow_yaml.py   # 🔧 YAML production-config parameter validation
 │   │   └── train_utils.py           # Shared utility module
 │   └── docs/
@@ -39,6 +41,7 @@ QuantPits/
 │   └── model_performance_*.json     # Model performance metrics
 ├── data/
 │   ├── history/                     # 📦 Auto-backed-up historical files
+│   ├── pretrained/                  # 🧠 Pre-trained base models (.pkl + .json)
 │   └── run_state.json               # Incremental-training run state
 └── latest_train_records.json        # Current training records
 ```
@@ -59,10 +62,15 @@ models:
     market: csi300                              # Target market (metadata tag used for CLI filtering)
     yaml_file: config/workflow_config_gru.yaml  # Qlib workflow config
     enabled: true                               # Whether to participate in full training
-    tags: [ts]                                  # Classification tags (for filtering)
+    tags: [basemodel, ts]                       # Classification tags (for filtering)
+    pretrain_source: lstm_Alpha158              # (Optional) declares the base model this model depends on
     notes: "Optional notes"                     # Notes
 ```
 
+#### Key fields:
+- **`tags: [basemodel]`**: marks the model as usable as a pre-training base model.
+- **`pretrain_source`**: declares which base model this upper-layer model depends on; the system automatically looks for the corresponding `_latest.pkl`.
+
 > [!NOTE]
 > **Market configuration distinction**: the `market` field in the registry is a **model metadata tag**, used only for filtering via the `--market` flag during incremental training or prediction. When price/volume data is actually pulled, the system follows the global `market` setting in `model_config.json`.
 
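The `pretrain_source` lookup described above can be sketched as follows. `data/pretrained/` and the `_latest.pkl` naming come from this guide; the helper name `resolve_pretrain_path` is hypothetical, not the repo's actual API:

```python
from pathlib import Path
from typing import Optional

# Directory documented in the file tree above; adjust to your checkout layout.
PRETRAINED_DIR = Path("data/pretrained")

def resolve_pretrain_path(registry_entry: dict) -> Optional[Path]:
    """Look up the base-model weights declared via `pretrain_source`.

    Hypothetical helper: the real scripts may implement this differently.
    """
    source = registry_entry.get("pretrain_source")
    if not source:
        return None  # model declares no base-model dependency
    candidate = PRETRAINED_DIR / f"{source}_latest.pkl"
    if not candidate.exists():
        raise FileNotFoundError(
            f"pretrain_source '{source}' declared but {candidate} is missing; "
            f"run: python quantpits/scripts/pretrain.py --models {source}"
        )
    return candidate

print(resolve_pretrain_path({"tags": ["ts"]}))  # → None (no dependency declared)
```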
@@ -308,17 +316,54 @@ python quantpits/scripts/check_workflow_yaml.py --fix
 
 ---
 
+---
+
+## Base Model Pre-training (`pretrain.py`)
+
+Some complex models (e.g., GATs, ADD, IGMTF) require a pre-trained base model (e.g., LSTM or GRU) for weight initialization.
+
+### Usage Scenarios
+- Providing initialization weights for upper-layer models.
+- Re-training a compatible base model after the feature set (`d_feat`) has changed.
+
+### Core Semantics
+- **Pre-training is not logged in the training records**: it does not modify `latest_train_records.json`
+- **Metadata validation**: every pre-trained file ships with a `.json` metadata file. If an upper-layer model's `d_feat` does not match the pre-trained file, the system raises an error and blocks training.
+
+### Common Commands
+
+```bash
+# 1. List pre-trainable models and their dependencies
+python quantpits/scripts/pretrain.py --list
+
+# 2. Pre-train a specific base model
+python quantpits/scripts/pretrain.py --models lstm_Alpha158
+
+# 3. Pre-train FOR a specific upper-layer model (most recommended: auto-aligns the Dataset config)
+# Even if the feature set was modified, this keeps the base model fully compatible with the upper model
+python quantpits/scripts/pretrain.py --for gats_Alpha158_plus
+
+# 4. Show existing pre-trained files
+python quantpits/scripts/pretrain.py --show-pretrained
+
+# 5. Force random weights (skip pre-training)
+# Available in both incremental_train and prod_predict_only
+python quantpits/scripts/incremental_train.py --models gats_Alpha158_plus --no-pretrain
+```
+
+---
+
 ## About LSTM and GATs
 
-- Training the `lstm_Alpha158` model automatically outputs `csi300_lstm_ts_latest.pkl`
-- That pkl file is the `basemodel` for the GATs model
-- The GATs model config references this file
-- LSTM and GATs are currently both set to `enabled: false`
-- To use GATs, train the LSTM first
-```bash
-python quantpits/scripts/incremental_train.py --models lstm_Alpha158
-python quantpits/scripts/incremental_train.py --models gats_Alpha158_plus
-```
+- `gats_Alpha158_plus` depends on `lstm_Alpha158` by default
+- Full training workflow:
+  1. Pre-train the base model (optional; skipped if one already exists):
+     `python quantpits/scripts/pretrain.py --for gats_Alpha158_plus`
+  2. Train the upper-layer model:
+     `python quantpits/scripts/incremental_train.py --models gats_Alpha158_plus`
+
+- To skip the pre-trained model and use random weights, just add the `--no-pretrain` flag.
+
 
 ---
 
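The metadata check described under "Core Semantics" can be sketched like this. The `.json` sidecar and the `d_feat` field come from the guide; the function name and exact sidecar schema are assumptions:

```python
import json
import tempfile
from pathlib import Path

def validate_feature_consistency(upper_d_feat: int, meta_path: Path) -> None:
    """Raise if the upper model's d_feat differs from the d_feat recorded in
    the pre-trained file's .json sidecar (hypothetical sketch of the check)."""
    meta = json.loads(meta_path.read_text(encoding="utf-8"))
    base_d_feat = meta["d_feat"]
    if base_d_feat != upper_d_feat:
        raise ValueError(
            f"d_feat mismatch: upper model expects {upper_d_feat}, but the "
            f"pre-trained base model was trained with {base_d_feat}; "
            "re-run pretrain.py --for <upper_model> to regenerate it"
        )

# Usage: a matching sidecar passes silently, a mismatch raises.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fh:
    json.dump({"model": "lstm_Alpha158", "d_feat": 158}, fh)
validate_feature_consistency(158, Path(fh.name))
```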
docs/en/01_TRAINING_GUIDE.md

Lines changed: 60 additions & 15 deletions
@@ -4,11 +4,12 @@
 
 The training system consists of four main scripts that share the same utility modules and model registry:
 
-| Script | Purpose | Save Semantics |
-|------|------|----------|
-| `prod_train_predict.py` | Full training of all enabled models | **Full Overwrite** of `latest_train_records.json` |
-| `incremental_train.py` | Selective training of individual models | **Incremental Merge** to `latest_train_records.json` |
-| `prod_predict_only.py` | Prediction only (no training) | **Incremental Merge** to `latest_train_records.json` |
+| Script | Purpose | Training | Data Source | Save Semantics |
+|------|------|------|--------|----------|
+| `prod_train_predict.py` | Full training + predict | ✅ | configs | `latest_train_records.json` |
+| `incremental_train.py` | Incremental training + predict | ✅ | configs | `latest_train_records.json` |
+| `prod_predict_only.py` | Prediction only | ❌ | Existing models | `latest_train_records.json` |
+| `pretrain.py` | Base model pre-training | ✅ | configs | `data/pretrained/` (state_dict) |
 
 These scripts automatically back up the history to `data/history/` before modifying `latest_train_records.json`.
 
@@ -23,6 +24,7 @@ QuantPits/
 │   │   ├── prod_train_predict.py    # Full training script
 │   │   ├── incremental_train.py     # Incremental training script
 │   │   ├── prod_predict_only.py     # Prediction-only script (no training)
+│   │   ├── pretrain.py              # 🧠 Base model pre-training script
 │   │   ├── check_workflow_yaml.py   # 🔧 YAML config production validation & fix
 │   │   └── train_utils.py           # Shared utility module
 │   └── docs/
@@ -39,6 +41,7 @@ QuantPits/
 │   └── model_performance_*.json     # Model performance metrics (IC/ICIR)
 ├── data/
 │   ├── history/                     # 📦 Auto-backed-up historical files
+│   ├── pretrained/                  # 🧠 Pre-trained base models (.pkl + .json)
 │   └── run_state.json               # State tracker for incremental training
 └── latest_train_records.json        # Current training records
 ```
@@ -59,10 +62,15 @@ models:
     market: csi300                              # Target market (metadata tag used for CLI filtering)
     yaml_file: config/workflow_config_gru.yaml  # Qlib workflow config
     enabled: true                               # Whether to participate in full training
-    tags: [ts]                                  # Classification tags (for filtering)
+    tags: [basemodel, ts]                       # Classification tags (for filtering)
+    pretrain_source: lstm_Alpha158              # (Optional) Declare a dependency on a base model
     notes: "Optional notes"                     # Notes
 ```
 
+#### Key Fields:
+- **`tags: [basemodel]`**: Marks the model as a pre-trainable base model.
+- **`pretrain_source`**: Tells the system which base model this upper-layer model depends on; the system automatically looks for the corresponding `_latest.pkl`.
+
 > [!NOTE]
 > **Distinction of Market Configurations**: The `market` field in the registry is strictly a **model metadata tag**, used for CLI filtering via `--market` during incremental training or prediction. Actual data extraction follows the global `market` setting in `model_config.json`.

@@ -308,17 +316,54 @@ python quantpits/scripts/check_workflow_yaml.py --fix
 
 ---
 
+---
+
+## Base Model Pre-training (`pretrain.py`)
+
+Complex models (e.g., GATs, ADD, IGMTF) require a pre-trained base model (e.g., LSTM or GRU) for weight initialization.
+
+### Usage Scenarios
+- Providing initialization weights for upper-layer models.
+- When features (d_feat) are modified, requiring new compatible base models.
+
+### Core Semantics
+- **Pre-training is not logged in records**: It does not modify `latest_train_records.json`.
+- **Metadata Validation**: Each pre-trained file comes with a `.json` metadata file. If an upper model's `d_feat` doesn't match the pre-trained file, training is blocked.
+
+### Common Commands
+
+```bash
+# 1. List pre-trainable models and dependencies
+python quantpits/scripts/pretrain.py --list
+
+# 2. Pre-train a specific base model
+python quantpits/scripts/pretrain.py --models lstm_Alpha158
+
+# 3. Pre-train FOR a specific upper model (Recommended: aligns dataset config)
+# This ensures compatibility even if features are modified.
+python quantpits/scripts/pretrain.py --for gats_Alpha158_plus
+
+# 4. Show existing pre-trained files
+python quantpits/scripts/pretrain.py --show-pretrained
+
+# 5. Force random weights (skip pre-training)
+# Available in both incremental_train and prod_predict_only
+python quantpits/scripts/incremental_train.py --models gats_Alpha158_plus --no-pretrain
+```
+
+---
+
 ## Concerning LSTM and GATs
 
-- The `lstm_Alpha158` model automatically outputs `csi300_lstm_ts_latest.pkl` upon training.
-- This `.pkl` is a required `basemodel` for GATs.
-- GATs configurations implicitly reference this file.
-- Both LSTM and GATs are presently defaulted to `enabled: false`.
-- If GATs is desired, the LSTM must be trained chronologically prior:
-```bash
-python quantpits/scripts/incremental_train.py --models lstm_Alpha158
-python quantpits/scripts/incremental_train.py --models gats_Alpha158_plus
-```
+- `gats_Alpha158_plus` depends on `lstm_Alpha158` by default.
+- Full Workflow:
+  1. Pre-train the base model (optional if one already exists):
+     `python quantpits/scripts/pretrain.py --for gats_Alpha158_plus`
+  2. Train the upper model:
+     `python quantpits/scripts/incremental_train.py --models gats_Alpha158_plus`
+
+- To bypass pre-training and use random weights, pass the `--no-pretrain` flag.
+
 
 ---
 
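The guide says `pretrain.py` stores a state_dict `.pkl` plus a `.json` metadata sidecar under `data/pretrained/`. A minimal sketch of that save step, assuming plain `pickle` for the state_dict; the helper name `save_pretrained` and the exact metadata fields are our own invention:

```python
import json
import pickle
import tempfile
from pathlib import Path

def save_pretrained(name: str, state_dict: dict, d_feat: int, out_dir: Path) -> Path:
    """Write <name>_latest.pkl (state_dict) plus a .json metadata sidecar,
    mirroring the data/pretrained/ layout described above (sketch only; the
    real pretrain.py may record more metadata and use another serializer)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    pkl_path = out_dir / f"{name}_latest.pkl"
    with pkl_path.open("wb") as out:
        pickle.dump(state_dict, out)
    meta = {"model": name, "d_feat": d_feat}
    (out_dir / f"{name}_latest.json").write_text(json.dumps(meta, indent=2))
    return pkl_path

# Usage: persist a toy state_dict into a temporary directory
out_dir = Path(tempfile.mkdtemp())
pkl = save_pretrained("lstm_Alpha158", {"rnn.weight_ih_l0": [0.0]}, 158, out_dir)
print(pkl.name)  # → lstm_Alpha158_latest.pkl
```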
quantpits/scripts/incremental_train.py

Lines changed: 6 additions & 1 deletion
@@ -100,6 +100,8 @@ def parse_args():
                       help='Only print the list of models to train; do not actually train')
     ctrl.add_argument('--experiment-name', type=str, default=None,
                       help='MLflow experiment name (default: Prod_Train_{FREQ})')
+    ctrl.add_argument('--no-pretrain', action='store_true',
+                      help='Ignore pretrain_source and initialize the basemodel with random weights')
 
     # Info display
     info = parser.add_argument_group('Info display')
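The `store_true` flag added above defaults to `False` and flips to `True` only when the flag is passed; argparse also maps the hyphenated `--no-pretrain` to the attribute `no_pretrain`. A standalone reproduction:

```python
import argparse

parser = argparse.ArgumentParser()
ctrl = parser.add_argument_group("control")
ctrl.add_argument("--no-pretrain", action="store_true",
                  help="Ignore pretrain_source and use randomly initialized weights")

# `--no-pretrain` becomes the attribute `no_pretrain`
print(parser.parse_args([]).no_pretrain)                 # → False
print(parser.parse_args(["--no-pretrain"]).no_pretrain)  # → True
```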
@@ -260,7 +262,10 @@ def run_incremental_train(args):
 
     yaml_file = model_info['yaml_file']
 
-    result = train_single_model(model_name, yaml_file, params, experiment_name)
+    result = train_single_model(
+        model_name, yaml_file, params, experiment_name,
+        no_pretrain=args.no_pretrain
+    )
 
     if result['success']:
         new_records['models'][model_name] = result['record_id']
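Inside `train_single_model`, the `no_pretrain` flag plausibly feeds the "dynamic path injection" named in the commit message: patching the parsed workflow YAML with (or without) the base-model path before training. A sketch under stated assumptions; the `model_path` kwarg and function name are guesses, not the repo's confirmed internals:

```python
import copy

def inject_pretrain_path(workflow_cfg: dict, pkl_path: str, no_pretrain: bool) -> dict:
    """Return a copy of the parsed workflow config with the base-model path
    injected into the model kwargs (hypothetical sketch; key names assumed)."""
    cfg = copy.deepcopy(workflow_cfg)  # leave the caller's config untouched
    kwargs = cfg["task"]["model"]["kwargs"]
    if no_pretrain:
        kwargs.pop("model_path", None)  # --no-pretrain: fall back to random init
    else:
        kwargs["model_path"] = pkl_path
    return cfg

# Usage on a minimal parsed-YAML stand-in
cfg = {"task": {"model": {"kwargs": {"d_feat": 158}}}}
patched = inject_pretrain_path(
    cfg, "data/pretrained/lstm_Alpha158_latest.pkl", no_pretrain=False
)
print(patched["task"]["model"]["kwargs"]["model_path"])
```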

quantpits/scripts/plot_model_opinions.py

Lines changed: 4 additions & 1 deletion
@@ -67,11 +67,13 @@ def main():
     # X axis: models/combos; Y axis: rank; each line represents one instrument
     plt.figure(figsize=(14, 8))
 
+    plotted = False
     for instrument in rank_df.index:
         y_values = rank_df.loc[instrument]
         if y_values.isna().all():
             continue
         plt.plot(rank_df.columns, y_values, marker='o', alpha=0.7, label=instrument)
+        plotted = True
 
     # Invert the Y axis so that rank 1 appears at the top
     plt.gca().invert_yaxis()
@@ -83,7 +85,8 @@ def main():
     plt.title(f'Model Prediction Rank Comparison - {os.path.basename(csv_file)}')
 
     # Legend outside
-    plt.legend(bbox_to_anchor=(1.02, 1), loc='upper left', borderaxespad=0., title="Instrument")
+    if plotted:
+        plt.legend(bbox_to_anchor=(1.02, 1), loc='upper left', borderaxespad=0., title="Instrument")
 
     plt.tight_layout()
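The `plotted` guard above avoids calling `plt.legend()` when every row was all-NaN and nothing was drawn (matplotlib warns about legends with no labeled artists). The control flow can be distilled without matplotlib:

```python
import math

def instruments_to_plot(rank_rows: dict) -> list:
    """Mirror the loop above: skip all-NaN rows and return the instruments
    that would actually be plotted (pure-Python distillation of the guard)."""
    plotted = []
    for instrument, values in rank_rows.items():
        if all(math.isnan(v) for v in values):
            continue  # same role as the `continue` in the diff
        plotted.append(instrument)
    return plotted

rows = {"SH600000": [1.0, 2.0], "SH600519": [float("nan")] * 2}
print(instruments_to_plot(rows))  # → ['SH600000']; the legend would be drawn
```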
