Commit d3354e8

fix bug from qa
1 parent e6e00ac commit d3354e8

8 files changed: +24 -18 lines changed
datasets/Avazu_flen/data_config.yaml

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
 
 
 runner:
-  raw_file_dir: "path" # raw_data dir
+  raw_file_dir: "raw_file/train" # raw_data dir
   raw_filled_file_dir: "./raw_data" # raw_data_filled dir
   train_data_dir: "./train_data_full" # train datasets
   test_data_dir: "./test_data_full" # test datasets

datasets/Avazu_flen/preprocess.py

Lines changed: 1 addition & 1 deletion
@@ -59,7 +59,7 @@ def __init__(self, config):
         self.min_threshold = self.config.get("runner.min_threshold")
         self.feature_map_cache = self.config.get("runner.feature_map_cache")
 
-        # self.filled_raw()
+        self.filled_raw()
 
         self.init()
 
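
The constructor above reads settings with dotted keys such as `runner.min_threshold`. A minimal sketch of how such lookups could work against the nested YAML, assuming the loader flattens nested mappings into dotted keys. `flatten_config` is a hypothetical helper written for illustration, not the project's loader, and the dict stands in for the parsed `data_config.yaml`:

```python
from typing import Any, Dict


def flatten_config(nested: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested mappings into a single dict keyed by dotted paths.

    Hypothetical helper: the real config loader may work differently.
    """
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so {"runner": {"x": 1}} becomes {"runner.x": 1}.
            flat.update(flatten_config(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Stand-in for the parsed data_config.yaml; only keys visible in the diff are used.
parsed = {
    "runner": {
        "raw_file_dir": "raw_file/train",
        "raw_filled_file_dir": "./raw_data",
        "train_data_dir": "./train_data_full",
        "test_data_dir": "./test_data_full",
    }
}

config = flatten_config(parsed)
print(config.get("runner.raw_file_dir"))
```

Flattening once up front keeps each `get` call a plain dictionary lookup, and a key that is absent from the file returns `None` instead of raising.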

datasets/Avazu_flen/readme.md

Lines changed: 4 additions & 4 deletions
@@ -2,11 +2,11 @@
 #### 1.Get raw datasets:
 you can go to:[https://www.kaggle.com/c/avazu-ctr-prediction/data](https://www.kaggle.com/c/avazu-ctr-prediction)
 
-Set the directory of the downloaded raw data in data_config.yaml, then run the command to generate the full dataset
+After unzipping the downloaded data, keep only the training set and name it `train`
 
 | Name | Description |
 | -------- | -------- |
-| raw_file_dir | Raw dataset directory |
+| raw_file | Raw dataset directory |
 | raw_filled_file_dir | Directory for raw data after missing-value handling |
 | train_data_dir | Training set directory |
 | test_data_dir | Test set directory |
@@ -15,9 +15,9 @@ you can go to:[https://www.kaggle.com/c/avazu-ctr-prediction/data](https://www
 | feature_map_cache | Feature cache data |
 
 
-
+Then run the script
 ```bash
-sh data_process.sh
+sh run.sh
 ```
 #### 2.Get preprocessed datasets:
 you can also go to: [AiStudio dataset](https://aistudio.baidu.com/aistudio/datasetdetail/125200)

datasets/Avazu_flen/run.sh

Lines changed: 6 additions & 0 deletions
@@ -1 +1,7 @@
+mkdir train_data_full
+mkdir test_data_full
+mkdir raw_file
+mkdir raw_filled_file_dir
+mv train ./raw_file
+
 python preprocess.py -m data_config.yaml
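
As committed, `run.sh` uses plain `mkdir` and `mv`, which report errors if the script is run a second time (the directories already exist and `train` has already been moved). A minimal Python sketch of the same preparation steps that can be rerun safely. `prepare_dirs` is a hypothetical helper, not part of the repository:

```python
import shutil
from pathlib import Path

# Directory names taken from the run.sh diff above.
WORK_DIRS = ("train_data_full", "test_data_full", "raw_file", "raw_filled_file_dir")


def prepare_dirs(base: Path) -> None:
    """Create the working directories and move `train` into raw_file/ once.

    Hypothetical helper mirroring the mkdir/mv lines of run.sh.
    """
    for name in WORK_DIRS:
        # exist_ok=True makes a repeated run a no-op instead of an error.
        (base / name).mkdir(exist_ok=True)

    source = base / "train"
    target = base / "raw_file" / "train"
    # Move only when the unpacked training set is still in place.
    if source.exists() and not target.exists():
        shutil.move(str(source), str(target))


# Usage (run from datasets/Avazu_flen before preprocess.py):
#     prepare_dirs(Path.cwd())
```

The existence checks are a design choice for repeatability; the shell script itself performs no such checks.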

models/rank/dcn_v2/README.md

Lines changed: 1 addition & 3 deletions
@@ -4,9 +4,7 @@
 
 ```
 ├── data # sample data
-    ├── sample_data # sample data
-        ├── train
-            ├── sample_train.txt # sample training data
+    ├── sample_train.txt # sample training data
 ├── __init__.py
 ├── README.md # documentation
 ├── config.yaml # config for the sample data

models/rank/flen/README.md

Lines changed: 8 additions & 2 deletions
@@ -63,8 +63,14 @@ os : windows/linux/macos
 
 ## Quick start
 
-
-This document provides the [FLEN-Paddle AiStudio project](https://aistudio.baidu.com/aistudio/projectdetail/3247609) for a quick trial; open the project to get started.
+Sample data is provided for a quick trial, and it can be run from any directory. The quick-start commands, run from the FLEN model directory, are as follows:
+```bash
+# Enter the model directory
+# cd models/rank/flen # can be run from any directory
+# Dynamic-graph training
+python -u ../../../tools/trainer.py -m config.yaml # use config_bigdata.yaml for the full dataset
+# Dynamic-graph inference
+python -u ../../../tools/infer.py -m config.yaml # use config_bigdata.yaml for the full dataset
 
 
 ## Model architecture

models/rank/flen/config.yaml

Lines changed: 3 additions & 3 deletions
@@ -14,7 +14,7 @@
 
 
 runner:
-  train_data_dir: "./data/sample_data/dataset"
+  train_data_dir: "./data/sample_data/train"
   train_reader_path: "avazu_reader" # importlib format
   use_gpu: False
   use_auc: True
@@ -25,7 +25,7 @@ runner:
 
   #model_init_path: "output_model/0" # init model
   model_save_path: "output_model_flen"
-  test_data_dir: "./data/sample_data/dataset" #"../../../../data/test"
+  test_data_dir: "./data/sample_data/train" #"../../../../data/test"
   infer_reader_path: "avazu_reader" # importlib format
   infer_batch_size: 3 #512
   infer_load_path: "output_model_flen"
@@ -41,7 +41,7 @@ hyper_parameters:
     learning_rate: 0.04
     strategy: async
   # user-defined <key, value> pairs
-  sparse_inputs_slots: 23
+  sparse_inputs_slots: 22
   sparse_feature_number: 20 #1544488
   sparse_num_field: 3
   sparse_feature_dim: 32

models/recall/mhcn/README.md

Lines changed: 0 additions & 4 deletions
@@ -6,10 +6,6 @@
 
 ```
 ├── data # sample data
-    ├── train
-        ├── train.txt
-    ├── test
-        ├── test.txt
     ├── ratings.txt
     ├── trusts.txt
 ├── __init__.py
