PaddlePaddle
diff --git a/‎README.md‎
Lines changed: 2 additions & 2 deletions b/‎README.md‎
diff --git a/‎doc/inference.md‎
Lines changed: 39 additions & 3 deletions b/‎doc/inference.md‎
diff --git a/‎doc/serving.md‎
Lines changed: 11 additions & 2 deletions b/‎doc/serving.md‎
diff --git a/‎models/multitask/ple/README.md‎
Lines changed: 95 additions & 0 deletions b/‎models/multitask/ple/README.md‎
diff --git a/‎models/multitask/ple/__init__.py‎
Lines changed: 13 additions & 0 deletions b/‎models/multitask/ple/__init__.py‎
diff --git a/‎models/multitask/ple/census_reader.py‎
Lines changed: 50 additions & 0 deletions b/‎models/multitask/ple/census_reader.py‎
diff --git a/‎models/multitask/ple/config.yaml‎
Lines changed: 43 additions & 0 deletions b/‎models/multitask/ple/config.yaml‎
@@ -69,10 +69,10 @@
     |   Rank   |                    [FGCNN](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/fgcnn/)                    |    ✓    |    ✓    |     ✓     |     ✓     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [WWW 2019][Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction](https://arxiv.org/pdf/1904.04447.pdf)                                                                      |
     |   Rank   |                  [Fibinet](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/fibinet/)                  |    ✓    |    ✓    |     ✓     |     ✓     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [RecSys19][FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction]( https://arxiv.org/pdf/1905.09433.pdf)                                                 |
     |   Rank   |                     [Flen](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/flen/)                     |    ✓    |    ✓    |     ✓     |     ✓     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [2019][FLEN: Leveraging Field for Scalable CTR Prediction]( https://arxiv.org/pdf/1911.04690.pdf)                                                                                                           |
-    |  Multi-task  |                  PLE                   |    ✓    |    ✓    |     ✓     |     ✓     |  1.8.5 | [RecSys 2020][Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236)                                                              |
+    |  Multi-task  |                  [PLE](models/multitask/ple)                   |    ✓    |    ✓    |     ✓     |     ✓     |  2.0 | [RecSys 2020][Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236)                                                              |
     |  Multi-task  |                  [ESMM](models/multitask/esmm/)                   |    ✓    |    ✓    |     ✓     |     ✓     | 2.0 | [SIGIR 2018][Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate](https://arxiv.org/abs/1804.07931)                                                              |
     |  Multi-task  |                  [MMOE](models/multitask/mmoe/)                   |    ✓    |    ✓    |     ✓     |     ✓     | 2.0 | [KDD 2018][Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts](https://dl.acm.org/doi/abs/10.1145/3219819.3220007)                                                       |
-    |  Multi-task  |           [ShareBottom](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/multitask/share-bottom/)           |    ✓    |    ✓    |     ✓     |     ✓     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [1998][Multitask learning](http://reports-archive.adm.cs.cmu.edu/anon/1997/CMU-CS-97-203.pdf)                                                                                                               |
+    |  Multi-task  |           [ShareBottom](models/multitask/share_bottom/)           |    ✓    |    ✓    |     ✓     |     ✓     | 2.0 | [1998][Multitask learning](http://reports-archive.adm.cs.cmu.edu/anon/1997/CMU-CS-97-203.pdf)                                                                                                               |
     |  Rerank  |                [Listwise](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rerank/listwise/)                |    ✓    |    ✓    |     ✓     |     x     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [2019][Sequential Evaluation and Generation Framework for Combinatorial Recommender System](https://arxiv.org/pdf/1902.00245.pdf)                                                                           |
 
 
 
@@ -1,5 +1,5 @@
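 # How to use Paddle Inference
-PaddleRec currently supports saving a model through the save_inference_model interface during static-graph training, and deploying the saved model on the server side with the Inference prediction library. This tutorial uses the wide_deep model as an example to show how to use these two features.  
+PaddleRec currently supports saving a model through the save_inference_model interface during static-graph training, converting a model saved after dynamic-graph training into static-graph form, and deploying the saved model on the server side with the Inference prediction library. This tutorial uses the wide_deep model as an example to show how to use these three features.  
 
 ## Saving the model with the save_inference_model interface
 Server-side deployment with Python requires first saving the model through the save_inference_model interface.  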
@@ -12,8 +12,8 @@ runner:
   ...
   # use inference save model
   use_inference: True  # save as an inference model during static-graph training
-  save_inference_feed_varnames: ["label","C1","C2","C3","C4","C5","C6","C7","C8","C9","C10","C11","C12","C13","C14","C15","C16","C17","C18","C19","C20","C21","C22","C23","C24","C25","C26","dense_input"] # names of the inference model's feed variables
-  save_inference_fetch_varnames: ["cast_0.tmp_0"] # names of the inference model's fetch variables
+  save_inference_feed_varnames: ["C1","C2","C3","C4","C5","C6","C7","C8","C9","C10","C11","C12","C13","C14","C15","C16","C17","C18","C19","C20","C21","C22","C23","C24","C25","C26","dense_input"] # names of the inference model's feed variables
+  save_inference_fetch_varnames: ["sigmoid_0.tmp_0"] # names of the inference model's fetch variables
 ```
 3. Start static-graph training
 ```bash
@@ -23,6 +23,39 @@ runner:
 python -u ../../../tools/static_trainer.py -m config.yaml # use config_bigdata.yaml for the full dataset
 ```
 
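+## Converting a model saved from dynamic-graph training with the to_static.py script
+If you have finished training with the dynamic graph and want to convert the saved model into a static-graph inference model, you can refer to the to_static.py script we provide.
+1. First, train with the dynamic graph as usual and save the parameters
+```bash
+# Enter the model directory
+# cd models/rank/wide_deep # can be run from any directory
+# Dynamic-graph training
+python -u ../../../tools/trainer.py -m config.yaml # use config_bigdata.yaml for the full dataset
+```
+2. Open the yaml config and add the `model_init_path` option  
+The to_static.py script first loads the model at `model_init_path`, then converts it to a static graph and saves it. Do not enable this option when you first start training, otherwise training becomes a warm start.
+3. Edit the to_static script and rewrite its to_static statement to fit your model.
+Take the wide_deep model as an example. In the wide_deep network, the forward part is what needs to be saved; see [net.py](https://github.com/PaddlePaddle/PaddleRec/blob/master/models/rank/wide_deep/net.py) for the code. Its inputs are a list of 26 discrete features plus 1 continuous feature. Every discrete feature has shape (batchsize, 1) and dtype int64; the continuous feature has shape (batchsize, 13) and dtype float32.
+So in the paddle.jit.to_static statement of the to_static script we specify input_spec as shown below. For details on input_spec, see [Introduction to InputSpec](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/04_dygraph_to_static/input_spec_cn.html).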
+```python
+# example dnn and wide_deep model forward
+dy_model = paddle.jit.to_static(dy_model,
+    input_spec=[[paddle.static.InputSpec(shape=[None, 1], dtype='int64') for jj in range(26)], paddle.static.InputSpec(shape=[None, 13], dtype='float32')])
+```
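+4. Run the to_static script with your yaml file as the argument to save the model. The parameters under the model_init_path specified in your yaml file are converted and saved to the model_save_path/(infer_end_epoch-1) directory.  
+Note: infer_end_epoch-1 is used because epochs are counted from 0; for example, running 3 epochs gives epochs 0~2
+```bash
+python -u ../../../tools/to_static.py -m config.yaml
+```
+5. When predicting with the Inference prediction library, the inputs and outputs must be adjusted to match. For example, the model we saved is the forward part of the wide_deep network: its input is a list of 26 discrete features plus 1 continuous feature, and its output is the prediction value.  
+Remove the label from the input data in criteo_reader.py:  
+```python
+# Unchanged parts are omitted
+# In the final output list, drop the first np.array, which is the label.
+  yield output_list[1:]
+```
+Compare the prediction values obtained from inference with the labels in the dataset, and compute the AUC metric with a separate script.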
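
As a rough illustration of step 5 and the AUC comparison above, the sketch below drops the label from one reader-style sample and then scores some predictions against labels. The helper names `strip_label` and `auc_score`, the sample layout, and all numeric values are hypothetical. The pairwise AUC definition is a stand-in chosen for brevity, not the metric code PaddleRec uses.

```python
import numpy as np


def strip_label(output_list):
    """Drop the first array (the label) so the rest matches the exported inputs."""
    return output_list[1:]


def auc_score(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# One made-up sample laid out as: label, 26 discrete ids, 1 continuous vector of 13 floats.
sample = [np.array([1], dtype="int64")]
sample += [np.array([i], dtype="int64") for i in range(26)]
sample.append(np.zeros(13, dtype="float32"))

model_inputs = strip_label(sample)  # 27 arrays remain, label removed

labels = [1, 0, 1, 0]
predictions = [0.9, 0.2, 0.4, 0.6]  # made-up values standing in for inference output
print(len(model_inputs), auc_score(labels, predictions))
```

The pairwise form is quadratic in the number of samples, so for a full dataset a rank-based implementation would likely be a better fit.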
+
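 ## Deploying the saved model on the server side with the Inference prediction library
 PaddleRec provides the tools/paddle_infer.py script so that you can run model predictions conveniently and efficiently with the Inference prediction library.  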
 
@@ -45,6 +78,9 @@ pip install GPUtil
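 |       --reader_file        |    string    |       any path         |    yes    |                          path of the Python file containing the Reader() used for testing                            |
 |       --batchsize        |    int    |       >= 1         |    yes    |                            number of samples per batch                            |
 |       --model_name        |    str    |       any name         |    no    |                            name of the output model                            |
+|       --cpu_threads        |    int    |       >= 1         |    no    |                            number of threads when running on CPU; ignored when running on GPU                            |
+|       --enable_mkldnn        |    bool    |       True/False         |    no    |                        whether to enable MKL-DNN acceleration on CPU; ignored when running on GPU                        |
+|       --enable_tensorRT        |    bool    |       True/False         |    no    |                        whether to enable TensorRT acceleration on GPU; ignored when running on CPU                        |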
 
 2. Using the demo data of the wide_deep model as an example, start prediction:
 ```bash
 
@@ -12,8 +12,8 @@ runner:
   ...
   # use inference save model
   use_inference: True  # save as an inference model during static-graph training
-  save_inference_feed_varnames: ["label","C1","C2","C3","C4","C5","C6","C7","C8","C9","C10","C11","C12","C13","C14","C15","C16","C17","C18","C19","C20","C21","C22","C23","C24","C25","C26","dense_input"] # names of the inference model's feed variables
-  save_inference_fetch_varnames: ["cast_0.tmp_0"] # names of the inference model's fetch variables
+  save_inference_feed_varnames: ["C1","C2","C3","C4","C5","C6","C7","C8","C9","C10","C11","C12","C13","C14","C15","C16","C17","C18","C19","C20","C21","C22","C23","C24","C25","C26","dense_input"] # names of the inference model's feed variables
+  save_inference_fetch_varnames: ["sigmoid_0.tmp_0"] # names of the inference model's fetch variables
 ```
 3. Start static-graph training
 ```bash
@@ -102,6 +102,15 @@ python ../../../tools/webserver.py gpu 9393
 # CPU
 python ../../../tools/webserver.py cpu 9393
 ```
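+### Adjusting the reader
+The server uses the Inference prediction library under the hood. As when using the Inference prediction library directly, the inputs and outputs must be adjusted in the reader to match. For example, the model we saved comes from the wide_deep network: its input is a list of 26 discrete features plus 1 continuous feature, and its output is the prediction value.  
+Remove the label from the input data in criteo_reader.py:  
+```python
+# Unchanged parts are omitted
+# In the final output list, drop the first np.array, which is the label.
+  yield output_list[1:]
+```
+Compare the prediction values with the labels in the dataset, and compute the AUC metric with a separate script.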
 
 ## Testing the deployed service
 After the serving service has started successfully on the server, open a new terminal to deploy the client.
 
@@ -0,0 +1,95 @@
+# PLE
+
+ The directory structure of this example, with a brief description: 
+
+```
+├── data # sample data
+		├── train # training data
+			├── train_data.txt
+		├── test  # test data
+			├── test_data.txt
+├── __init__.py 
+├── README.md # documentation
+├── config.yaml # config for the sample data
+├── config_bigdata.yaml # config for the full dataset
+├── census_reader.py # data reader
+├── net.py # core network (shared by dynamic and static graph)
+├── static_model.py # builds the static graph
+├── dygraph_model.py # builds the dynamic graph
+```
+
+Note: before reading this example, we recommend that you first read:
+
+[PaddleRec getting-started tutorial](https://github.com/PaddlePaddle/PaddleRec/blob/master/README.md)
+
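+## Contents
+
+- [Model introduction](#model-introduction)
+- [Data preparation](#data-preparation)
+- [Environment](#environment)
+- [Quick start](#quick-start)
+- [Model network](#model-network)
+- [Reproducing results](#reproducing-results)
+- [Advanced usage](#advanced-usage)
+- [FAQ](#FAQ)
+
+## Model introduction
+Multi-task models learn the connections and differences between tasks, which can improve the learning efficiency and quality of each task. In multi-task scenarios, however, a seesaw phenomenon often appears: some tasks perform well while others get worse. The paper [Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236) proposes Progressive Layered Extraction (PLE) to address the seesaw phenomenon in multi-task learning. 
+
+We define the PLE network structure in PaddlePaddle and verify the model on the open-source Census-income Data dataset.
+
+## Data preparation
+The data format is as follows:
+Fields in the generated data are separated by commas
+```
+0,0,73,0,0,0,0,1700.09,0,0
+```
+
+## Environment
+PaddlePaddle>=2.0
+
+python 2.7/3.5/3.6/3.7
+
+os : windows/linux/macos 
+
+## Quick start
+This example provides sample data for a quick trial and can be run from any directory. The quick-start commands in the ple model directory are as follows: 
+```bash
+# Enter the model directory
+# cd models/multitask/ple # can be run from any directory
+# Dynamic-graph training
+python -u ../../../tools/trainer.py -m config.yaml # use config_bigdata.yaml for the full dataset
+# Dynamic-graph inference
+python -u ../../../tools/infer.py -m config.yaml 
+
+# Static-graph training
+python -u ../../../tools/static_trainer.py -m config.yaml # use config_bigdata.yaml for the full dataset
+# Static-graph inference
+python -u ../../../tools/static_infer.py -m config.yaml 
+``` 
+
+## Model network
+
+## Reproducing results
+To help users get every model running quickly, we provide sample data with each model. To reproduce the results in this README, follow the steps below in order. 
+Model metrics on the full dataset:
+| Model | auc_marital | batch_size | epoch_num | Time of each epoch |
+| :------| :------ | :------ | :------| :------ | 
+| PLE | 0.99 | 32 | 100 | about 1 minute |
+
+1. Make sure your current directory is PaddleRec/models/multitask/ple  
+2. Go to the paddlerec/datasets/census directory and run the script there. It downloads our preprocessed full census dataset from a mirror server in China and extracts it to the target folder.
+``` bash
+cd ../../../datasets/census
+sh run.sh
+``` 
+3. Switch back to the model directory and run the commands on the full dataset
+```bash
+cd - # switch back to the model directory
+# Dynamic-graph training
+python -u ../../../tools/trainer.py -m config_bigdata.yaml # config_bigdata.yaml is the full dataset config
+python -u ../../../tools/infer.py -m config_bigdata.yaml # config_bigdata.yaml is the full dataset config
+```
+
+## Advanced usage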
+  
+## FAQ
@@ -0,0 +1,13 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
@@ -0,0 +1,50 @@
+#   Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+import numpy as np
+
+from paddle.io import IterableDataset
+
+
+class RecDataset(IterableDataset):
+    def __init__(self, file_list, config):
+        super(RecDataset, self).__init__()
+        self.file_list = file_list
+        self.config = config
+
+    def __iter__(self):
+        for file in self.file_list:
+            with open(file, "r") as rf:
+                for line in rf:
+                    # Each line: marital label, income label, then the features.
+                    values = list(map(float, line.strip().split(',')))
+                    label_income = []
+                    label_marital = []
+                    data = values[2:]
+                    # A label other than 0 or 1 is left empty.
+                    if int(values[1]) == 0:
+                        label_income = [0]
+                    elif int(values[1]) == 1:
+                        label_income = [1]
+                    if int(values[0]) == 0:
+                        label_marital = [0]
+                    elif int(values[0]) == 1:
+                        label_marital = [1]
+                    # Output order: features, income label, marital label.
+                    output_list = []
+                    output_list.append(np.array(data).astype('float32'))
+                    output_list.append(np.array(label_income).astype('int64'))
+                    output_list.append(np.array(label_marital).astype('int64'))
+                    yield output_list
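
To see what the reader above yields, here is a minimal standalone sketch that applies the same parsing to the sample line from the PLE README, without needing Paddle. The function name `parse_census_line` is hypothetical, and the sketch assumes both labels are always 0 or 1.

```python
import numpy as np


def parse_census_line(line):
    """Split one comma-separated line into [features, label_income, label_marital]."""
    values = list(map(float, line.strip().split(",")))
    # Column 0 is the marital label, column 1 the income label, the rest are features.
    label_marital = np.array([int(values[0])], dtype="int64")
    label_income = np.array([int(values[1])], dtype="int64")
    features = np.array(values[2:], dtype="float32")
    return [features, label_income, label_marital]


features, label_income, label_marital = parse_census_line("0,0,73,0,0,0,0,1700.09,0,0")
print(features.shape, label_income, label_marital)
```

The README sample line has only 8 feature columns, while config.yaml sets feature_size to 499, so rows in the real dataset are presumably much wider than this sample.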
@@ -0,0 +1,43 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+runner:
+  train_data_dir: "data/train"
+  train_reader_path: "census_reader" # importlib format
+  use_gpu: False
+  use_auc: True
+  train_batch_size: 2
+  epochs: 3
+  print_interval: 2
+  #model_init_path: "output_model/0" # init model
+  model_save_path: "output_model_ple"
+  test_data_dir: "data/test"
+  infer_batch_size: 2
+  infer_reader_path: "census_reader" # importlib format
+  infer_load_path: "output_model_ple"
+  infer_start_epoch: 0
+  infer_end_epoch: 3
+
+hyper_parameters:
+  feature_size: 499
+  task_num: 2
+  shared_num: 2
+  exp_per_task: 3
+  level_number: 1
+  expert_size: 16
+  tower_size: 8
+  optimizer: 
+    class: adam
+    learning_rate: 0.001
+    strategy: async
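
To make the hyper_parameters above more concrete, here is a minimal NumPy sketch of a single gated extraction layer in the spirit of the PLE paper. It is not the net.py from this PR. It assumes that shared_num is the number of shared experts, exp_per_task the number of task-specific experts per task, and expert_size the output width of each expert. The helpers `dense`, `relu_expert`, `softmax`, and `cgc_layer` are hypothetical, the weights are random, and the towers (tower_size) are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Values copied from config.yaml; how they are interpreted here is an assumption.
feature_size, expert_size = 499, 16
task_num, shared_num, exp_per_task = 2, 2, 3


def dense(in_dim, out_dim):
    """Randomly initialised (weight, bias) pair for a hypothetical linear layer."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def relu_expert(x, params):
    """One expert: a linear layer followed by ReLU."""
    w, b = params
    return np.maximum(x @ w + b, 0.0)


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


shared_experts = [dense(feature_size, expert_size) for _ in range(shared_num)]
task_experts = [[dense(feature_size, expert_size) for _ in range(exp_per_task)]
                for _ in range(task_num)]
# Each task gate scores that task's own experts plus the shared experts.
gates = [dense(feature_size, exp_per_task + shared_num) for _ in range(task_num)]


def cgc_layer(x):
    """Return one gated mixture of expert outputs per task, plus the gate weights."""
    shared_out = [relu_expert(x, p) for p in shared_experts]
    outputs, weights = [], []
    for t in range(task_num):
        experts = [relu_expert(x, p) for p in task_experts[t]] + shared_out
        stacked = np.stack(experts, axis=1)      # (batch, n_experts, expert_size)
        gate_w, gate_b = gates[t]
        gate = softmax(x @ gate_w + gate_b)      # (batch, n_experts)
        outputs.append((gate[:, :, None] * stacked).sum(axis=1))
        weights.append(gate)
    return outputs, weights


x = rng.normal(size=(4, feature_size))
task_inputs, gate_weights = cgc_layer(x)
print([out.shape for out in task_inputs])
```

With level_number set to 1 there would be only one such layer. Each of the two outputs would then feed a per-task tower that produces the income or marital prediction.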