PaddlePaddle
diff --git a/README.md b/README.md
Lines changed: 2 additions & 2 deletions
diff --git a/models/multitask/ple/README.md b/models/multitask/ple/README.md
Lines changed: 6 additions & 11 deletions
diff --git a/models/multitask/share_bottom/README.md b/models/multitask/share_bottom/README.md
Lines changed: 97 additions & 0 deletions
diff --git a/models/multitask/share_bottom/__init__.py b/models/multitask/share_bottom/__init__.py
Lines changed: 13 additions & 0 deletions
diff --git a/models/multitask/share_bottom/census_reader.py b/models/multitask/share_bottom/census_reader.py
Lines changed: 50 additions & 0 deletions
diff --git a/models/multitask/share_bottom/config.yaml b/models/multitask/share_bottom/config.yaml
Lines changed: 40 additions & 0 deletions
diff --git a/models/multitask/share_bottom/config_bigdata.yaml b/models/multitask/share_bottom/config_bigdata.yaml
Lines changed: 41 additions & 0 deletions
@@ -69,10 +69,10 @@
     |   Rank   |                    [FGCNN](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/fgcnn/)                    |    ✓    |    ✓    |     ✓     |     ✓     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [WWW 2019][Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction](https://arxiv.org/pdf/1904.04447.pdf)                                                                      |
     |   Rank   |                  [Fibinet](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/fibinet/)                  |    ✓    |    ✓    |     ✓     |     ✓     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [RecSys19][FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction]( https://arxiv.org/pdf/1905.09433.pdf)                                                 |
     |   Rank   |                     [Flen](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/flen/)                     |    ✓    |    ✓    |     ✓     |     ✓     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [2019][FLEN: Leveraging Field for Scalable CTR Prediction]( https://arxiv.org/pdf/1911.04690.pdf)                                                                                                           |
-    | Multi-task |                  PLE                   |    ✓    |    ✓    |     ✓     |     ✓     |  1.8.5 | [RecSys 2020][Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236)                                                              |
+    | Multi-task |                  [PLE](models/multitask/ple)                   |    ✓    |    ✓    |     ✓     |     ✓     |  2.0 | [RecSys 2020][Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236)                                                              |
     | Multi-task |                  [ESMM](models/multitask/esmm/)                   |    ✓    |    ✓    |     ✓     |     ✓     | 2.0 | [SIGIR 2018][Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate](https://arxiv.org/abs/1804.07931)                                                              |
     | Multi-task |                  [MMOE](models/multitask/mmoe/)                   |    ✓    |    ✓    |     ✓     |     ✓     | 2.0 | [KDD 2018][Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts](https://dl.acm.org/doi/abs/10.1145/3219819.3220007)                                                       |
-    | Multi-task |           [ShareBottom](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/multitask/share-bottom/)           |    ✓    |    ✓    |     ✓     |     ✓     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [1998][Multitask learning](http://reports-archive.adm.cs.cmu.edu/anon/1997/CMU-CS-97-203.pdf)                                                                                                               |
+    | Multi-task |           [ShareBottom](models/multitask/share_bottom/)           |    ✓    |    ✓    |     ✓     |     ✓     | 2.0 | [1998][Multitask learning](http://reports-archive.adm.cs.cmu.edu/anon/1997/CMU-CS-97-203.pdf)                                                                                                               |
     |  Rerank  |                [Listwise](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rerank/listwise/)                |    ✓    |    ✓    |     ✓     |     x     | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [2019][Sequential Evaluation and Generation Framework for Combinatorial Recommender System](https://arxiv.org/pdf/1902.00245.pdf)                                                                           |
 
 
 
@@ -34,10 +34,10 @@
 - [FAQ](#FAQ)
 
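 ## Model Introduction
-Multi-task models improve the learning efficiency and quality of each task by learning the connections and differences between tasks. Multi-task learning frameworks widely adopt the shared-bottom structure, in which different tasks share the bottom hidden layers. This structure inherently reduces the risk of overfitting, but its effectiveness can suffer from differences between tasks and in data distribution.  The paper [Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts]( https://www.kdd.org/kdd2018/accepted-papers/view/modeling-task-relationships-in-multi-task-learning-with-multi-gate-mixture- ) proposes a multi-task learning structure called Multi-gate Mixture-of-Experts (MMOE).
+Multi-task models improve the learning efficiency and quality of each task by learning the connections and differences between tasks. However, multi-task settings often exhibit the seesaw phenomenon, in which some tasks perform well while others get worse.  The paper [Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236 ) proposes Progressive Layered Extraction (PLE) to address the seesaw phenomenon in multi-task learning. 
+
+We define the PLE network structure in PaddlePaddle and validate the model on the open-source Census-income Data dataset.
 
-## Data Preparation
-We validate the model on the open-source Census-income Data dataset. Sample data for a quick run is provided in the data directory under the model directory; to use the full dataset, see the [Reproducing Results](#reproducing-results) section below.
 The data format is as follows:
 Fields in the generated data are separated by commas.
 ```
@@ -55,7 +55,7 @@ os : windows/linux/macos
 This document provides sample data for a quick trial; the commands can be run from any directory. The quick-start commands, run from the ple model directory, are as follows: 
 ```bash
 # Enter the model directory
-# cd models/multitask/mmoe # can be run from any directory
+# cd models/multitask/ple # can be run from any directory
 # Dynamic-graph training
 python -u ../../../tools/trainer.py -m config.yaml # use config_bigdata.yaml for the full dataset 
 # Dynamic-graph inference
@@ -68,20 +68,15 @@ python -u ../../../tools/static_infer.py -m config.yaml
 ``` 
 
 ## Model Architecture
-The MMOE model captures task relatedness and learns task-specific functions on top of shared representations, without a significant increase in parameters. The main network structure is as follows:
-[MMoE](https://dl.acm.org/doi/abs/10.1145/3219819.3220007):
-<p align="center">
-<img align="center" src="../../../doc/imgs/mmoe.png">
-<p>
 
 ### Reproducing Results
 To help users get every model running quickly, we provide sample data with each model. To reproduce the results reported in this README, follow the steps below in order. 
 Model metrics on the full dataset:
 | Model | auc_marital | batch_size | epoch_num | Time of each epoch |
 | :------| :------ | :------ | :------| :------ | 
-| MMOE | 0.99 | 32 | 100 | about 1 minute |
+| PLE | 0.99 | 32 | 100 | about 1 minute |
 
-1. Make sure your current directory is PaddleRec/models/multitask/mmoe  
+1. Make sure your current directory is PaddleRec/models/multitask/ple  
 2. Go to the paddlerec/datasets/census directory and run the script there; it downloads our preprocessed full census dataset from a server in China and extracts it to the specified folder.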
 ``` bash
 cd ../../../datasets/census
 
@@ -0,0 +1,97 @@
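+# ShareBottom
+
+The directory structure of this example is outlined below: 
+
+```
+├── data # sample data
+		├── train # training data
+			├── train_data.txt
+		├── test  # test data
+			├── test_data.txt
+├── __init__.py 
+├── README.md # documentation
+├── config.yaml # configuration for the sample data
+├── config_bigdata.yaml # configuration for the full dataset
+├── census_reader.py # data reader
+├── net.py # core model network (shared by dynamic and static graph)
+├── static_model.py # builds the static graph
+├── dygraph_model.py # builds the dynamic graph
+```
+
+Note: before reading this example, we recommend that you first read the following:
+
+[PaddleRec getting-started tutorial](https://github.com/PaddlePaddle/PaddleRec/blob/master/README.md)
+
+## Contents
+
+- [Model Introduction](#model-introduction)
+- [Data Preparation](#data-preparation)
+- [Environment](#environment)
+- [Quick Start](#quick-start)
+- [Model Architecture](#model-architecture)
+- [Reproducing Results](#reproducing-results)
+- [Advanced Usage](#advanced-usage)
+- [FAQ](#FAQ)
+
+## Model Introduction
+share_bottom is the basic framework for multi-task learning. Its defining feature is that the bottom-level parameters and network structure are shared across tasks. The advantage is that multiple tasks can be learned well with far fewer network parameters. The drawback is equally clear: because the bottom layers are fully shared, two weakly related tasks can conflict during optimization, which hurts the final result. Many later neural-based multi-task models build on share_bottom; MMOE, for example, addresses share_bottom's poor performance when the tasks are weakly correlated.
+
+We implement the share_bottom network structure in PaddlePaddle and validate the model on the open-source Census-income Data dataset.
+
+## Data Preparation
+We validate the model on the open-source Census-income Data dataset. Sample data for a quick run is provided in the data directory under the model directory; to use the full dataset, see the [Reproducing Results](#reproducing-results) section below.
+The data format is as follows:
+Fields in the generated data are separated by commas.
+```
+0,0,73,0,0,0,0,1700.09,0,0
+```
+
+## Environment
+PaddlePaddle>=2.0
+
+python 2.7/3.5/3.6/3.7
+
+os : windows/linux/macos 
+
+## Quick Start
+This document provides sample data for a quick trial; the commands can be run from any directory. The quick-start commands, run from the share_bottom model directory, are as follows: 
+```bash
+# Enter the model directory
+# cd models/multitask/share_bottom # can be run from any directory
+# Dynamic-graph training
+python -u ../../../tools/trainer.py -m config.yaml # use config_bigdata.yaml for the full dataset 
+# Dynamic-graph inference
+python -u ../../../tools/infer.py -m config.yaml 
+
+# Static-graph training
+python -u ../../../tools/static_trainer.py -m config.yaml # use config_bigdata.yaml for the full dataset 
+# Static-graph inference
+python -u ../../../tools/static_infer.py -m config.yaml 
+``` 
+
+## Model Architecture
+
+### Reproducing Results
+To help users get every model running quickly, we provide sample data with each model. To reproduce the results reported in this README, follow the steps below in order. 
+Model metrics on the full dataset:
+| Model | auc_marital | batch_size | epoch_num | Time of each epoch |
+| :------| :------ | :------ | :------| :------ | 
+| ShareBottom | 0.99 | 32 | 100 | about 1 minute |
+
+1. Make sure your current directory is PaddleRec/models/multitask/share_bottom  
+2. Go to the paddlerec/datasets/census directory and run the script there; it downloads our preprocessed full census dataset from a server in China and extracts it to the specified folder.
+``` bash
+cd ../../../datasets/census
+sh run.sh
+``` 
+3. Switch back to the model directory and run the commands on the full dataset
+```bash
+cd - # switch back to the model directory
+# Dynamic-graph training
+python -u ../../../tools/trainer.py -m config_bigdata.yaml # train on the full dataset with config_bigdata.yaml 
+python -u ../../../tools/infer.py -m config_bigdata.yaml # run inference on the full dataset with config_bigdata.yaml 
+```
+
+## Advanced Usage
+  
+## FAQ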
@@ -0,0 +1,13 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
@@ -0,0 +1,50 @@
+#   Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+import numpy as np
+
+from paddle.io import IterableDataset
+
+
+class RecDataset(IterableDataset):
+    def __init__(self, file_list, config):
+        super(RecDataset, self).__init__()
+        self.file_list = file_list
+        self.config = config
+
+    def __iter__(self):
+        full_lines = []
+        self.data = []
+        for file in self.file_list:
+            with open(file, "r") as rf:
+                for l in rf:
+                    l = l.strip().split(',')
+                    l = list(map(float, l))
+                    label_income = []
+                    label_marital = []
+                    data = l[2:]
+                    if int(l[1]) == 0:
+                        label_income = [0]
+                    elif int(l[1]) == 1:
+                        label_income = [1]
+                    if int(l[0]) == 0:
+                        label_marital = [0]
+                    elif int(l[0]) == 1:
+                        label_marital = [1]
+                    output_list = []
+                    output_list.append(np.array(data).astype('float32'))
+                    output_list.append(np.array(label_income).astype('int64'))
+                    output_list.append(np.array(label_marital).astype('int64'))
+                    yield output_list
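For reference, the per-line parsing done in `RecDataset.__iter__` above can be sketched without Paddle or NumPy. The helper name `parse_census_line` is hypothetical and not part of this PR. It assumes the column layout the reader uses: column 0 is the marital label, column 1 is the income label, and the remaining columns are features. The sample row is the one shown in the ShareBottom README; real rows are presumably longer, given `feature_size: 499` in the configs.

```python
def parse_census_line(line):
    """Split one comma-separated row into (features, label_income, label_marital)."""
    values = [float(v) for v in line.strip().split(",")]
    label_marital = int(values[0])  # column 0: marital-status label
    label_income = int(values[1])   # column 1: income label
    for name, label in (("marital", label_marital), ("income", label_income)):
        if label not in (0, 1):
            # Deliberate difference from the reader in this PR, which would
            # yield an empty label array here: this sketch fails loudly.
            raise ValueError("unexpected %s label: %r" % (name, label))
    features = values[2:]  # every remaining column is an input feature
    return features, label_income, label_marital


features, label_income, label_marital = parse_census_line(
    "0,0,73,0,0,0,0,1700.09,0,0")
print(len(features), label_income, label_marital)
```

The reader in the PR additionally wraps the three values in NumPy arrays (`float32` for features, `int64` for labels) before yielding them.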
@@ -0,0 +1,40 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+runner:
+  train_data_dir: "data/train"
+  train_reader_path: "census_reader" # importlib format
+  use_gpu: False
+  use_auc: True
+  train_batch_size: 2
+  epochs: 3
+  print_interval: 2
+  #model_init_path: "output_model/0" # init model
+  model_save_path: "output_model_share_btm"
+  test_data_dir: "data/test"
+  infer_batch_size: 2
+  infer_reader_path: "census_reader" # importlib format
+  infer_load_path: "output_model_share_btm"
+  infer_start_epoch: 0
+  infer_end_epoch: 3
+
+hyper_parameters:
+  feature_size: 499
+  bottom_size: 117
+  task_num: 2
+  tower_size: 8
+  optimizer: 
+    class: adam
+    learning_rate: 0.001
+    strategy: async
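The `hyper_parameters` above suggest a shared bottom layer feeding one tower per task. `net.py` is not shown in this diff, so the following is only a dependency-free sketch of that structure, not the PR's implementation. The class name `ShareBottomSketch`, the single-layer bottom and towers, the ReLU activations, the sigmoid heads, and the initialisation are all assumptions; only the four size parameters come from `config.yaml`.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero bias (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class ShareBottomSketch:
    def __init__(self, feature_size, bottom_size, tower_size, task_num, seed=0):
        rng = random.Random(seed)
        # One bottom layer whose parameters are shared by every task.
        self.bottom = init_layer(rng, feature_size, bottom_size)
        # One tower and one scalar output head per task; nothing is shared here.
        self.towers = [init_layer(rng, bottom_size, tower_size)
                       for _ in range(task_num)]
        self.heads = [init_layer(rng, tower_size, 1) for _ in range(task_num)]

    def forward(self, features):
        shared = relu(linear(features, *self.bottom))
        outputs = []
        for tower, head in zip(self.towers, self.heads):
            hidden = relu(linear(shared, *tower))
            logit = linear(hidden, *head)[0]
            outputs.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid
        return outputs


model = ShareBottomSketch(feature_size=499, bottom_size=117,
                          tower_size=8, task_num=2)
probs = model.forward([0.0] * 499)  # one probability per task
```

Because every task reads the same `shared` vector, gradients from both tasks would update the same bottom weights during training, which is where the optimization conflict described in the README comes from.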
@@ -0,0 +1,41 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+runner:
+  train_data_dir: "../../../datasets/census/train_all"
+  train_reader_path: "census_reader" # importlib format
+  use_gpu: False
+  use_auc: True
+  train_batch_size: 32
+  epochs: 100
+  print_interval: 100
+  #model_init_path: "output_model/0" # init model
+  model_save_path: "output_model_share_btm_all"
+  test_data_dir: "../../../datasets/census/test_all"
+  infer_batch_size: 32
+  infer_reader_path: "census_reader" # importlib format
+  infer_load_path: "output_model_share_btm_all"
+  infer_start_epoch: 0
+  infer_end_epoch: 100
+
+
+hyper_parameters:
+  feature_size: 499
+  bottom_size: 117
+  task_num: 2
+  tower_size: 8
+  optimizer: 
+    class: adam
+    learning_rate: 0.001
+    strategy: async
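The `infer_start_epoch` and `infer_end_epoch` settings select which saved epochs are evaluated. A minimal sketch of that selection follows. It assumes that checkpoints are stored in one subdirectory per epoch under `infer_load_path` (as the commented-out `model_init_path: "output_model/0"` hints) and that the end epoch is exclusive. Neither the trainer nor the inference script is part of this diff, so both points are assumptions, and `checkpoint_dirs` is a hypothetical helper.

```python
import os


def checkpoint_dirs(load_path, start_epoch, end_epoch):
    """List the per-epoch checkpoint directories for [start_epoch, end_epoch)."""
    return [os.path.join(load_path, str(epoch))
            for epoch in range(start_epoch, end_epoch)]


# Values taken from config_bigdata.yaml above.
dirs = checkpoint_dirs("output_model_share_btm_all", 0, 100)
print(len(dirs), dirs[0], dirs[-1])
```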