
Commit c4394bc

Merge remote-tracking branch 'ups/develop' into refine/infershape
2 parents 8a1abe5 + b681537 commit c4394bc

17 files changed, with 324 additions and 92 deletions

doc/fluid/dev/releasing_process_cn.md

Lines changed: 21 additions & 25 deletions
Original file line number | Diff line number | Diff line change
@@ -1,24 +1,23 @@
11
# PaddlePaddle发行规范
22

3-
PaddlePaddle使用git-flow branching model做分支管理,使用[Semantic Versioning](http://semver.org/)标准表示PaddlePaddle版本号。
3+
PaddlePaddle使用Trunk Based Development,使用[Semantic Versioning](http://semver.org/)标准表示PaddlePaddle版本号。
44

55
PaddlePaddle每次发新的版本,遵循以下流程:
66

77
1.`develop`分支派生出新的分支,分支名为`release/版本号`。例如,`release/0.10.0`
8-
1. 将新分支的版本打上tag,tag为`版本号rc.Patch号`。第一个tag为`0.10.0rc1`,第二个为`0.10.0rc2`,依次类推。
9-
1. 对这个版本的提交,做如下几个操作:
10-
* 使用Regression Test List作为检查列表,测试本次release的正确性。
11-
* 如果失败,记录下所有失败的例子,在这个`release/版本号`分支中,修复所有bug后,Patch号加一,到第二步
12-
* 修改`python/setup.py.in`中的版本信息,并将`istaged`字段设为`True`。
13-
* 将这个版本的python wheel包发布到pypi。
14-
* 更新Docker镜像(参考后面的操作细节)。
15-
1. 第三步完成后,将`release/版本号`分支合入master分支,将master分支的合入commit打上tag,tag为`版本号`。同时再将`master`分支合入`develop`分支。
16-
1. 协同完成Release Note的书写。
8+
2. 将新分支的版本打上tag,tag为`版本号rc-Patch号`。例如,第一个tag为`0.10.0-rc0`
9+
3. 新分支一般不接受新的feature和优化。QA在release分支上进行测试。研发基于最新的develop开发。
10+
4. QA和研发发现的bug,在develop上修复验证后,cherry-pick修复到release分支。直到release分支相对稳定。
11+
5. 如果有需要,在release分支最新代码上打上新的tag,比如`0.10.0-rc1`,让更多的用户加入测试。重复3-4步。
12+
6. release分支稳定后,打上正式的release tag,比如`0.10.0`
13+
7. 将这个版本的python wheel包发布到pypi。
14+
8. 更新Docker镜像(参考后面的操作细节)。
1715

1816
需要注意的是:
1917

20-
* `release/版本号`分支一旦建立,一般不允许再从`develop`分支合入`release/版本号`。这样保证`release/版本号`分支功能的封闭,方便测试人员测试PaddlePaddle的行为。
21-
*`release/版本号`分支存在的时候,如果有bugfix的行为,需要将bugfix的分支同时merge到`master`, `develop``release/版本号`这三个分支。
18+
* bug修复需要先在develop上进行,然后进入release分支。而不是直接在release分支上开发。
19+
20+
* release分支原则上只接受修复类的修改,不接受新feature。
2221

2322
## 发布wheel包到pypi
2423

@@ -61,24 +60,21 @@ docker push [镜像]:[version]
6160

6261
## PaddlePaddle 分支规范
6362

64-
PaddlePaddle开发过程使用[git-flow](http://nvie.com/posts/a-successful-git-branching-model/)分支规范,并适应github的特性做了一些区别。
65-
66-
* PaddlePaddle的主版本库遵循[git-flow](http://nvie.com/posts/a-successful-git-branching-model/)分支规范。其中:
67-
* `master`分支为稳定(stable branch)版本分支。每一个`master`分支的版本都是经过单元测试和回归测试的版本。
68-
* `develop`分支为开发(develop branch)版本分支。每一个`develop`分支的版本都经过单元测试,但并没有经过回归测试。
69-
* `release/版本号`分支为每一次Release时建立的临时分支。在这个阶段的代码正在经历回归测试。
63+
PaddlePaddle开发过程使用[Trunk Based Development](https://trunkbaseddevelopment.com/) 开发规范。
7064

71-
* 其他用户的fork版本库并不需要严格遵守[git-flow](http://nvie.com/posts/a-successful-git-branching-model/)分支规范,但所有fork的版本库的所有分支都相当于特性分支。
72-
* 建议,开发者fork的版本库使用`develop`分支同步主版本库的`develop`分支
73-
* 建议,开发者fork的版本库中,再基于`develop`版本fork出自己的功能分支。
74-
* 当功能分支开发完毕后,向PaddlePaddle的主版本库提交`Pull Reuqest`,进而进行代码评审。
75-
* 在评审过程中,开发者修改自己的代码,可以继续在自己的功能分支提交代码。
65+
* `develop`分支为开发(develop branch)版本分支。每一个`develop`分支的版本都经过单元测试。并且会经过模型回归测试。
66+
* `release/版本号`分支为每一次Release时建立的临时分支。release分支主要用于测试,bug修复和最终发版。
67+
* `master`分支因为历史原因,已经废弃。
7668

77-
* BugFix分支也是在开发者自己的fork版本库维护,与功能分支不同的是,BugFix分支需要分别给主版本库的`master``develop`与可能有的`release/版本号`分支,同时提起`Pull Request`
69+
* 其他开发者fork的feature branch。
70+
* 建议,开发者的feature branch需要同步主版本库的`develop`分支。
71+
* 建议,开发者的feature branch需要基于主版本库中的`develop`分支。
72+
* 当feature branch开发完毕后,向PaddlePaddle的主版本库提交`Pull Request`,进而进行代码评审。
73+
* 在评审过程中,开发者修改自己的代码,可以继续在自己的feature branch提交代码。
7874

7975
## PaddlePaddle回归测试列表
8076

81-
本列表说明PaddlePaddle发版之前需要测试的功能点。
77+
TODO
8278

8379
### PaddlePaddle Book中所有章节
8480

doc/fluid/dev/releasing_process_en.md

Lines changed: 28 additions & 37 deletions
@@ -4,26 +4,21 @@ PaddlePaddle manages its branches using "git-flow branching model", and [Semanti
44

55
Each time we release a new PaddlePaddle version, we should follow the below steps:
66

7-
1. Fork a new branch from `develop` named `release/[version]`, e.g. `release/0.10.0`.
8-
1. Push a new tag on the release branch, the tag name should be like `[version]rc.patch`. The
9-
first tag should be `0.10.0rc1`, and the second should be `0.10.0.rc2` and so on.
10-
1. After that, we should do:
11-
* Run all regression test on the Regression Test List (see PaddlePaddle TeamCity CI), to confirm
12-
that this release has no major bugs.
13-
* If regression test fails, we must fix those bugs and create a new `release/[version]`
14-
branch from previous release branch.
15-
* Modify `python/setup.py.in`, change the version number and change `ISTAGED` to `True`.
16-
* Publish PaddlePaddle release wheel packages to pypi (see below instructions for detail).
17-
* Update the Docker images (see below instructions for detail).
18-
1. After above step, merge `release/[version]` branch to master and push a tag on the master commit,
19-
then merge `master` to `develop`.
20-
1. Update the Release Note.
21-
22-
***NOTE:***
23-
24-
* Do ***NOT*** merge commits from develop branch to release branches to keep the release branch contain
25-
features only for current release, so that we can test on that version.
26-
* If we want to fix bugs on release branches, we must merge the fix to master, develop and release branch.
7+
1. Create a new release branch from `develop`, named `release/[version]`, e.g. `release/0.10.0`.
8+
2. Create a new tag for the release branch, using the format `[version]-rc[patch]`. E.g., the first tag is `0.10.0-rc0`.
9+
3. The new release branch normally does not accept new features or optimizations. QA tests on the release branch, while developers continue to work on the `develop` branch.
10+
4. If QA or developers find bugs, they should first fix and verify them on the `develop` branch, then cherry-pick the fix to the release branch. Repeat until the release branch is stable.
11+
5. If necessary, create a new tag on the release branch, e.g. `0.10.0-rc1`. Involve more users to try it, and repeat steps 3-4.
12+
6. After the release branch is stable, create the official release tag, such as `0.10.0`.
13+
7. Release the python wheel package to pypi.
14+
8. Update the docker image (More details below).
15+
16+
NOTE:
17+
18+
* Bug fixes should happen on the `develop` branch first, then be cherry-picked to the release branch. Avoid developing directly on the release branch.
19+
20+
* Release branches normally accept only bug fixes. Don't add new features.
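
The naming scheme in the steps above can be sketched in a few lines of Python. This is only an illustration: the helper names (`release_branch`, `rc_tag`, `release_refs`) are hypothetical and are not part of any PaddlePaddle tooling. It assumes a plain `MAJOR.MINOR.PATCH` version string and release-candidate numbering that starts at 0, as in the `0.10.0-rc0` example.

```python
import re

# Hypothetical helpers for illustration only; not part of PaddlePaddle.
VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$")


def release_branch(version):
    """Return the release branch name, e.g. 'release/0.10.0'."""
    if not VERSION_RE.match(version):
        raise ValueError("expected MAJOR.MINOR.PATCH, got %r" % version)
    return "release/" + version


def rc_tag(version, rc_number):
    """Return a release-candidate tag, e.g. '0.10.0-rc0'."""
    if not VERSION_RE.match(version):
        raise ValueError("expected MAJOR.MINOR.PATCH, got %r" % version)
    if rc_number < 0:
        raise ValueError("rc_number must be non-negative")
    return "%s-rc%d" % (version, rc_number)


def release_refs(version, rc_count):
    """List the branch, the rc tags, and the final tag in creation order."""
    refs = [release_branch(version)]
    refs.extend(rc_tag(version, i) for i in range(rc_count))
    refs.append(version)  # the official release tag is the bare version
    return refs
```

For example, `release_refs("0.10.0", 2)` lists the branch `release/0.10.0`, the tags `0.10.0-rc0` and `0.10.0-rc1`, and finally the official tag `0.10.0`.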
21+
2722

2823
## Publish Wheel Packages to pypi
2924

@@ -97,26 +92,22 @@ You can then checkout the latest pushed tags at https://hub.docker.com/r/paddlep
9792
9893
## Branching Model
9994
100-
We use [git-flow](http://nvie.com/posts/a-successful-git-branching-model/) as our branching model,
101-
with some modifications:
102-
103-
* `master` branch is the stable branch. Each version on the master branch is tested and guaranteed.
104-
* `develop` branch is for development. Each commit on develop branch has passed CI unit test, but no
105-
regression tests are run.
106-
* `release/[version]` branch is used to publish each release. Latest release version branches have
107-
bugfix only for that version, but no feature updates.
108-
* Developer forks are not required to follow
109-
[git-flow](http://nvie.com/posts/a-successful-git-branching-model/)
110-
branching model, all forks is like a feature branch.
111-
* Advise: developer fork's develop branch is used to sync up with main repo's develop branch.
112-
* Advise: developer use it's fork's develop branch to for new branch to start developing.
113-
* Use that branch on developer's fork to create pull requests and start reviews.
114-
* developer can push new commits to that branch when the pull request is open.
115-
* Bug fixes are also started from developers forked repo. And, bug fixes branch can merge to
116-
`master`, `develop` and `releases`.
95+
PaddlePaddle uses [Trunk Based Development](https://trunkbaseddevelopment.com/) as our branching model.
96+
97+
* `develop` branch is used for development. Each commit to the `develop` branch goes through unit tests and model regression tests.
98+
* `release/[version]` branch is used for each release. The release branch is used for testing, bug fixes, and the eventual release.
99+
* `master` branch has been deprecated for historical reasons.
100+
101+
* Developer's feature branch.
102+
* Developer's feature branch should sync with upstream `develop` branch.
103+
* Developer's feature branch should be forked from upstream `develop` branch.
104+
* After feature branch is ready, create a `Pull Request` against the Paddle repo and go through code review.
105+
* During the review process, developers modify their code and push to their own feature branch.
117106
118107
## PaddlePaddle Regression Test List
119108
109+
TODO
110+
120111
### All Chapters of PaddlePaddle Book
121112
122113
We need to guarantee that all the chapters of PaddlePaddle Book can run correctly. Including

paddle/fluid/API.spec

Lines changed: 2 additions & 2 deletions
@@ -100,7 +100,7 @@ paddle.fluid.layers.gru_unit ArgSpec(args=['input', 'hidden', 'size', 'param_att
100100
paddle.fluid.layers.linear_chain_crf ArgSpec(args=['input', 'label', 'param_attr'], varargs=None, keywords=None, defaults=(None,))
101101
paddle.fluid.layers.crf_decoding ArgSpec(args=['input', 'param_attr', 'label'], varargs=None, keywords=None, defaults=(None,))
102102
paddle.fluid.layers.cos_sim ArgSpec(args=['X', 'Y'], varargs=None, keywords=None, defaults=None)
103-
paddle.fluid.layers.cross_entropy ArgSpec(args=['input', 'label', 'soft_label'], varargs=None, keywords=None, defaults=(False,))
103+
paddle.fluid.layers.cross_entropy ArgSpec(args=['input', 'label', 'soft_label', 'ignore_index'], varargs=None, keywords=None, defaults=(False, -100))
104104
paddle.fluid.layers.square_error_cost ArgSpec(args=['input', 'label'], varargs=None, keywords=None, defaults=None)
105105
paddle.fluid.layers.chunk_eval ArgSpec(args=['input', 'label', 'chunk_scheme', 'num_chunk_types', 'excluded_chunk_types'], varargs=None, keywords=None, defaults=(None,))
106106
paddle.fluid.layers.sequence_conv ArgSpec(args=['input', 'num_filters', 'filter_size', 'filter_stride', 'padding', 'bias_attr', 'param_attr', 'act'], varargs=None, keywords=None, defaults=(3, 1, None, None, None, None))
@@ -142,7 +142,7 @@ paddle.fluid.layers.beam_search ArgSpec(args=['pre_ids', 'pre_scores', 'ids', 's
142142
paddle.fluid.layers.row_conv ArgSpec(args=['input', 'future_context_size', 'param_attr', 'act'], varargs=None, keywords=None, defaults=(None, None))
143143
paddle.fluid.layers.multiplex ArgSpec(args=['inputs', 'index'], varargs=None, keywords=None, defaults=None)
144144
paddle.fluid.layers.layer_norm ArgSpec(args=['input', 'scale', 'shift', 'begin_norm_axis', 'epsilon', 'param_attr', 'bias_attr', 'act', 'name'], varargs=None, keywords=None, defaults=(True, True, 1, 1e-05, None, None, None, None))
145-
paddle.fluid.layers.softmax_with_cross_entropy ArgSpec(args=['logits', 'label', 'soft_label'], varargs=None, keywords=None, defaults=(False,))
145+
paddle.fluid.layers.softmax_with_cross_entropy ArgSpec(args=['logits', 'label', 'soft_label', 'ignore_index'], varargs=None, keywords=None, defaults=(False, -100))
146146
paddle.fluid.layers.smooth_l1 ArgSpec(args=['x', 'y', 'inside_weight', 'outside_weight', 'sigma'], varargs=None, keywords=None, defaults=(None, None, None))
147147
paddle.fluid.layers.one_hot ArgSpec(args=['input', 'depth'], varargs=None, keywords=None, defaults=None)
148148
paddle.fluid.layers.autoincreased_step_counter ArgSpec(args=['counter_name', 'begin', 'step'], varargs=None, keywords=None, defaults=(None, 1, 1))

paddle/fluid/operators/cross_entropy_op.cc

Lines changed: 5 additions & 0 deletions
@@ -138,6 +138,11 @@ class CrossEntropyOpMaker : public framework::OpProtoAndCheckerMaker {
138138
"(bool, default false), a flag indicating whether to "
139139
"interpretate the given labels as soft labels.")
140140
.SetDefault(false);
141+
AddAttr<int>("ignore_index",
142+
"(int, default -100), Specifies a target value that is "
143+
"ignored and does not contribute to the input gradient. "
144+
"Only valid if soft_label is set to False")
145+
.SetDefault(-100);
141146
AddComment(R"DOC(
142147
CrossEntropy Operator.
143148

paddle/fluid/operators/cross_entropy_op.h

Lines changed: 17 additions & 9 deletions
@@ -40,7 +40,7 @@ class CrossEntropyOpKernel : public framework::OpKernel<T> {
4040

4141
math::CrossEntropyFunctor<DeviceContext, T>()(
4242
ctx.template device_context<DeviceContext>(), &y_2d, &x_2d, &labels_2d,
43-
ctx.Attr<bool>("soft_label"));
43+
ctx.Attr<bool>("soft_label"), ctx.Attr<int>("ignore_index"));
4444
}
4545
};
4646

@@ -74,16 +74,22 @@ class XeGradFunctor {
7474
const T* dy, // NOLINT
7575
const T* x, // NOLINT
7676
const int64_t* label, // NOLINT
77-
size_t num_classes)
78-
: dx_(dx), dy_(dy), x_(x), label_(label), num_classes_(num_classes) {}
77+
size_t num_classes, size_t ignore_index)
78+
: dx_(dx),
79+
dy_(dy),
80+
x_(x),
81+
label_(label),
82+
num_classes_(num_classes),
83+
ignore_index_(ignore_index) {}
7984

8085
HOSTDEVICE void operator()(size_t sample_id) {
8186
auto x_is_true_offset = sample_id * num_classes_ + label_[sample_id];
8287
for (size_t x_offset = sample_id * num_classes_;
8388
x_offset < (sample_id + 1) * num_classes_; ++x_offset) {
84-
dx_[x_offset] = x_offset != x_is_true_offset
85-
? static_cast<T>(0)
86-
: -dy_[sample_id] / x_[x_offset];
89+
dx_[x_offset] =
90+
(x_offset != x_is_true_offset || label_[sample_id] == ignore_index_)
91+
? static_cast<T>(0)
92+
: -dy_[sample_id] / x_[x_offset];
8793
}
8894
}
8995

@@ -93,6 +99,7 @@ class XeGradFunctor {
9399
const T* x_;
94100
const int64_t* label_;
95101
size_t num_classes_;
102+
size_t ignore_index_;
96103
};
97104

98105
template <typename DeviceContext, typename T>
@@ -109,6 +116,7 @@ class CrossEntropyGradientOpKernel : public framework::OpKernel<T> {
109116
// unnecessary to convert tensors to 2-D views.
110117
int rank = x->dims().size();
111118
int64_t class_num = x->dims()[rank - 1];
119+
int64_t ignore_index = ctx.Attr<int>("ignore_index");
112120
if (ctx.Attr<bool>("soft_label")) {
113121
XeSoftlabelGradFunctor<T> functor(dx_data, dy->data<T>(), x->data<T>(),
114122
label->data<T>(),
@@ -118,9 +126,9 @@ class CrossEntropyGradientOpKernel : public framework::OpKernel<T> {
118126
static_cast<size_t>(dx->numel()));
119127
for_range(functor);
120128
} else {
121-
XeGradFunctor<T> functor(dx_data, dy->data<T>(), x->data<T>(),
122-
label->data<int64_t>(),
123-
static_cast<size_t>(class_num));
129+
XeGradFunctor<T> functor(
130+
dx_data, dy->data<T>(), x->data<T>(), label->data<int64_t>(),
131+
static_cast<size_t>(class_num), static_cast<size_t>(ignore_index));
124132
platform::ForRange<DeviceContext> for_range(
125133
ctx.template device_context<DeviceContext>(),
126134
static_cast<size_t>(dy->numel()));
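
To make the gradient rule in `XeGradFunctor` easier to follow, here is a minimal pure-Python sketch of the hard-label case. The function name `cross_entropy_hard_grad` and the list-of-lists layout are assumptions made for illustration; the real kernel works on flat buffers and runs per sample through `ForRange`. The sketch assumes every non-ignored label is a valid class index.

```python
def cross_entropy_hard_grad(prob, labels, dloss, ignore_index=-100):
    """Gradient of hard-label cross entropy w.r.t. the input probabilities.

    prob   : list of rows, each a list of class probabilities
    labels : list of int class indices, one per row
    dloss  : list of upstream gradients, one per row
    """
    grads = []
    for row, lbl, dy in zip(prob, labels, dloss):
        g = [0.0] * len(row)  # every non-target entry gets zero gradient
        if lbl != ignore_index:
            # Only the target entry is non-zero: d(-log p) / dp = -1 / p.
            g[lbl] = -dy / row[lbl]
        grads.append(g)  # an ignored sample contributes an all-zero row
    return grads
```

A sample whose label equals `ignore_index` yields a row of zeros, which matches the added `label_[sample_id] == ignore_index_` condition in the functor.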

paddle/fluid/operators/math/cross_entropy.cc

Lines changed: 7 additions & 2 deletions
@@ -28,7 +28,8 @@ class CrossEntropyFunctor<platform::CPUDeviceContext, T> {
2828
public:
2929
void operator()(const platform::CPUDeviceContext& ctx, framework::Tensor* out,
3030
const framework::Tensor* prob,
31-
const framework::Tensor* labels, const bool softLabel) {
31+
const framework::Tensor* labels, const bool softLabel,
32+
const int ignore_index) {
3233
const int batch_size = prob->dims()[0];
3334
if (softLabel) {
3435
auto in = EigenMatrix<T>::From(*prob);
@@ -49,8 +50,12 @@ class CrossEntropyFunctor<platform::CPUDeviceContext, T> {
4950
int lbl = label_data[i];
5051
PADDLE_ENFORCE_GE(lbl, 0);
5152
PADDLE_ENFORCE_LT(lbl, class_num);
53+
PADDLE_ENFORCE((lbl >= 0 && lbl < class_num) || lbl == ignore_index);
5254
int index = i * class_num + lbl;
53-
loss_data[i] = -math::TolerableValue<T>()(std::log(prob_data[index]));
55+
loss_data[i] =
56+
lbl == ignore_index
57+
? 0
58+
: -math::TolerableValue<T>()(std::log(prob_data[index]));
5459
}
5560
}
5661
}
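
The forward behaviour added to the CPU functor can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the operator itself: `cross_entropy_hard` is a hypothetical name, probabilities are assumed to be strictly positive (the `TolerableValue` clamping of infinities is omitted), and the sketch tests for the ignored label before it validates the label range.

```python
import math


def cross_entropy_hard(prob, labels, ignore_index=-100):
    """Per-sample hard-label cross entropy with an ignored label value.

    prob   : list of rows, each a list of class probabilities (all > 0)
    labels : list of int class indices, one per row
    """
    losses = []
    for row, lbl in zip(prob, labels):
        if lbl == ignore_index:
            losses.append(0.0)  # ignored samples contribute zero loss
            continue
        if not 0 <= lbl < len(row):
            raise ValueError("label %d out of range [0, %d)" % (lbl, len(row)))
        losses.append(-math.log(row[lbl]))
    return losses
```

Checking `ignore_index` first matters for the default of -100: a range check such as `lbl >= 0` that runs before the ignore test would reject the ignored label.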

paddle/fluid/operators/math/cross_entropy.cu

Lines changed: 10 additions & 5 deletions
@@ -23,11 +23,14 @@ namespace math {
2323
namespace {
2424
template <typename T>
2525
__global__ void CrossEntropyKernel(T* Y, const T* X, const int64_t* label,
26-
const int N, const int D) {
26+
const int N, const int D,
27+
const int ignore_index) {
2728
for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < N;
2829
i += blockDim.x * gridDim.x) {
29-
PADDLE_ASSERT(label[i] >= 0 && label[i] < D);
30-
Y[i] = -math::TolerableValue<T>()(log(X[i * D + label[i]]));
30+
PADDLE_ASSERT(label[i] >= 0 && label[i] < D || label[i] == ignore_index);
31+
Y[i] = ignore_index == label[i]
32+
? 0
33+
: -math::TolerableValue<T>()(log(X[i * D + label[i]]));
3134
}
3235
}
3336

@@ -57,7 +60,8 @@ class CrossEntropyFunctor<platform::CUDADeviceContext, T> {
5760
public:
5861
void operator()(const platform::CUDADeviceContext& ctx,
5962
framework::Tensor* out, const framework::Tensor* prob,
60-
const framework::Tensor* labels, bool softLabel) {
63+
const framework::Tensor* labels, bool softLabel,
64+
const int ignore_index) {
6165
const T* prob_data = prob->data<T>();
6266
T* loss_data = out->mutable_data<T>(ctx.GetPlace());
6367

@@ -77,7 +81,8 @@ class CrossEntropyFunctor<platform::CUDADeviceContext, T> {
7781
int block = 512;
7882
int grid = (batch_size + block - 1) / block;
7983
CrossEntropyKernel<T><<<grid, block, 0, ctx.stream()>>>(
80-
loss_data, prob_data, label_data, batch_size, class_num);
84+
loss_data, prob_data, label_data, batch_size, class_num,
85+
ignore_index);
8186
}
8287
}
8388
};

paddle/fluid/operators/math/cross_entropy.h

Lines changed: 2 additions & 1 deletion
@@ -38,7 +38,8 @@ class CrossEntropyFunctor {
3838
public:
3939
void operator()(const DeviceContext& context, framework::Tensor* out,
4040
const framework::Tensor* prob,
41-
const framework::Tensor* labels, const bool softLabel);
41+
const framework::Tensor* labels, const bool softLabel,
42+
const int ignore_index);
4243
};
4344
} // namespace math
4445
} // namespace operators

paddle/fluid/operators/softmax_with_cross_entropy_op.cc

Lines changed: 6 additions & 0 deletions
@@ -44,6 +44,12 @@ class SoftmaxWithCrossEntropyOpMaker
4444
"(bool, default: false), A flag to indicate whether to interpretate "
4545
"the given labels as soft labels.")
4646
.SetDefault(false);
47+
AddAttr<int>(
48+
"ignore_index",
49+
"(int, default -100), Specifies a target value that is ignored and "
50+
"does not contribute to the input gradient. Only valid if soft_label "
51+
"is set to False")
52+
.SetDefault(-100);
4753
AddComment(R"DOC(
4854
Softmax With Cross Entropy Operator.
4955
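
This hunk only registers the `ignore_index` attribute; the kernel changes for this operator are not shown here. As a rough sketch of the intended semantics, assuming the attribute behaves as it does in the plain cross-entropy operator, the hard-label forward pass could look like the following. The function name and the list-based inputs are hypothetical.

```python
import math


def softmax_with_cross_entropy(logits, labels, ignore_index=-100):
    """Per-sample loss: -log(softmax(logits)[label]), zero for ignored labels.

    logits : list of rows, each a list of unnormalized scores
    labels : list of int class indices, one per row
    """
    losses = []
    for row, lbl in zip(logits, labels):
        if lbl == ignore_index:
            losses.append(0.0)  # assumed: ignored samples yield zero loss
            continue
        # Numerically stable log-sum-exp: shift by the row maximum.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        losses.append(log_sum_exp - row[lbl])
    return losses
```

Fusing the softmax and the logarithm this way avoids computing `log` of a probability that has underflowed to zero.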
